One of the biggest issues that many IT professionals have with the public cloud is the loss of control. "I will be blamed if this service isn't working, and if I can't even see why no one can connect to it, forget it! It'll stay in my care where I can see everything." Sound familiar?

And this was true in the early days of the cloud. Trying to get a network packet capture to troubleshoot a problem was nearly impossible. But that was a long time ago (in cloud years), and now several tools can help you out, although they're not always easy to find. In this article, we'll look at what's available and what's coming in Azure for network monitoring, alerting, and insights. In the process, we'll take away one more reason for shunning the public cloud.

Why monitor

We often bring out network monitoring tools only when trouble is afoot. However, we should use them on an ongoing basis for security and performance insights and for measuring application availability. Analyzing your normal network traffic makes it a whole lot easier to spot anomalies.

In this new cloudy world, it's also fair to say that many sysops people who leave the networking to the "Cisco guy" on premises find themselves doing a lot more networking in public clouds, where the delineation between the two roles is blurrier.

Azure offers Network Performance Monitor (NPM), DNS Analytics, Network Security Group (NSG) Log Analytics, and App Gateway Analytics. More recently, Microsoft has gathered the disparate tools together under the Network Watcher umbrella.

Network Performance Monitor

NPM is all about latency and loss for your hybrid network. It doesn't rely on SNMP (and thus is not bound to particular routers or switches). Instead, it uses an agent on your local network as well as an agent in your virtual network (vNet). NPM offers the choice of ICMP (which ping uses) or TCP as the probe protocol. Inventory your environment before deciding: TCP more closely mimics your normal network traffic, and many firewalls drop ICMP by default. For TCP, an EnableRules.ps1 PowerShell script opens the required port (8084 by default) by setting the registry keys and Windows Firewall rules. For ICMP, you need to create six rules in Windows Firewall.
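To make the TCP option more concrete, here is a minimal sketch of a TCP-based latency and loss probe. It is not how the NPM agent is implemented; the function name tcp_probe and its parameters are made up for illustration. It only shows why a TCP connect works as a probe wherever the port is open, even when ICMP is dropped.

```python
import socket
import time


def tcp_probe(host, port, attempts=5, timeout=1.0):
    """Attempt several TCP connects and return (loss_fraction, avg_latency_ms).

    avg_latency_ms is None when every attempt failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    latencies = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            # A completed three-way handshake counts as a successful probe.
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Refused, unreachable, or timed out: count it as a lost probe.
            pass
    loss = 1.0 - len(latencies) / attempts
    avg = sum(latencies) / len(latencies) if latencies else None
    return loss, avg
```

Calling tcp_probe with the address of another agent and port 8084 (the default mentioned above) would return the fraction of failed connects and the average connect time in milliseconds.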

NPM gives you a hop-by-hop view of your network segments and also provides bandwidth usage and alerting. NPM for ExpressRoute monitoring reached general availability (GA) on February 15, 2018. It offers features such as the network state recorder, which lets you go back to a picture of the network at a particular point in time and makes it easier to track down intermittent, transient issues. For ExpressRoute monitoring, only TCP is available.

Configuration of NPM in OMS

Operations Management Suite (OMS) displays, slices, and dices the data. You can use NPM in three modes: performance monitoring for generic network segments (including on-premises locations, Azure, and other clouds), service endpoint monitoring, or ExpressRoute monitoring. As the name suggests, service endpoint monitoring gives you a view of connectivity to SaaS and PaaS services, websites, and SQL databases. It has built-in tests for Office 365 and Dynamics 365.

Configuration of service endpoint monitoring in OMS

All three modes provide a topology view showing network connections and drill-downs for latency. The ExpressRoute monitoring also offers peering and circuit views. It updates data every 3 seconds (with a resolution of 100 ms) and uploads it to the cloud every 3 minutes.

DNS Analytics

This (preview) service is also part of OMS/Azure Log Analytics. It gathers DNS-related information from Windows agents (not from the Linux agent yet) and from System Center Operations Manager (SCOM) connected management groups.

It lists clients attempting to resolve malicious domains (which malware command and control services often use). It also shows the domains your clients query most frequently, the chatty clients, and dynamic DNS registration failures. Most clients will not make hundreds of thousands of DNS requests; if they do, it could indicate malicious activity.
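The "chatty client" idea is easy to illustrate. The sketch below is not DNS Analytics itself; it assumes you already have DNS query records as (client IP, domain) pairs, and the function names and the threshold are hypothetical.

```python
from collections import Counter


def chatty_clients(queries, threshold):
    """Return {client_ip: query_count} for clients that exceed the threshold.

    queries is an iterable of (client_ip, domain) pairs.
    """
    counts = Counter(client for client, _domain in queries)
    return {client: n for client, n in counts.items() if n > threshold}


def top_domains(queries, n=3):
    """Return the n most frequently queried domains as (domain, count) pairs."""
    counts = Counter(domain for _client, domain in queries)
    return counts.most_common(n)
```

In practice, you would pick the threshold from a baseline of your normal traffic rather than hard-coding a number.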

As with NPM, you can also use the powerful query language of Log Analytics to slice the underlying log data to find exactly what you're looking for.

Network Watcher

What NPM does for your WAN links, Network Watcher does for your Azure resources. It provides topology views, filtered packet captures, IP Flow verify, NSG view and flow logging, vNet gateway troubleshooting, and connection troubleshooting. You can also compare the number of vNets, NSGs, public IP addresses, and load balancers against the limits of your subscription.

To enable Network Watcher, go to All Services – Networking – Network Watcher and enable it for all regions (or only specific regions) and subscriptions. After you enable it, the left-hand menu lights up with options.

Network Watcher main blade

Packet capture uses the Network Watcher agent VM extension for Windows or Linux. After you've defined a storage account to store the captures in, you can filter on protocol (TCP or UDP), source and destination IP address, and source and destination port (the 5-tuple). Use Storage Explorer to download the capture for analysis. You can also create an alert to trigger the capture based on a condition in the VM.
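If the 5-tuple is new to you, the following sketch shows how such a filter decides which packets to keep. The Packet and CaptureFilter classes are hypothetical and have nothing to do with the Network Watcher API; a field left as None simply matches anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str  # "TCP" or "UDP"
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class CaptureFilter:
    # None means "match any value" for that field.
    protocol: Optional[str] = None
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        pairs = [
            (self.protocol, pkt.protocol),
            (self.src_ip, pkt.src_ip),
            (self.src_port, pkt.src_port),
            (self.dst_ip, pkt.dst_ip),
            (self.dst_port, pkt.dst_port),
        ]
        # Every field that is set must equal the packet's value.
        return all(want is None or want == got for want, got in pairs)
```

For example, CaptureFilter(protocol="TCP", dst_port=443) keeps only TCP traffic to port 443, whatever the addresses and source port are.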

IP Flow verify lets you pick a VM, NIC, protocol, port, and traffic direction. It'll then analyze connectivity to an external IP address and show you whether the network allows or blocks the traffic and which NSG rules allow or block it.
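Conceptually, this kind of check walks the NSG rules in priority order (lowest number first) and reports the first rule that matches. The sketch below is a heavily simplified model and not the actual Azure implementation: the Rule class is hypothetical, and it ignores port ranges, service tags, and the local address.

```python
import ipaddress
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int           # lower number is evaluated first
    direction: str          # "Inbound" or "Outbound"
    protocol: str           # "TCP", "UDP", or "*"
    remote_prefix: str      # CIDR such as "0.0.0.0/0"
    port: Union[int, str]   # a single port or "*"
    action: str             # "Allow" or "Deny"


def verify_flow(rules, direction, protocol, remote_ip, port):
    """Return (action, rule_name) for the first matching rule by priority."""
    ip = ipaddress.ip_address(remote_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.direction != direction:
            continue
        if rule.protocol not in ("*", protocol):
            continue
        if ip not in ipaddress.ip_network(rule.remote_prefix):
            continue
        if rule.port not in ("*", port):
            continue
        return rule.action, rule.name
    # Assumption for this sketch: no matching rule means the traffic is denied.
    return "Deny", None
```

The useful part of the result is the rule name: it tells you which rule to change when traffic is blocked unexpectedly.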

Topology shows you vNets, subnets, NICs, VMs, NSGs, and IP addresses and their relationships.

Network Watcher Topology

NSG flow logs centralize the storage of NSG logs in a storage account. You can set a retention policy of 1 to 365 days for the data; if you don't, the system will keep it forever. Traffic Analytics (preview released March 6, 2018) builds on the NSG flow log data and presents it in several useful views. It aggregates the logs by eliminating redundant data. It then enriches them by adding geographic information for IP addresses, identifying Azure addresses, mapping IPs to vNets, identifying known malicious IP addresses, and finally building a topology map.
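To give a feel for what "eliminating redundant data" can mean, here is a small sketch that parses flow records and collapses repeated flows into counts. It assumes the comma-separated flow tuple layout of version 1 NSG flow logs (timestamp, source IP, destination IP, source port, destination port, protocol T/U, direction I/O, decision A/D). The grouping key is my own choice for illustration and is not the algorithm Traffic Analytics uses.

```python
from collections import Counter


def parse_flow_tuple(line):
    """Parse one flow tuple string into a dict (assumed version 1 layout)."""
    ts, src, dst, sport, dport, proto, direction, decision = line.split(",")[:8]
    return {
        "timestamp": int(ts),
        "src_ip": src,
        "dst_ip": dst,
        "src_port": int(sport),
        "dst_port": int(dport),
        "protocol": proto,       # "T" for TCP, "U" for UDP
        "direction": direction,  # "I" for inbound, "O" for outbound
        "decision": decision,    # "A" for allowed, "D" for denied
    }


def aggregate(lines):
    """Count flows that differ only in timestamp and ephemeral source port."""
    counts = Counter()
    for line in lines:
        f = parse_flow_tuple(line)
        key = (f["src_ip"], f["dst_ip"], f["dst_port"],
               f["protocol"], f["direction"], f["decision"])
        counts[key] += 1
    return counts
```

Two denied connection attempts from the same source to the same destination port end up as a single entry with a count of 2, which is far easier to read than the raw log.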

If you have a vNet gateway for site-to-site VPN connections, you can use the vNet gateway troubleshooting feature of Network Watcher to identify problems with connectivity.

If you deploy Application Gateway, a layer-7 application delivery controller virtual appliance with SSL offloading and a built-in Web Application Firewall (WAF), you can use Azure Log Analytics (OMS) to monitor for client and server errors, requests per hour, failed requests per hour, and errors by user agent.
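These metrics are simple aggregations over access log records. The sketch below shows the idea with plain dictionaries; the record keys (time, status, user_agent) are hypothetical and do not reflect the actual Application Gateway log schema, and treating any status of 400 or higher as a failed request is an assumption.

```python
from collections import Counter
from datetime import datetime


def summarize(records):
    """Return (requests_per_hour, failed_per_hour, errors_by_agent) counters.

    Each record is a dict with 'time' (ISO 8601 string), 'status' (int),
    and 'user_agent' (str).
    """
    requests_per_hour = Counter()
    failed_per_hour = Counter()
    errors_by_agent = Counter()
    for r in records:
        # Truncate the timestamp to the hour to build the bucket label.
        hour = datetime.fromisoformat(r["time"]).strftime("%Y-%m-%d %H:00")
        requests_per_hour[hour] += 1
        if r["status"] >= 400:  # client (4xx) and server (5xx) errors
            failed_per_hour[hour] += 1
            errors_by_agent[r["user_agent"]] += 1
    return requests_per_hour, failed_per_hour, errors_by_agent
```

A sudden jump in errors for a single user agent often points to a broken client or a scanner rather than a problem with the application itself.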

Going global

If you have users accessing your Azure services from all over the world, you're likely using Traffic Manager to route them using DNS to the closest Azure region where you host your application. Traffic Manager now has an added option called Traffic View, released March 19, 2018. It lets you see where your clients are coming from and their latency in reaching your service, presented on a geographical map.

Traffic View in Traffic Manager

As you have seen, Azure has many services that help you understand what's going on in your Azure or hybrid networks, but they're not always easy to find. Hopefully, armed with the information in this article, that'll be a bit easier.

In my next post I will discuss Azure Load Balancer.


© 4sysops 2006 - 2023

