And this was true in the early days of the cloud. Trying to get a network packet capture to troubleshoot a problem was nearly impossible. But that was a long time ago (in cloud years), and now several tools can help you out, although they're not always easy to find. In this article, we'll look at what's available and what's coming in Azure for network monitoring, alerting, and insights. In the process, we'll take away one more reason for shunning the public cloud.
Why monitor
We often bring out network monitoring tools only when trouble is afoot. However, we should really use them for security and performance insights on an ongoing basis as well as measuring availability of applications. Analyzing your normal network traffic makes it a whole lot easier to spot anomalies.
In this new cloudy world, it's also fair to say that many sysops people who leave the networking to the "Cisco guy" on premises find themselves doing a lot more networking in public clouds, where the line between the two roles is blurrier.
Azure offers Network Performance Monitor (NPM), DNS Analytics, Network Security Group (NSG) Log Analytics, and App Gateway Analytics. More recently, they've gathered disparate tools together under the Network Watcher umbrella.
Network Performance Monitor
NPM is all about latency and loss for your hybrid network. It doesn't rely on SNMP (and thus is not bound to particular routers or switches). Instead, it uses an agent on your local network as well as an agent in your virtual network (vNet). NPM offers the choice of ICMP (which ping uses) or TCP for its probes. Inventory your environment before deciding: TCP more closely mimics your normal network traffic, and many firewalls drop ICMP by default. An EnableRules.ps1 PowerShell script lets you open the required port (8084 by default). It also sets the registry keys and Windows Firewall rules that TCP probing needs. For ICMP, you need to create six rules in Windows Firewall (see the link above).
NPM gives you a hop-by-hop view of your network segments and also provides bandwidth usage and alerting. Microsoft released NPM for ExpressRoute monitoring to general availability (GA) on February 15, 2018. It offers features such as the network state recorder, which lets you go back to a picture of the network at a particular point in time and so makes it easier to track down intermittent, transient issues. For ExpressRoute monitoring, only TCP is available.
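To make the idea of a TCP-based latency and loss probe concrete, here is a minimal Python sketch. It is not how the NPM agent works internally; it only times a TCP handshake to a port and summarizes a batch of results. The function names `tcp_probe` and `summarize` are made up for this illustration.

```python
import socket
import time
from typing import Dict, List, Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the probe fails."""
    start = time.perf_counter()
    try:
        # create_connection completes the TCP handshake, so a blocked or
        # closed port surfaces here as a timeout or a refusal (both OSError).
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def summarize(results: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Compute loss percentage and average latency over a batch of probes."""
    ok = [r for r in results if r is not None]
    loss_pct = 100.0 * (len(results) - len(ok)) / len(results) if results else 0.0
    avg_ms = sum(ok) / len(ok) if ok else None
    return {"loss_pct": loss_pct, "avg_ms": avg_ms}
```

Calling something like `summarize([tcp_probe("10.0.0.4", 8084) for _ in range(5)])` against a host of your own (the address here is a placeholder) gives a rough feel for what a TCP probe measures, and why a firewall that drops the port shows up as loss.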
Operations Management Suite (OMS) displays, slices, and dices the data. You can use NPM in three modes: for performance monitoring for any generic network segments (including on-premises locations, Azure and other clouds), for service endpoint monitoring, or for ExpressRoute monitoring. As the name suggests, service endpoint monitoring gives you a view of connectivity to SaaS and PaaS services, websites, and SQL databases. It has built-in tests for Office and Dynamics 365.
All three modes provide a topology view showing network connections and drill-downs for latency. The ExpressRoute monitoring also offers peering and circuit views. It updates data every 3 seconds (with a resolution of 100 ms) and uploads it to the cloud every 3 minutes.
DNS Analytics
This (preview) service is also part of OMS/Azure Log Analytics. It gathers DNS-related information from Windows agents (not from the Linux agent yet) and from System Center Operations Manager (SCOM) connected management groups.
It lists clients attempting to resolve malicious domains (the kind that malware command and control services often use). It also shows the domains your clients query most frequently, the chatty clients (most clients will not make hundreds of thousands of DNS requests, so if they do, it could indicate malicious activity), and dynamic DNS registration failures.
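The two checks described here, spotting chatty clients and spotting lookups of known-bad domains, can be sketched in a few lines of Python. This is only an illustration of the logic, not the service's implementation; the `(client_ip, domain)` record shape, the function names, and the threshold are all assumptions.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple

# Hypothetical record shape: (client_ip, queried_domain)
DnsQuery = Tuple[str, str]


def chatty_clients(queries: Sequence[DnsQuery], threshold: int) -> List[Tuple[str, int]]:
    """Return (client, query_count) pairs at or above threshold, busiest first."""
    counts = Counter(client for client, _ in queries)
    return [(client, n) for client, n in counts.most_common() if n >= threshold]


def malicious_lookups(queries: Sequence[DnsQuery], blocklist: Set[str]) -> Set[str]:
    """Return the clients that tried to resolve any domain on the blocklist."""
    return {
        client
        for client, domain in queries
        # Normalize case and a trailing root dot before comparing.
        if domain.lower().rstrip(".") in blocklist
    }
```

In practice the threshold would be tuned to your environment, since a busy proxy or mail server legitimately makes far more queries than a workstation.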
As with NPM, you can also use the powerful query language of Log Analytics to slice the underlying log data to find exactly what you're looking for.
Network Watcher
What NPM does for your WAN links, Network Watcher does for your Azure resources. It provides topology views, filtered packet captures, IP Flow verify, NSG view and flow logging, vNet gateway troubleshooting, and connection troubleshooting. You can also compare the number of vNets, NSGs, public IP addresses, and load balancers against the limits of your subscription.
To enable Network Watcher, go to All Services – Networking – Network Watcher and enable it for all regions (or only specific regions) and subscriptions. After you enable it, the left-hand menu lights up with options.
Packet capture uses the Network Watcher agent VM extension for Windows or Linux. After you've defined a storage account to store the captures in, it lets you filter on protocol (TCP or UDP) and on source and destination IP address and port (the 5-tuple). Use Storage Explorer to download the capture for analysis. You can also create an alert to trigger the capture based on a condition in the VM.
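If the 5-tuple idea is new to you, the following Python sketch shows how such a filter decides whether a packet belongs in a capture. The `Packet` and `CaptureFilter` classes are hypothetical and exist only for this illustration; they are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str   # "TCP" or "UDP"
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class CaptureFilter:
    """A 5-tuple filter; any field left as None acts as a wildcard."""
    protocol: Optional[str] = None
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        wanted = (self.protocol, self.src_ip, self.src_port, self.dst_ip, self.dst_port)
        actual = (pkt.protocol, pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port)
        # Every non-wildcard field must equal the packet's value.
        return all(w is None or w == a for w, a in zip(wanted, actual))
```

A filter such as `CaptureFilter(protocol="TCP", dst_port=443)` would keep HTTPS traffic from any source and drop everything else, which keeps the capture file small.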
IP Flow verify lets you pick a VM, NIC, protocol, port, and traffic direction. It'll then analyze connectivity to an external IP address and show you whether the network allows or blocks the traffic and which NSG rules allow or block it.
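The reasoning behind such a check can be sketched as a priority-ordered rule walk: NSG rules are processed from the lowest priority number upward, and the first matching rule decides the outcome. The Python below is a deliberately simplified model. The `Rule` fields and `verify_flow` are hypothetical, it matches only a remote prefix and a single port, and the fall-through "Deny" stands in for the default rules a real NSG carries.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int          # lower number is evaluated first
    direction: str         # "Inbound" or "Outbound"
    protocol: str          # "TCP", "UDP", or "*" for any
    remote_prefix: str     # CIDR block, e.g. "0.0.0.0/0"
    port: Optional[int]    # None matches any port
    access: str            # "Allow" or "Deny"


def verify_flow(rules: Iterable[Rule], direction: str, protocol: str,
                remote_ip: str, port: int) -> Tuple[str, Optional[str]]:
    """Return (access, rule_name) for the first rule that matches the flow."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.direction != direction:
            continue
        if rule.protocol not in ("*", protocol):
            continue
        if ip_address(remote_ip) not in ip_network(rule.remote_prefix):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return rule.access, rule.name
    # Assumption for this sketch: traffic that matches no rule is denied.
    return "Deny", None
```

Returning the rule name along with the verdict mirrors what makes the feature useful: you learn not only that traffic is blocked but which rule to go and fix.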
Topology shows you vNets, subnets, NICs, VMs, NSGs, and IP addresses and their relationships.
NSG flow logs centralize the storage of NSG logs in a storage account. You can set a retention policy of 1 to 365 days for the data; if you don't, the system will keep it forever. It aggregates logs by eliminating redundant data. It then enriches the logs by adding geographic information for IP addresses, identifying Azure addresses, mapping IPs to vNets, identifying known malicious IP addresses, and finally building a topology map. Traffic Analytics (preview released March 6, 2018) builds on the NSG data and presents it in several useful views.
If you have a vNet gateway for site-to-site VPN connections, you can use the vNet gateway troubleshooting part of Network Watcher to identify problems with connectivity.
Application Gateway is a layer-7 application delivery controller virtual appliance with SSL offloading and a built-in Web Application Firewall (WAF). Once you've deployed it, you can use Azure Log Analytics (OMS) to monitor for client and server errors, requests per hour, failed requests per hour, and errors by user agent.
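As a rough illustration of what "eliminating redundant data" and a retention policy mean, here is a small Python sketch. It does not parse the real flow log format. The `Flow` record and the function names are assumptions made for this example, and `retention_days=None` models the "keep forever" case.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str   # "TCP" or "UDP"
    allowed: bool


def aggregate(flows: Iterable[Flow]) -> Dict[Flow, int]:
    """Collapse identical flow records into one entry with a hit count."""
    return dict(Counter(flows))


def top_denied_sources(flows: Iterable[Flow]) -> List[Tuple[str, int]]:
    """Rank source addresses by how many of their flows were denied."""
    denied: Counter = Counter()
    for flow, hits in aggregate(flows).items():
        if not flow.allowed:
            denied[flow.src_ip] += hits
    return denied.most_common()


def is_expired(logged_on: date, today: date, retention_days: Optional[int]) -> bool:
    """None means no retention policy is set, so nothing ever expires."""
    if retention_days is None:
        return False
    if not 1 <= retention_days <= 365:
        raise ValueError("retention must be between 1 and 365 days")
    return today - logged_on > timedelta(days=retention_days)
```

Collapsing repeats first keeps later steps, such as ranking denied sources or enriching addresses with location data, working on far fewer records.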
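The metrics listed here are simple groupings over access log records. The Python sketch below shows the idea with a hypothetical `(timestamp, status, user_agent)` record; it does not read real Application Gateway logs, and treating every 4xx or 5xx response as a "failed" request is an assumption of this example.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple

# Hypothetical record shape: (timestamp, HTTP status code, user agent)
AccessRecord = Tuple[datetime, int, str]


def classify(status: int) -> str:
    """Bucket a status code as a client error, a server error, or ok."""
    if 400 <= status <= 499:
        return "client_error"
    if 500 <= status <= 599:
        return "server_error"
    return "ok"


def requests_per_hour(records: Iterable[AccessRecord], failed_only: bool = False) -> Counter:
    """Count requests per hour; with failed_only, count only 4xx/5xx responses."""
    hours: Counter = Counter()
    for ts, status, _agent in records:
        if failed_only and classify(status) == "ok":
            continue
        # Truncate the timestamp to the start of its hour to form the bucket.
        hours[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return hours


def errors_by_user_agent(records: Iterable[AccessRecord]) -> Counter:
    """Count error responses per user agent string."""
    return Counter(agent for _ts, status, agent in records if classify(status) != "ok")
```

A single user agent dominating the error counts often points to a misbehaving client or a scanner rather than a problem with the application itself.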
Going global
If you have users accessing your Azure services from all over the world, you're likely using Traffic Manager to route them using DNS to the closest Azure region where you host your application. Since March 19, 2018, it has had an added option called Traffic View. It lets you see where your clients are coming from and their latency in reaching your service, presented in a geographical map.
As you have seen, Azure has many services that help you understand what's going on in your Azure or hybrid networks, but they're not always easy to find. Hopefully, armed with the information in this article, finding them will be a bit easier.
In my next post I will discuss Azure Load Balancer.