Get familiar with the Azure Status dashboard, Azure Service Health, Service Health alerts, Azure Resource Health, and Azure Resource Graph to ensure you're never surprised by a planned or unplanned outage that could affect your Microsoft Azure deployments.

You're familiar with the shared responsibility model of cloud computing, correct? This means that the cloud service provider, for instance Microsoft Azure, provides the "on tap" physical infrastructure and is responsible for its availability and security. You, the customer, are responsible for the services and data you consume in the cloud provider's environment.

Stated another way, the shared responsibility model says that Microsoft is responsible for the security of its cloud, and you are responsible for security inside that cloud.

To uphold their responsibilities, Microsoft needs to perform planned maintenance on its Azure infrastructure. How can you stay ahead of these events and plan against a possible service outage?

Conversely, how can you quickly determine whether an outage you experience in your Azure subscription falls on your side or Microsoft's side of the shared responsibility model?

If you've had those questions, then you're in the right place. Let me give you a tour of the three Microsoft health services that provide you with those insights.

Azure Status dashboard

The Azure Status dashboard (https://status.azure.com) is a public webpage that enables you to review service availability across all Azure regions. Because Azure comprises nearly 200 services, this dashboard can be cumbersome to navigate. I recommend pressing CTRL+F and searching for the service for which you need status information.

Azure Status dashboard

If you're an old dog like me, you'll appreciate that you can subscribe to the Azure Status page by using Really Simple Syndication (RSS). In my opinion, the main advantage of the Azure Status dashboard is its easy accessibility. Its main disadvantage is that its display isn't specific to the particular Azure regions you're using. That's where Azure Service Health comes in.
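
If you want to do more with the RSS feed than read it in a feed reader, here is a minimal Python sketch that pulls the entries out of RSS 2.0 text and filters them by a service name. The sample feed below is made up for illustration (it is not real Azure status data), and the function names are my own. I parse a hard-coded string here, so fetching the real feed is left out on purpose.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like an RSS 2.0 feed; not real Azure status data.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example status feed</title>
    <item>
      <title>Example incident: Storage - West Europe</title>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
      <description>Illustrative text only.</description>
    </item>
    <item>
      <title>Example incident: Virtual Machines - East US</title>
      <pubDate>Tue, 02 Jan 2024 12:30:00 GMT</pubDate>
      <description>Illustrative text only.</description>
    </item>
  </channel>
</rss>"""


def parse_feed_items(feed_xml):
    """Return a list of {title, published, summary} dicts from RSS 2.0 text."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "published": item.findtext("pubDate", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


def filter_by_keyword(items, keyword):
    """Keep only items whose title mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [i for i in items if needle in i["title"].lower()]


if __name__ == "__main__":
    for entry in filter_by_keyword(parse_feed_items(SAMPLE_FEED), "storage"):
        print(entry["published"], "|", entry["title"])
```

Filtering on the title is a crude stand-in for the CTRL+F trick: it narrows the feed to one service, but it still can't tell which regions you actually use.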

Azure Service Health

Azure Service Health is a personalized Azure status dashboard accessible from within the Azure portal.

Azure Service Health blade

What I love about the Service issues blade (shown in the previous screenshot) is that it shows status only for the regions in which you've actually deployed Azure services. You want to see "No service issues found" whenever you visit this page. Note that you can click Health history to view historical advisories that affected your resources.

The Planned maintenance blade, shown in the following screenshot, gives you notice of any scheduled Microsoft-side maintenance that could affect your resources. Note that you can download an incident summary as a PDF file for inclusion in your organizational issue-tracking system.

Planned maintenance notifications

Health advisories and Security advisories provide filtered views of Azure bulletins that affect your service health and security hygiene, respectively. For instance, the next screenshot shows a recent security advisory I (and many, many other customers) received regarding Cosmos DB.

Azure security advisory concerning an infamous Cosmos DB vulnerability

The Service Health dashboard is also a convenient place to review whether Microsoft violated its Azure service-level agreements (SLAs) with you as a result of a planned or unplanned outage. You'll then be provided with contact information and an issue number to track.

Service Health alerts

On each Active events blade, you'll see a toolbar button to define a corresponding alert. Please don't forget about Azure alerts. By setting a Service Health alert, for example, you and your team no longer have to rely upon good old human memory to remember to check the Service Health dashboard periodically.

Creating an Azure Service Health alert rule

Azure alerts are tied to action groups, which enable notifications (email, SMS, push, and voice) and code execution (webhook, function, logic app, or automation runbook). Azure alerts and action groups are powerful tools to have in your Azure governance arsenal.
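
If you'd rather script the alert than click through the portal, here is a minimal Python sketch of what the rule definition might look like. It assumes the rule is an activity log alert filtered on the ServiceHealth category and wired to one action group. The function name and the placeholder subscription and action group IDs are hypothetical, and the field names reflect my understanding of the activity log alert schema, so check them against the current Azure Monitor documentation before you deploy anything.

```python
import json


def build_service_health_alert(subscription_id, action_group_id, incident_types=None):
    """Build the body of an activity log alert rule for Service Health events.

    Assumption: field names follow the Microsoft.Insights/activityLogAlerts
    resource shape as I understand it; verify before deploying.
    """
    # Every Service Health alert filters on the ServiceHealth category.
    conditions = [{"field": "category", "equals": "ServiceHealth"}]

    # Optionally narrow the rule to specific event types,
    # for example "Incident", "Maintenance", or "Security".
    if incident_types:
        conditions.append({
            "anyOf": [
                {"field": "properties.incidentType", "equals": incident_type}
                for incident_type in incident_types
            ]
        })

    return {
        "location": "Global",
        "properties": {
            "enabled": True,
            "description": "Notify the team about Service Health events.",
            "scopes": ["/subscriptions/" + subscription_id],
            "condition": {"allOf": conditions},
            # The action group decides who is notified and what code runs.
            "actions": {"actionGroups": [{"actionGroupId": action_group_id}]},
        },
    }


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    rule = build_service_health_alert(
        "00000000-0000-0000-0000-000000000000",
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/example-rg/providers/microsoft.insights"
        "/actionGroups/example-ag",
        incident_types=["Incident", "Maintenance"],
    )
    print(json.dumps(rule, indent=2))
```

Keeping the rule as data like this makes it easy to put under source control next to the rest of your infrastructure definitions.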

Azure Resource Health

The most granular Azure status tool is called Resource Health, which you can find in the Support + Troubleshooting settings section for individual Azure resources.

Azure Resource Health blade

The idea with Resource Health is that you can spot (and alert on) times when the Azure management backplane is unable to communicate with your resource for whatever reason.

Azure-side events and interruptions are called "platform events." In contrast, your own (mis)configurations can result in the resource losing its connectivity to Azure; these "non-platform events" are recorded here as well.

Azure Resource Health also lists events that degrade your resource performance even if there has not been a complete outage.
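
To make the platform versus non-platform distinction concrete, here is a small Python sketch that sorts Resource Health events by who caused them and ignores resources that are healthy. The events are hypothetical and simplified. The "cause" values (PlatformInitiated, UserInitiated) and the "currentHealthStatus" values (Available, Degraded, Unavailable) are my assumption about how Resource Health reports these events in the activity log, so treat the key names as illustrative.

```python
# Hypothetical, simplified Resource Health events. The key names and values
# are assumptions modeled on the activity log properties as I understand them.
SAMPLE_EVENTS = [
    {"resource": "vm-web-01", "cause": "PlatformInitiated", "currentHealthStatus": "Unavailable"},
    {"resource": "vm-web-02", "cause": "UserInitiated", "currentHealthStatus": "Unavailable"},
    {"resource": "sql-db-01", "cause": "PlatformInitiated", "currentHealthStatus": "Degraded"},
    {"resource": "vm-web-03", "cause": "Unknown", "currentHealthStatus": "Available"},
]


def classify_event(event):
    """Map an event's cause to 'platform', 'non-platform', or 'unknown'."""
    cause = event.get("cause", "Unknown")
    if cause == "PlatformInitiated":
        return "platform"
    if cause == "UserInitiated":
        return "non-platform"
    return "unknown"


def events_needing_attention(events):
    """Return events whose resource is anything other than Available.

    This includes degraded resources, not only complete outages.
    """
    return [e for e in events if e.get("currentHealthStatus") != "Available"]


def summarize(events):
    """Group the names of unhealthy resources by who caused the event."""
    summary = {"platform": [], "non-platform": [], "unknown": []}
    for event in events_needing_attention(events):
        summary[classify_event(event)].append(event["resource"])
    return summary


if __name__ == "__main__":
    for side, resources in summarize(SAMPLE_EVENTS).items():
        print(side, "->", resources)
```

Anything that lands in the "platform" bucket is Microsoft's to fix; the "non-platform" bucket is where you go looking at your own changes first.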

Something I haven't mentioned explicitly thus far, but that I think is important, is that Azure Service Health and Resource Health both provide remediation advice in addition to outlining the issue's root cause analysis (RCA) results.

Azure Resource Graph

You can query Azure Service Health and Resource Health data by using Azure Resource Graph. Resource Graph is a fully managed, performant database of all your Azure subscription resources. For instance, when you search in the Azure portal, the results you see are provided to you by Azure Resource Graph.

Azure Resource Graph Explorer is a Resource Graph client available to you in the Azure portal. Use Kusto Query Language (KQL) to define your queries. As an example, the following screenshot answers the question, "What is the current availability state of my Azure virtual machines?"

Resource Graph Explorer
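
Here is a minimal Python sketch of how a question like that might be asked outside the portal. The KQL text is my own guess at a suitable query, not necessarily the one in the screenshot: it assumes the HealthResources table and the availabilitystatuses resource type expose the targetResourceType, targetResourceId, and availabilityState properties. The helper names, the placeholder subscription ID, and the sample rows are hypothetical, and I deliberately stop short of sending the request, so authentication is out of scope.

```python
import json

# Assumed KQL: table, type, and property names should be checked in
# Resource Graph Explorer before you rely on them.
VM_AVAILABILITY_QUERY = """
HealthResources
| where type =~ 'microsoft.resourcehealth/availabilitystatuses'
| where tostring(properties.targetResourceType) =~ 'microsoft.compute/virtualmachines'
| project resourceId = tolower(tostring(properties.targetResourceId)),
          availabilityState = tostring(properties.availabilityState)
""".strip()

# Hypothetical rows shaped like the projected columns above.
SAMPLE_ROWS = [
    {"resourceId": "/subscriptions/example/vm-web-01", "availabilityState": "Available"},
    {"resourceId": "/subscriptions/example/vm-web-02", "availabilityState": "Unavailable"},
    {"resourceId": "/subscriptions/example/vm-web-03", "availabilityState": "Available"},
]


def build_request_body(subscription_ids, query=VM_AVAILABILITY_QUERY):
    """Build a JSON-serializable body for a Resource Graph query request."""
    return {"subscriptions": list(subscription_ids), "query": query}


def count_by_state(rows):
    """Count virtual machines per availability state."""
    counts = {}
    for row in rows:
        state = row.get("availabilityState", "Unknown")
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    body = build_request_body(["00000000-0000-0000-0000-000000000000"])
    print(json.dumps(body, indent=2))
    print(count_by_state(SAMPLE_ROWS))
```

You can paste the query text on its own into Resource Graph Explorer to try it before wiring it into a script.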

Takeaways

As always, I'd like to leave you with a number of hand-selected learning resources. I hope you now have a better grasp of how to stay on top of platform- and non-platform-related changes throughout your Azure infrastructure.

© 4sysops 2006 - 2021
