Azure Operational Insights is Microsoft's new cloud-based log management tool. It collects data from multiple machines and sources, such as Windows event logs, SCOM alerts, update status, and antimalware status, and stores it in the cloud for big data analysis to identify the cause of operational issues.

Once upon a time, there was System Center Advisor (SCA). SCA’s preview was in 2011, which, in public cloud services, is the equivalent of the Stone Age. The service gathered logs from your servers and then uploaded them to the cloud, where they were analyzed and compared against best practices. Any deviation was flagged on the console (originally a web-based console that was eventually integrated into System Center Operations Manager [SCOM]), with links to KB articles for remediation.

But SCA was a fairly limited service. It was only a matter of time before it morphed into something far more interesting: Azure Operational Insights. The SCA functionality is still there, but Operational Insights offers a lot more. You can sign up and use the service, currently in Preview, for free.

The two main drivers of Operational Insights are Big Data—the ability to use machine learning to analyze large quantities of information—and the need for real-time troubleshooting. SCOM is very good at catching known problems (through rules and knowledge contained in its Management Packs), but when you hit a problem not previously known, you have to manually sift through log files to find the cause. If the issue affects multiple servers, you have to correlate information from multiple sources—something that can take a long time. Operational Insights offers an attractive alternative that’s easy to use.

Setting up Operational Insights

There are three ways to connect your servers to Operational Insights. If you have SCOM 2012/2012 R2 with the latest Cumulative Update/Update Rollup Package, you can connect SCOM to the cloud. If you have (a smaller number of) standalone servers, you can install the agent on each of them and connect them to the cloud. Finally, if you have IaaS, PaaS, or storage assets in Azure, you can connect them to Operational Insights for monitoring.

The agent mentioned above is the Microsoft Monitoring Agent (MMA), the new flavor of the SCOM 2012 R2 probe. Up until the 2012 version, the SCOM agent was tied to OM; with 2012 R2, the agent is now standalone and can be connected to other management systems (such as Operational Insights).

Microsoft Monitoring Agent installation

Note that if you install the MMA on a server that already has a SCOM agent installed, the existing agent will be upgraded and the SCOM configuration will be transferred. You still have to manually configure the agent to talk to Operational Insights as well (through the Control Panel applet).

No inbound ports need to be opened in your firewall. All traffic is outbound over port 443 and is SSL encrypted with a self-signed certificate generated automatically on each server where you install the agent. If you use a proxy for Internet connectivity, the agent can be configured to use it. Note that this only enables the agent to communicate; a user can’t use a browser on the server to access the Internet (unless the browser is also configured with the right proxy and username/password).
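Before installing the agent, it can be worth confirming that a server is able to open an outbound connection on port 443 at all. The following is a minimal sketch, not part of the product: the function name `can_reach` is made up for illustration, it connects directly (it does not go through a proxy), and you would need to supply the actual service endpoint yourself.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and opens a TCP socket;
        # the context manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_reach("example.com")` (a placeholder host, not the real endpoint) returns `True` only when the TCP handshake completes. A `False` result on a server that sits behind a proxy does not necessarily mean the agent will fail, because the agent can be pointed at the proxy.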

The Console

One complaint I hear from many SCOM users is the speed of the console. Microsoft has attempted to improve this over the years but has not had much success. I’m happy to report that the web-based console for Operational Insights is very responsive, even when dealing with large datasets, and it works on Chrome, Safari, and Firefox on iPhone/iPad/PC as well as the browser in Android.

As you start narrowing your search results, you create a breadcrumb navigation that allows you to “go back up” in case your particular focus turns out to be a dead end. The “time distribution” chart on the top left provides another way of zeroing in on a particularly busy period of a specific event.
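To make the idea concrete, here is a small sketch of how a breadcrumb of filters and an hourly time distribution could be modeled. It is purely illustrative: the record fields (`time`, `computer`, `event_id`), the sample events, and the function names are assumptions, and this is not how the service itself is implemented.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; real data would come from the service.
EVENTS = [
    {"time": datetime(2014, 11, 3, 9, 5), "computer": "SRV01", "event_id": 7036},
    {"time": datetime(2014, 11, 3, 9, 40), "computer": "SRV02", "event_id": 1001},
    {"time": datetime(2014, 11, 3, 9, 55), "computer": "SRV01", "event_id": 1001},
    {"time": datetime(2014, 11, 3, 14, 10), "computer": "SRV01", "event_id": 7036},
]


def narrow(events, breadcrumb):
    """Apply each (field, value) filter in the breadcrumb, in order."""
    for field, value in breadcrumb:
        events = [e for e in events if e[field] == value]
    return events


def time_distribution(events):
    """Count events per hour, keyed by the start of that hour."""
    return Counter(
        e["time"].replace(minute=0, second=0, microsecond=0) for e in events
    )


breadcrumb = [("computer", "SRV01"), ("event_id", 7036)]
focused = narrow(EVENTS, breadcrumb)
# Going "back up" one level is just dropping the last filter.
wider = narrow(EVENTS, breadcrumb[:-1])
```

Keeping the filters as an ordered list is what makes stepping back cheap: each shorter prefix of the list is a valid, wider search.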

It’s possible to export the result of any search to Excel. Note that only the first 1,000 rows will exist in the spreadsheet. This is something the product team is seeking feedback on. Of course, it doesn’t make sense to have 2 million rows in an Excel sheet, but 1,000 seems a bit too conservative to me.

Behind the scenes in Azure are two storage clusters for your (and everyone else’s) data: a hot cluster for short-term data and cold storage for long-term data, which takes a little longer to retrieve. Note that each Operational Insights customer’s data is kept in a separate partition.

Azure Operational Insights Windows Phone app

A mobile app is also offered—currently only for Windows Phone, but iOS and Android versions are reportedly coming. You create a “My dashboard” on the web console with just the data you’re interested in and save it; this particular dashboard will then show up in the app. Your saved searches are also available in the app, along with the built-in ones. The app looks at the last seven days’ worth of data by default, but you can increase this to 31 days or select a shorter time span.

Azure Insights My dashboard

Intelligence Packs

Intelligence Packs (IPs) are perhaps the most exciting part of Operational Insights. They extend its functionality by “massaging” the log data for a particular purpose. Today, six public IPs are available to set up, and I would recommend enabling all of them.

Intelligence pack gallery

The Malware Assessment IP uses your data to identify infections and the status of antimalware, while the System Update Assessment IP looks at what KBs are installed and which ones are missing across all your servers (with clickable links to the missing ones). One that’ll show immediate value is the Change Tracking IP, which lets you know about configuration changes (including OS, Exchange, and SQL) and software installations/uninstallations.
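Conceptually, a missing-update report boils down to a set difference per server. The sketch below shows that idea only; the server names, the KB identifiers, and the function `missing_updates` are hypothetical and say nothing about how the IP actually works internally.

```python
def missing_updates(installed_by_server, required):
    """Map each server to the sorted list of required KBs it lacks.

    Servers that already have every required KB are left out of the result.
    """
    report = {}
    for server, installed in installed_by_server.items():
        missing = required - installed  # set difference
        if missing:
            report[server] = sorted(missing)
    return report


# Hypothetical inventory; the KB numbers are placeholders, not real updates.
installed = {
    "SRV01": {"KB0000001", "KB0000002", "KB0000003"},
    "SRV02": {"KB0000001"},
}
required = {"KB0000001", "KB0000002", "KB0000003"}
report = missing_updates(installed, required)
```

Here `report` lists only `SRV02`, because `SRV01` already has everything in `required`.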

Log Management lets you capture, index, search, correlate, and analyze your Windows event logs across all servers, whereas Alert Management gathers all your SCOM alert information for analysis. The Capacity Planning IP relies on System Center Virtual Machine Manager, but, after feedback, Microsoft is looking to remove this dependency; after all, many businesses with smaller Hyper-V implementations don’t use VMM or System Center at all. The SQL Assessment IP, written by Product Support Service (PSS), gives your servers a rating based on best practices for SQL.

Currently in a closed preview, a Security IP collects security event and firewall logs and integrates with data from Microsoft’s Digital Crimes Unit. This looks really interesting and might ultimately be able to identify breaches and advanced persistent threats much more quickly than is possible today. Finally, there’s an AD Assessment IP (also in private preview) that looks at the health of your AD environment.

Conclusion

Operational Insights is free during its preview period. After that, it looks like there will be a free tier limited to 500 MB of data per day, with a retention period of only seven days. The Standard tier will have a month’s retention, whereas the Premium tier will offer 12 months. Standard will be $1.15 per GB of data sent to Azure, whereas Premium will be $1.75 per GB. You can read more here.
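For a rough feel of what those rates mean, here is a back-of-the-envelope sketch. It uses only the per-GB prices and the 500 MB daily limit quoted above; the 30-day month, the 1 GB = 1,000 MB conversion, and the function names are my own assumptions, and actual billing may well work differently.

```python
from decimal import Decimal

# Per-GB prices as quoted in the article.
RATE_PER_GB = {"standard": Decimal("1.15"), "premium": Decimal("1.75")}
# 500 MB per day, assuming 1 GB = 1,000 MB.
FREE_DAILY_LIMIT_GB = Decimal("0.5")


def fits_free_tier(gb_per_day):
    """True if the daily volume stays within the free tier's limit."""
    return Decimal(str(gb_per_day)) <= FREE_DAILY_LIMIT_GB


def monthly_cost(gb_per_day, tier, days=30):
    """Estimated cost for `days` days of uploads, rounded to cents."""
    total = Decimal(str(gb_per_day)) * days * RATE_PER_GB[tier]
    return total.quantize(Decimal("0.01"))


standard = monthly_cost(2, "standard")  # 2 GB/day x 30 days x $1.15
premium = monthly_cost(2, "premium")    # 2 GB/day x 30 days x $1.75
```

Under these assumptions, a site sending 2 GB a day would pay about $69 a month on Standard and $105 on Premium, while anything at or below 0.5 GB a day would fit in the free tier.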

Comparisons to Splunk (and other log analysis tools) are appropriate, but I think the target audience is different. Whereas Splunk can work with any log file format, Operational Insights is specifically made to “understand” Windows and its server applications data. The IPs are also a typical Microsoft approach to take complicated concepts and difficult log analysis and build them into an easy-to-use tool for ordinary IT pros—not just the one guru who took the time to learn a complicated search language. Another difference is that offering this as a cloud service instead of a locally installed application provides opportunities for cloud-scale Big Data analysis, which would be very difficult to replicate locally if you have a larger environment.

This last point is, of course, the Achilles’ heel of Operational Insights. Many businesses will have reservations about their log data being stored in Azure. Microsoft attempts to alleviate these fears with this document (available on the Preview page, under System Security) as well as this page in the documentation, but I know there will be some heated discussions about this within IT teams.

Another point is whether this service is going to be a replacement for SCOM. Again, I see it as complementing, not replacing, the abilities of SCOM. There’s no real-time alerting, no Management Packs (IPs aren’t the same thing), and no monitoring of your infrastructure in the sense that Operations Manager offers. I think Operational Insights is a great example of how Microsoft is going to complement its on-premises software with cloud services in the future.

Get in on the action while Operational Insights is free. Install the agent on a couple of lab servers and play around; it’s great fun, and I believe that most IT teams should gain worthwhile insights pretty quickly.

© 4sysops 2006 - 2023