Never trust a statistic you didn’t forge yourself. This piece of unquestionable wisdom came to mind when I recently read about a statistic from Google “showing” that IIS servers host malware more often than Apache web servers. An interesting InfoWorld article discusses the validity of these statistics in the light of another study revealing that about 9,000 sites hosted by IPOWER attempt to install malware on visitors’ computers.

Like most web hosting companies, IPOWER works with Apache. Roger A. Grimes, the author of the InfoWorld article, calculated that, on average, every IPOWER server hosts about 910 virtual web servers.

Therefore, statistics like the one from Google or, more prominently, the ones from Netcraft can’t tell you anything interesting about the market shares of IIS and Apache. They count internet domains, not web server installations. All those private homepage users didn’t choose Apache as their web server software. Most of them think of an Indian tribe when they hear “Apache”, not the software. So these statistics only show that web hosting companies prefer Apache.
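To make the difference between counting domains and counting servers concrete, here is a minimal sketch. It assumes that each distinct IP address stands for one server, which is a simplification, and every record in it is made up for illustration; none of it is data from Google, Netcraft, or IPOWER.

```python
from collections import defaultdict

# Hypothetical scan records: (domain, ip_address, server_software).
# All values are invented for illustration only.
records = [
    ("alice-homepage.example", "192.0.2.10", "Apache"),
    ("bobs-cat.example", "192.0.2.10", "Apache"),
    ("carol-blog.example", "192.0.2.10", "Apache"),
    ("dave-photos.example", "192.0.2.10", "Apache"),
    ("corp-intranet.example", "198.51.100.7", "IIS"),
    ("corp-shop.example", "198.51.100.8", "IIS"),
]


def share_by_domain(records):
    """Share of each server software when every domain counts once."""
    counts = defaultdict(int)
    for _domain, _ip, software in records:
        counts[software] += 1
    total = sum(counts.values())
    return {software: n / total for software, n in counts.items()}


def share_by_server(records):
    """Share of each server software when every distinct IP counts once.

    Assumption: one IP address equals one server.
    """
    software_by_ip = {}
    for _domain, ip, software in records:
        software_by_ip[ip] = software
    counts = defaultdict(int)
    for software in software_by_ip.values():
        counts[software] += 1
    total = sum(counts.values())
    return {software: n / total for software, n in counts.items()}


if __name__ == "__main__":
    print("by domain:", share_by_domain(records))
    print("by server:", share_by_server(records))
```

In this toy data set Apache serves two thirds of the domains but runs on only one of the three servers, because a single shared host carries four domains. With around 910 domains per server, as in the IPOWER estimate, the gap would be far larger.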

Furthermore, you can’t get any other relevant information from the data provided by Google. In particular, you can’t draw any conclusion about the distribution of malware on IIS and Apache web servers. And, of course, it doesn’t tell you which web server software is more secure.

In my view, the fact that 66% of all internet domains (not web servers, as the Google guys purport) run on Apache, while only 49% of all malicious domains (again, not web servers) use the open source web server, merely reflects the differences in their user bases. The typical “Apache user” only has a picture of himself and of his cat on his “web site”. IIS is most likely used more often in corporate environments. Hence, you will more often find complex web applications on IIS systems, which increases their attack surface.
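The arithmetic behind those two percentages can be sketched as follows. The 66% and 49% figures are the ones quoted above; the “other” group is my own simplification that lumps together everything that is not Apache, so it is not an IIS figure. The ratio only says how strongly a group is over- or under-represented among malicious domains. It says nothing about why.

```python
def representation_ratio(share_of_malicious, share_of_all):
    """Values above 1 mean over-representation among malicious domains."""
    return share_of_malicious / share_of_all


# Figures quoted in the text (shares of domains, not of servers).
apache_all = 0.66
apache_malicious = 0.49

# Simplifying assumption: everything that is not Apache is one group.
other_all = 1 - apache_all
other_malicious = 1 - apache_malicious

apache_ratio = representation_ratio(apache_malicious, apache_all)
other_ratio = representation_ratio(other_malicious, other_all)

if __name__ == "__main__":
    print(f"Apache domains: {apache_ratio:.2f}")
    print(f"other domains:  {other_ratio:.2f}")
```

Apache domains come out at roughly 0.74 and all other domains at roughly 1.5. That is a statement about domains and their owners, not about the security of either product.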

I think it is possible to gain more interesting information from the raw data of the Google log files, for example, if you take the IP addresses of the web servers into account. However, it seems that nobody at Google is really interested in this. If you gave the same raw data to someone from Microsoft, you would most likely get exactly the opposite results. That’s why I only believe in such statistics if I am the one who forged them ;-)
