PowerShell offers multiple options to test whether files are identical or not. If you only need this kind of information, you can use Get-FileHash. If you also want to know in what way the files differ, Compare-Object is the tool of choice.

Comparing files is a common task for system administrators. Hence, Windows has always offered command-line tools for this purpose. I am talking about comp.exe and fc.exe, two utilities that you shouldn’t write off at a time when PowerShell dominates the command prompt. In particular, fc.exe offers more options than the Compare-Object cmdlet does, so you will want to use it in special cases (such as when you want to display line numbers).
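As a small sketch of the line-number case, the snippet below calls fc.exe with its /N switch from a PowerShell prompt. The file names and contents are made up for the demo. Note that you have to type fc.exe in full, because fc on its own is a PowerShell alias for Format-Custom. The guard simply skips the call on systems where fc.exe is not available.

```powershell
# Only run the demo where fc.exe exists (it ships with Windows).
$fcExe = Get-Command fc.exe -ErrorAction SilentlyContinue

if ($fcExe) {
    # Throwaway demo files (hypothetical names) in the temp folder
    $tmp = [System.IO.Path]::GetTempPath()
    $old = Join-Path $tmp 'fcdemo-old.txt'
    $new = Join-Path $tmp 'fcdemo-new.txt'
    Set-Content -LiteralPath $old -Value 'first line', 'second line', 'third line'
    Set-Content -LiteralPath $new -Value 'first line', 'changed line', 'third line'

    # /N prefixes each reported line with its line number (text comparison).
    & fc.exe /N $old $new

    # fc.exe sets exit code 0 for identical files and 1 when they differ.
    $fcExitCode = $LASTEXITCODE

    Remove-Item -LiteralPath $old, $new
}
```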

You will often suspect that two files are duplicates and want to delete one of them. In such a scenario, you don’t have to examine the files for possible differences; you just want to know whether they are identical or not. With big or binary files, you should calculate a hash value because doing so is generally faster than loading both files with Get-Content and comparing them line by line. PowerShell offers the Get-FileHash cmdlet for this task. All you have to do is check whether the results are the same or not.

(Get-FileHash ".\file1.xml").Hash -eq (Get-FileHash ".\file2.xml").Hash

This example calls Get-FileHash with the two file names and compares the hash properties with the -eq operator. If the files are identical, the result is True; otherwise, it is False.
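If you run this check often, you could wrap it in a small function. The following is only a sketch: the name Test-FileIdentical and the demo file names are my own inventions, not part of PowerShell. It also compares the file lengths first, since files of different size cannot be identical and then no hashing is needed at all.

```powershell
# Hypothetical helper: returns $true when both files have identical content.
function Test-FileIdentical {
    param(
        [Parameter(Mandatory)] [string] $Path1,
        [Parameter(Mandatory)] [string] $Path2
    )

    # Cheap pre-check: different lengths mean different content.
    if ((Get-Item -LiteralPath $Path1).Length -ne (Get-Item -LiteralPath $Path2).Length) {
        return $false
    }

    # Same length, so compare the hashes (SHA256 is also the cmdlet's default).
    $hash1 = (Get-FileHash -LiteralPath $Path1 -Algorithm SHA256).Hash
    $hash2 = (Get-FileHash -LiteralPath $Path2 -Algorithm SHA256).Hash
    return $hash1 -eq $hash2
}

# Demo with throwaway files in the temp folder
$tmp   = [System.IO.Path]::GetTempPath()
$fileA = Join-Path $tmp 'hashdemo-a.txt'
$fileB = Join-Path $tmp 'hashdemo-b.txt'
$fileC = Join-Path $tmp 'hashdemo-c.txt'
Set-Content -LiteralPath $fileA -Value 'one', 'two'
Set-Content -LiteralPath $fileB -Value 'one', 'two'
Set-Content -LiteralPath $fileC -Value 'one', 'TWO'   # same length, different content

$sameAB = Test-FileIdentical -Path1 $fileA -Path2 $fileB
$sameAC = Test-FileIdentical -Path1 $fileA -Path2 $fileC
"A vs. B identical: $sameAB"
"A vs. C identical: $sameAC"

Remove-Item -LiteralPath $fileA, $fileB, $fileC
```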

Comparing files with Get-FileHash

Compare-Object doesn’t read file contents

If you also want to see the differences in detail, you have to work with Compare-Object (alias compare). As its name suggests, it compares objects of any kind, not just files. Consequently, it cannot read the files itself. Instead, you have to extract their content yourself and pass it to Compare-Object. Get-Content is a suitable cmdlet for the task:

Compare -ReferenceObject (Get-Content ".\file1.xml") -DifferenceObject (Get-Content ".\file2.xml")
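To see what the result looks like without touching your own files, here is a self-contained sketch that creates two small text files (the names and contents are assumptions for the demo) and compares them. Keep in mind that Compare-Object treats both inputs as collections of lines, so it tells you which lines exist on only one side, not at which position they occur.

```powershell
# Throwaway demo files (hypothetical names) in the temp folder
$tmp = [System.IO.Path]::GetTempPath()
$ref = Join-Path $tmp 'compare-ref.txt'
$dif = Join-Path $tmp 'compare-dif.txt'
Set-Content -LiteralPath $ref -Value 'alpha', 'beta', 'gamma'
Set-Content -LiteralPath $dif -Value 'alpha', 'BETA2', 'gamma', 'delta'

# Read both files into arrays of lines and compare them.
$diff = @(Compare-Object -ReferenceObject (Get-Content -LiteralPath $ref) `
                         -DifferenceObject (Get-Content -LiteralPath $dif))

# Each result object carries the line (InputObject) and a SideIndicator:
# '<=' = only in the reference file, '=>' = only in the difference file.
$diff | Format-Table -AutoSize

Remove-Item -LiteralPath $ref, $dif
```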

Inverse comparison, uppercase and lowercase

You can fine-tune how Compare-Object works with a couple of parameters. By default, the cmdlet is case insensitive, but you can change this behavior with the -CaseSensitive parameter.
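A quick way to see the effect of -CaseSensitive is to compare two small string arrays instead of files. The server names below are made up for the illustration.

```powershell
$reference  = 'Server01', 'Server02'
$difference = 'server01', 'Server02'

# Default behavior: case is ignored, so no differences are reported.
$default = @(Compare-Object -ReferenceObject $reference -DifferenceObject $difference)

# With -CaseSensitive, 'Server01' and 'server01' count as different lines.
$strict = @(Compare-Object -ReferenceObject $reference -DifferenceObject $difference -CaseSensitive)

"Differences by default:      $($default.Count)"
"Differences, case sensitive: $($strict.Count)"
```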

Comparing files with Compare-Object

If you add -IncludeEqual, all lines of both files will be listed, and the SideIndicator property shows whether a line appears in both files (==), only in the first file (<=), or only in the second file (=>). If you also use -ExcludeDifferent, you will only get those lines that are identical in both files.
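The following sketch shows both switches on two small arrays with made-up contents. I pass -IncludeEqual together with -ExcludeDifferent, as described above, which should behave the same across PowerShell versions.

```powershell
$reference  = 'alpha', 'beta', 'gamma'
$difference = 'alpha', 'gamma', 'delta'

# -IncludeEqual lists every line, including those present on both sides.
$all = @(Compare-Object -ReferenceObject $reference -DifferenceObject $difference -IncludeEqual)
$all | Format-Table -AutoSize

# Adding -ExcludeDifferent keeps only the lines that both sides share.
$common = @(Compare-Object -ReferenceObject $reference -DifferenceObject $difference -IncludeEqual -ExcludeDifferent)
$common.InputObject
```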

Modifying the contents with regex

You can compensate for the relatively meager options of Compare-Object by processing the contents of the files before the comparison with the help of the -replace operator and regular expressions. For instance, you could replicate the ability of fc.exe to compress tabs and spaces like this:

compare ((Get-Content ".\file1.xml") -replace '\s+', ' ') ((Get-Content ".\file2.xml") -replace '\s+', ' ')

In this example, every run of one or more whitespace characters (represented by \s) is replaced with a single space, so a tab in one file and several spaces in the other no longer count as a difference. For complex regular expressions, you could store the results of both Get-Content calls in variables and pass those to Compare-Object. This would improve the clarity of the code.
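Here is how the variant with variables might look. It is a sketch with made-up file names and contents; the pattern replaces each run of whitespace with a single space before the comparison.

```powershell
# Throwaway demo files (hypothetical names) that differ only in whitespace
$tmp   = [System.IO.Path]::GetTempPath()
$left  = Join-Path $tmp 'ws-left.txt'
$right = Join-Path $tmp 'ws-right.txt'
Set-Content -LiteralPath $left  -Value 'a    b', "c`t`td"
Set-Content -LiteralPath $right -Value 'a b', "c`td"

# Without preprocessing, all four lines are reported as different.
$rawDiff = @(Compare-Object (Get-Content -LiteralPath $left) (Get-Content -LiteralPath $right))

# Normalize every run of whitespace to one space and keep the results in variables.
$pattern    = '\s+'
$leftLines  = (Get-Content -LiteralPath $left)  -replace $pattern, ' '
$rightLines = (Get-Content -LiteralPath $right) -replace $pattern, ' '

$normalizedDiff = @(Compare-Object -ReferenceObject $leftLines -DifferenceObject $rightLines)

"Differences before normalizing: $($rawDiff.Count)"
"Differences after normalizing:  $($normalizedDiff.Count)"

Remove-Item -LiteralPath $left, $right
```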

1 Comment
  1. Chris Bedford 7 years ago

    Of course, comparing hashes is the slowest possible way of checking two files – no matter what, both files are going to be read *in their entirety* 🙁

    I’d suggest hashing only if you have to know for sure whether a file’s integrity has been compromised. To compare files skip straight to one of the comparison tools.


© 4sysops 2006 - 2023
