Comparing files is a common task for system administrators, which is why Windows has always offered command-line tools for this purpose. I am talking about comp.exe and fc.exe, two utilities that you shouldn’t write off at a time when PowerShell dominates the command prompt. In particular, fc.exe offers more options than the Compare-Object cmdlet does, so you will want to use it in special cases (such as when you want to display line numbers).
You will often suspect that two files are duplicates, and you might then want to delete one of them. In such a scenario, you don’t have to examine the files for possible differences; you just want to know whether the files are identical or not. With big or binary files, you should calculate a hash value because doing so is usually faster than comparing the file contents line by line. PowerShell offers the Get-FileHash cmdlet for this task. All you have to do is check whether the two results are the same.
((Get-FileHash ".\file1.xml").hash) -eq ((Get-FileHash ".\file2.xml").hash)
This example calls Get-FileHash with the two file names and compares the hash properties with the -eq operator. If the files are identical, the result is True; otherwise, it is False.
Comparing files with Get-FileHash
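The hash check can be wrapped in a small function. The following is only a sketch under a few assumptions: the function name Test-FileIdentical and the temporary demo files are hypothetical and not part of PowerShell, and the size check before hashing is an added shortcut, since files of different lengths can never be identical.

```powershell
# Hypothetical helper: returns $true if both files have the same content.
function Test-FileIdentical {
    param(
        [Parameter(Mandatory)] [string] $Path1,
        [Parameter(Mandatory)] [string] $Path2
    )

    # Files of different sizes cannot be identical, so hashing is unnecessary.
    if ((Get-Item -LiteralPath $Path1).Length -ne (Get-Item -LiteralPath $Path2).Length) {
        return $false
    }

    # Same size: compare the SHA256 hashes (the default algorithm of Get-FileHash).
    $hash1 = (Get-FileHash -LiteralPath $Path1 -Algorithm SHA256).Hash
    $hash2 = (Get-FileHash -LiteralPath $Path2 -Algorithm SHA256).Hash
    return ($hash1 -eq $hash2)
}

# Demo with three small temporary files (illustrative paths and contents).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'hash-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'one', 'two'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'one', 'two'
Set-Content -Path (Join-Path $dir 'c.txt') -Value 'one', 'TWO'

$same = Test-FileIdentical -Path1 (Join-Path $dir 'a.txt') -Path2 (Join-Path $dir 'b.txt')
$diff = Test-FileIdentical -Path1 (Join-Path $dir 'a.txt') -Path2 (Join-Path $dir 'c.txt')
"a.txt vs b.txt: $same"
"a.txt vs c.txt: $diff"
```

In the demo, a.txt and c.txt have the same size, so the function falls through to the hash comparison for that pair.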
Compare-Object doesn’t read file contents
If you also want to see the differences in detail, you have to work with Compare-Object (alias compare). As its name suggests, it compares all kinds of objects, not just files. Consequently, it doesn’t read files itself. Instead, you have to extract their content yourself and pass it to Compare-Object. Get-Content is a suitable cmdlet for the task:
Compare -ReferenceObject (Get-Content ".\file1.xml") -DifferenceObject (Get-Content ".\file2.xml")
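To get a feel for the output, here is a minimal sketch that uses two inline string arrays in place of files. This is an assumption made for brevity: Get-Content returns one string per line, so the arrays stand in for its result.

```powershell
# Inline arrays stand in for (Get-Content ".\file1.xml") and (Get-Content ".\file2.xml").
$reference  = 'alpha', 'beta', 'gamma'
$difference = 'alpha', 'delta', 'gamma'

# By default, only lines that occur in just one of the two inputs are returned.
$result = @(Compare-Object -ReferenceObject $reference -DifferenceObject $difference)

# SideIndicator '<=' means "only in the reference", '=>' means "only in the difference".
$result | Format-Table InputObject, SideIndicator
```

Here, 'beta' is reported with <= and 'delta' with =>, while the lines common to both inputs are omitted.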
Inverse comparison, uppercase and lowercase
You can modify the way Compare-Object works with a couple of parameters. By default, the cmdlet is case-insensitive, but you can change this behavior with the -CaseSensitive parameter.
Comparing files with Compare-Object
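The effect of -CaseSensitive is easiest to see with two inputs that differ only in case. The sketch below again assumes inline arrays instead of file contents; the server names are made up.

```powershell
# Two inputs whose first line differs only in the case of one letter.
$reference  = 'Server01', 'server02'
$difference = 'server01', 'server02'

# Default behavior: case-insensitive, so no differences are reported.
$default = @(Compare-Object $reference $difference)

# With -CaseSensitive, 'Server01' and 'server01' count as different lines.
$strict = @(Compare-Object $reference $difference -CaseSensitive)

"Differences by default:          $($default.Count)"
"Differences with -CaseSensitive: $($strict.Count)"
```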
If you add -IncludeEqual, all lines of both files are listed, and the SideIndicator property shows whether a line appears in both files (==), only in the first file (<=), or only in the second (=>). If you also use -ExcludeDifferent, you get only those lines that are identical in both files.
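A short sketch of both switches, once more with inline arrays as an assumed stand-in for the output of Get-Content:

```powershell
$reference  = 'alpha', 'beta', 'gamma'
$difference = 'alpha', 'delta', 'gamma'

# -IncludeEqual adds the matching lines, marked with '==', to the output.
$all = @(Compare-Object $reference $difference -IncludeEqual)

# Combined with -ExcludeDifferent, only the matching lines remain.
$equalOnly = @(Compare-Object $reference $difference -IncludeEqual -ExcludeDifferent)

$all       | Format-Table InputObject, SideIndicator
$equalOnly | Format-Table InputObject, SideIndicator
```

The first call returns four entries (two equal, two different); the second returns only 'alpha' and 'gamma'.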
Modifying the contents with regex
You can compensate for the relatively meager options of Compare-Object by processing the contents of the files before the comparison with the help of the -replace operator and regular expressions. For instance, you could replicate the ability of fc.exe to compress tabs and spaces like this:
compare ((Get-Content ".\file1.xml") -replace "(\s)+",'$1') ((Get-Content ".\file2.xml") -replace "(\s)+",'$1')
In this example, every run of consecutive whitespace characters (represented by \s) is replaced with a single whitespace character. For complex regular expressions, you could store the results of both Get-Content calls in variables and pass these to Compare-Object. This would improve the clarity of the code.
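Split into separate steps with variables, the same idea might look like the sketch below. The variable names are arbitrary, and the inline arrays are an assumption that replaces the Get-Content calls so that the snippet runs without any files.

```powershell
# With real files: $reference = Get-Content ".\file1.xml", and likewise for file2.xml.
$reference  = 'a   b', "c`t`td"
$difference = 'a b',   "c`td"

# Same pattern as above: each run of whitespace collapses to a single character.
$pattern = '(\s)+'
$normalizedReference  = $reference  -replace $pattern, '$1'
$normalizedDifference = $difference -replace $pattern, '$1'

# After normalization, the two inputs no longer differ.
$diff = @(Compare-Object $normalizedReference $normalizedDifference)
"Differences after normalization: $($diff.Count)"
```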
Of course, comparing hashes is the slowest possible way of checking two files – no matter what, both files are going to be read *in their entirety* 🙁
I’d suggest hashing only if you have to know for sure whether a file’s integrity has been compromised. To compare files, skip straight to one of the comparison tools.