Linux provides the comm command to compare two sorted files line by line. The most important requirement is that both files are already sorted; otherwise the comparison is unreliable. The comm command first appeared in Version 4 Unix in 1973 and became popular in the late 1980s. It provides functionality similar to the diff command. As a basic tool, it is rarely updated and its behavior has stayed stable for a long time.
comm Command Syntax
The comm command works with two files, although one of them can be read from standard input by passing - as the file name. The syntax therefore consists of options followed by two files, as shown below.
comm OPTIONS FILE1 FILE2
- OPTIONS change the comparison output, for example by hiding the unique or the common lines.
- FILE1 is the first file to compare with FILE2.
- FILE2 is the second file to compare with FILE1.
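The syntax above can be sketched with input that is not yet sorted. This is a minimal illustration, assuming hypothetical files a.txt and b.txt created in a scratch directory; the first file is sorted on the fly and passed to comm through standard input using - as FILE1.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)
printf '%s\n' pear apple fig > "$dir/a.txt"     # hypothetical unsorted input
printf '%s\n' apple banana fig > "$dir/b.txt"   # hypothetical, already sorted

# "-" tells comm to read FILE1 from standard input, so the unsorted
# file can be sorted first. LC_ALL=C keeps sort and comm on the same
# collation order.
result=$(LC_ALL=C sort "$dir/a.txt" | LC_ALL=C comm - "$dir/b.txt")
printf '%s\n' "$result"

rm -rf "$dir"
```

Using the same locale for sort and comm matters because comm expects its input in the collation order of the current locale.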
comm Command Parameters
comm is a simple command that provides only a few options, listed below.
PARAMETER | DESCRIPTION |
---|---|
-1 | Don’t display lines unique to the first file |
-2 | Don’t display lines unique to the second file |
-3 | Don’t display lines common to both files |
--check-order | Check that the input files are sorted properly |
--nocheck-order | Do not check that the input files are sorted properly |
--output-delimiter=STR | Separate output columns with STR instead of a Tab |
--total | Display a summary line with the line count of each column |
-z | Use NUL, not newline, as the line delimiter |
--help | Print help information |
--version | Print version information |
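The options in the table can be combined. The sketch below assumes GNU coreutils (the --output-delimiter option is a GNU extension) and uses shortened, hypothetical versions of the sample files in a scratch directory.

```shell
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com www > "$dir/file2.txt"

# -1 and -2 together hide both "unique" columns, leaving only common lines.
common=$(LC_ALL=C comm -12 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$common"

# --output-delimiter replaces the Tab between columns, here with a comma.
delimited=$(LC_ALL=C comm --output-delimiter=, "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$delimited"

rm -rf "$dir"
```

With a visible delimiter it is easier to see which column a line belongs to: no leading comma means column one, one comma means column two, and two commas mean column three.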
Compare Two Sorted Files
Let’s start with a simple example that compares two properly sorted files. Without any options, comm prints three columns as output. We have the following files named file1.txt and file2.txt. The content of file1.txt is shown below.
aaa
linuxtect.com
poftut.com
pythontect.com
windowstect.com
wisetut.com
The content of file2.txt is shown below.
linuxtect.com
poftut.com
pythontect.com
windowstect.com
wisetut.com
www
We will use the following command to compare file1.txt and file2.txt.
$ comm file1.txt file2.txt
From the output, we can see that the line “aaa” exists only in file1.txt and the line “www” exists only in file2.txt. All other lines exist in both files. Lines unique to the first file are printed in the first column, lines unique to the second file in the second column, and common lines in the third column. The columns are separated by a Tab character.
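The three-column layout can be reproduced with the sketch below, which recreates the two sample files in a scratch directory (the directory and the variable names are illustrative assumptions) and captures the result of comm.

```shell
# Recreate the two sample files, one entry per line.
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com www > "$dir/file2.txt"

# LC_ALL=C pins the collation order that comm checks the input against.
output=$(LC_ALL=C comm "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$output"
# Prints (<Tab> marks one Tab character):
# aaa
# <Tab><Tab>linuxtect.com
# <Tab><Tab>poftut.com
# <Tab><Tab>pythontect.com
# <Tab><Tab>windowstect.com
# <Tab><Tab>wisetut.com
# <Tab>www

rm -rf "$dir"
```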
Don’t Show First File’s Unique Lines
We can use the -1 option to hide the lines that are unique to the first file. This prints only two columns, where the first column contains the lines unique to the second file and the second column contains the lines common to both files.
$ comm -1 file1.txt file2.txt
Don’t Show Second File’s Unique Lines
We can use the -2 option to hide the lines that are unique to the second file. This prints only two columns, where the first column contains the lines unique to the first file and the second column contains the lines common to both files.
$ comm -2 file1.txt file2.txt
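The effect of hiding one column can be compared side by side. The sketch below uses shortened, hypothetical versions of the sample files; when a column is hidden, the remaining columns shift left, so common lines are indented by one Tab instead of two.

```shell
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com www > "$dir/file2.txt"

# -1 hides lines unique to the first file ("aaa" disappears).
hide_first=$(LC_ALL=C comm -1 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$hide_first"
# Prints (<Tab> marks one Tab character):
# <Tab>linuxtect.com
# <Tab>poftut.com
# www

# -2 hides lines unique to the second file ("www" disappears).
hide_second=$(LC_ALL=C comm -2 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$hide_second"
# Prints:
# aaa
# <Tab>linuxtect.com
# <Tab>poftut.com

rm -rf "$dir"
```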
Show Only Unique Lines
If you need to show only the lines that are unique to either file, you can use the -3 option, which prints the unique lines but does not print the lines common to both files.
$ comm -3 file1.txt file2.txt
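As a rough check of this behavior, the sketch below recreates the sample files in a scratch directory (an assumption made for illustration) and keeps only the unique lines.

```shell
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com www > "$dir/file2.txt"

# -3 hides the common lines; the two "unique" columns remain.
only_unique=$(LC_ALL=C comm -3 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$only_unique"
# Prints (<Tab> marks one Tab character):
# aaa
# <Tab>www

rm -rf "$dir"
```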
Show Same and Unique Line Counts
The --total option shows the number of lines unique to file1, unique to file2, and common to both files. The counts are printed as a summary line at the end of the output.
$ comm --total file1.txt file2.txt
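The summary line can be isolated as in the sketch below. It assumes a GNU coreutils version recent enough to support --total and recreates the sample files in a scratch directory.

```shell
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com www > "$dir/file2.txt"

# The summary is the last line of the output, so tail picks it out.
summary=$(LC_ALL=C comm --total "$dir/file1.txt" "$dir/file2.txt" | tail -n 1)
printf '%s\n' "$summary"
# Prints (<Tab> marks one Tab character):
# 1<Tab>1<Tab>5<Tab>total

rm -rf "$dir"
```

The three numbers follow the column order: one line unique to file1.txt, one line unique to file2.txt, and five common lines.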
diff vs comm
The diff and comm commands provide similar functionality, so they are often compared. In general, the diff command provides more features. The comm command is best suited to scripts and to simpler comparisons.
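The difference in output style can be sketched as follows. The files are recreated in a scratch directory and the variable names are illustrative assumptions: diff describes edits, while comm yields plain lists that a script can use directly.

```shell
dir=$(mktemp -d)
printf '%s\n' aaa linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com > "$dir/file1.txt"
printf '%s\n' linuxtect.com poftut.com pythontect.com windowstect.com wisetut.com www > "$dir/file2.txt"

# diff exits with status 1 when the files differ, hence "|| true".
diff_out=$(diff "$dir/file1.txt" "$dir/file2.txt" || true)
printf '%s\n' "$diff_out"
# Prints edit commands:
# 1d0
# < aaa
# 6a6
# > www

# comm -23 lists lines only in file1.txt, comm -13 lines only in file2.txt.
removed=$(LC_ALL=C comm -23 "$dir/file1.txt" "$dir/file2.txt")
added=$(LC_ALL=C comm -13 "$dir/file1.txt" "$dir/file2.txt")
printf 'removed: %s\nadded: %s\n' "$removed" "$added"

rm -rf "$dir"
```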