Sorting according to keys or columns

The sort command can sort by a particular column when the input data is arranged in columns, like this:

$ cat data.txt
1  mac    2000
2  winxp    4000
3  bsd    1000
4  linux    1000

We can sort this in many ways; currently it is sorted numerically, by the serial number (the first column). We can also sort by the second or third column.

The -k option specifies the key to sort by. A single integer names the column where the key starts; the key then extends to the end of the line. The -r option sorts in reverse order. Consider this example:

# Sort in reverse order by column 1
$ sort -nrk 1  data.txt
4  linux    1000 
3  bsd    1000 
2  winxp    4000 
1  mac    2000 
# -nr means numeric and reverse

# Sort by column 2
$ sort -k 2  data.txt
3  bsd    1000 
4  linux    1000 
1  mac    2000 
2  winxp    4000

Always be careful about the -n option. By default, sort compares keys as text, character by character, so numbers are not ordered by value. To sort numerically, the -n option must be provided.
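
The difference between the default comparison and -n is easiest to see on numbers of different lengths. The following is a minimal sketch; the sample values are illustrative and do not come from data.txt, and LC_ALL=C is set only to pin the text comparison to plain byte order:

```shell
# Sample values are illustrative and not taken from data.txt.
values=$(printf '%s\n' 10 9 100)

# Default comparison is character by character (LC_ALL=C pins byte order).
lexical=$(printf '%s\n' "$values" | LC_ALL=C sort)

# -n compares the values as numbers.
numeric=$(printf '%s\n' "$values" | sort -n)

printf 'default: %s\n' "$(printf '%s' "$lexical" | tr '\n' ' ')"
printf 'numeric: %s\n' "$(printf '%s' "$numeric" | tr '\n' ' ')"
```

The default sort places 100 before 9 because the character 1 sorts before 9, while -n orders the values as 9, 10, 100.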

When -k is followed by a single integer, it specifies a column in the text file. By default, columns are separated by blanks. If we need to specify the key as a group of characters (for example, characters 3-4 of column 2), we write each character position as two integers separated by a period (column.character), and join the first and last character positions with a comma:

$ cat data.txt
1 alpha 300
2 beta 200
3 gamma 100
$ sort -bk 2.3,2.4 data.txt   # Sort by the keys mm, ph, ta
3 gamma 100
1 alpha 300
2 beta 200

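Here, the third and fourth characters of the second column (ph, ta, and mm) are used as the sort key. The key format gives their positions: 2.3 is the third character of column 2, and 2.4 is the fourth. The -b option makes sort ignore the leading blanks of the column when counting character positions.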
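
To check which characters a key such as 2.3,2.4 selects, it can help to print them next to the sorted result. This is only a sketch: the awk substr call is an illustrative stand-in for the key extraction, not something sort runs internally, and LC_ALL=C is an assumption made to keep the ordering predictable:

```shell
# Same three rows as the data.txt shown above.
data=$(printf '%s\n' '1 alpha 300' '2 beta 200' '3 gamma 100')

# Characters 3-4 of column 2, extracted with awk for inspection only.
keys=$(printf '%s\n' "$data" | awk '{ print substr($2, 3, 2) }')

# The actual sort on that key; -b skips the blanks that precede column 2.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -b -k 2.3,2.4)

printf 'keys:\n%s\n\nsorted:\n%s\n' "$keys" "$sorted"
```

The keys come out as ph, ta, and mm, in input order, which explains why the gamma row (mm) sorts first.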

To use only the first column as the key, use this:

$ sort -nk 1,1 data.txt
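
The difference between an open-ended key such as -k 2 and a bounded key such as -k 2,2 shows up when two lines share the same value in that column. The sketch below uses made-up rows (not data.txt) and sets LC_ALL=C as an assumption to keep the comparison in byte order:

```shell
# Hypothetical rows: two of them share the value "b" in column 2.
rows=$(printf '%s\n' '1 b 2' '2 b 1' '3 a 9')

# -k 2: the key runs from column 2 to the end of the line,
# so column 3 decides the order of the two "b" rows.
open_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 2)

# -k 2,2: the key is column 2 only; the tie between the "b" rows
# is broken by comparing the whole lines.
bounded_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 2,2)

printf '%s\n' '-k 2:' "$open_key" '-k 2,2:' "$bounded_key"
```

With -k 2 the row "2 b 1" comes before "1 b 2", while with -k 2,2 the order of those two rows is reversed.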

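To make sort's output compatible with xargs by using the \0 (NUL) terminator, use this command. Note that -z applies to the input as well, so the input records must also be terminated by \0 rather than by newlines:

$ sort -z data.txt | xargs -0
# Use the zero terminator for safe use with xargs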
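
A minimal sketch of a NUL-terminated pipeline follows. It assumes a sort that supports -z and an xargs that supports -0 (as the GNU versions do); the names containing spaces are hypothetical:

```shell
# Hypothetical names containing spaces; printf '%s\0' ends each one with NUL.
sorted=$(printf '%s\0' 'b file' 'a file' 'c file' \
  | sort -z \
  | xargs -0 -n 1 echo)

printf '%s\n' "$sorted"
```

Each name reaches echo as a single argument, so the embedded spaces do not split it into separate words.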

Sometimes, the text may contain extraneous characters such as leading spaces or punctuation. To sort it in dictionary order, ignoring leading blanks and punctuation, use this:

$ sort -bd unsorted.txt

The -b option ignores leading blanks (spaces and tabs) in each line, and the -d option sorts in dictionary order, which considers only blanks and alphanumeric characters.
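
The effect of -b and -d can be seen by sorting the same lines with and without them. This is a sketch with made-up input; LC_ALL=C is an assumption used to make the plain sort follow byte order:

```shell
# Hypothetical input: punctuation inside two words, leading spaces on one line.
lines=$(printf '%s\n' 'b-c' '  zz' 'b.a')

# Plain sort: the space, '-' and '.' characters all take part in the comparison.
plain=$(printf '%s\n' "$lines" | LC_ALL=C sort)

# -b skips the leading blanks; -d compares only blanks and alphanumerics,
# so the lines are ordered as if they read "ba", "bc", "zz".
dict=$(printf '%s\n' "$lines" | LC_ALL=C sort -bd)

printf '%s\n' 'plain:' "$plain" '-bd:' "$dict"
```

The plain sort puts the indented zz line first, while -bd orders the lines as b.a, b-c, and then zz.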