- Linux Shell Scripting Cookbook (Third Edition)
- Clif Flynt, Sarath Lakshman, Shantanu Tushar
- 400 words
- 2021-07-09 19:46:20
Sorting according to keys or columns
We can sort by a column if the input data is arranged in columns, like this:
$ cat data.txt
1 mac 2000
2 winxp 4000
3 bsd 1000
4 linux 1000
We can sort this in many ways; currently it is sorted numerically, by the serial number (the first column). We can also sort by the second or third column.
The -k option specifies the key to sort by. A single integer selects a column. The -r option sorts in reverse order. Consider this example:
# Sort in reverse by column 1; -nr means numeric and reverse
$ sort -nrk 1 data.txt
4 linux 1000
3 bsd 1000
2 winxp 4000
1 mac 2000

# Sort by column 2
$ sort -k 2 data.txt
3 bsd 1000
4 linux 1000
1 mac 2000
2 winxp 4000
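The text mentions sorting by the third column without showing it. The following is a minimal sketch, assuming GNU sort; the scratch file created with mktemp and the variable names are illustrative, and LC_ALL=C is an assumption added only to keep the order of tied lines predictable.

```shell
# Hypothetical scratch file holding the sample data from the text.
data=$(mktemp)
printf '%s\n' '1 mac 2000' '2 winxp 4000' '3 bsd 1000' '4 linux 1000' > "$data"

# Sort numerically on column 3 only; "3,3" ends the key at column 3.
# Lines with equal keys fall back to a whole-line comparison.
by_col3=$(LC_ALL=C sort -n -k 3,3 "$data")
printf '%s\n' "$by_col3"

# Same key, largest value first.
by_col3_desc=$(LC_ALL=C sort -n -r -k 3,3 "$data")
printf '%s\n' "$by_col3_desc"

rm -f "$data"
```

Writing the key as 3,3 rather than 3 keeps the comparison confined to that column.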
When -k is followed by a single integer, the key starts at that column and extends to the end of the line. By default, columns are separated by runs of blanks (spaces or tabs). To use a group of characters as the key (for example, characters 3-4 of column 2), write each position as two integers separated by a period (column.character), and join the first and last positions with a comma:
$ cat data.txt
1 alpha 300
2 beta 200
3 gamma 100
$ sort -bk 2.3,2.4 data.txt   # Sort m, p, t
3 gamma 100
1 alpha 300
2 beta 200
Here the key is the third and fourth characters of the second column (ph, ta, and mm), and the keys are compared as text. The -b option makes sort skip the blanks in front of the column, so character positions are counted from the first non-blank character.
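To see which characters a positional key selects, it can help to print them separately. This is a sketch under the assumption of GNU sort and a POSIX awk; the awk call is only an illustration of the key, not something sort runs internally, and the file and variable names are hypothetical.

```shell
# Hypothetical scratch file holding the sample data from the text.
data=$(mktemp)
printf '%s\n' '1 alpha 300' '2 beta 200' '3 gamma 100' > "$data"

# Characters 3-4 of column 2, extracted with awk for inspection only.
keys=$(awk '{ print substr($2, 3, 2) }' "$data")
printf '%s\n' "$keys"

# With -b, positions are counted from the first non-blank character.
with_b=$(LC_ALL=C sort -b -k 2.3,2.4 "$data")
printf '%s\n' "$with_b"

# Without -b, the blank before the column counts as character 1,
# so the key shifts one character to the left (lp, et, am).
without_b=$(LC_ALL=C sort -k 2.3,2.4 "$data")
printf '%s\n' "$without_b"

rm -f "$data"
```

The two sort calls produce different orders, which is why -b usually accompanies character-position keys.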
To use only the first column as a numeric key, use this:
$ sort -nk 1,1 data.txt
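The difference between -k 1 and -k 1,1 is easiest to see with text keys and the -s (stable) option of GNU sort, which turns off the whole-line tie-break. A minimal sketch, with made-up sample lines and variable names:

```shell
# Hypothetical sample: two lines share the same first column.
data=$(mktemp)
printf '%s\n' 'b 2' 'a 9' 'a 1' > "$data"

# Key is column 1 only. With -s, lines with equal keys
# keep their input order.
first_col_only=$(LC_ALL=C sort -s -k 1,1 "$data")

# Key starts at column 1 and runs to the end of the line,
# so the second column takes part in the comparison.
to_end_of_line=$(LC_ALL=C sort -s -k 1 "$data")

printf '%s\n' '-k 1,1:' "$first_col_only" '-k 1:' "$to_end_of_line"
rm -f "$data"
```

With -k 1,1 the two "a" lines stay in input order; with -k 1 they are ordered by the rest of the line as well.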
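To make sort's output compatible with xargs -0, use the -z option, which reads and writes records terminated by \0 (NUL) instead of newlines. Because -z applies to the input as well, data.txt must itself contain NUL-terminated records:
$ sort -z data.txt | xargs -0   # NUL-terminated records are safe to pass to xargs -0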
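A common source of NUL-terminated records is find -print0. The sketch below assumes GNU (or compatible) find, sort, and xargs; the temporary directory and file names are invented for illustration.

```shell
# Hypothetical directory with file names that contain spaces.
dir=$(mktemp -d)
touch "$dir/b file.txt" "$dir/a file.txt" "$dir/c.txt"

# find emits NUL-terminated paths, sort -z orders them, and
# xargs -0 passes each path to basename as a single argument.
listing=$(find "$dir" -type f -print0 | LC_ALL=C sort -z | xargs -0 -n 1 basename)
printf '%s\n' "$listing"

rm -rf "$dir"
```

Every stage uses the same terminator, so a name such as "a file.txt" is never split at the space.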
Sometimes, the text may contain extraneous characters such as leading spaces or punctuation. To sort in dictionary order, ignoring leading blanks and punctuation, use this:
$ sort -bd unsorted.txt
The -b option ignores leading blanks (spaces and tabs) in each line, and the -d option specifies dictionary order, in which only blanks and alphanumeric characters are compared.
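The effect of -b and -d can be checked against a plain sort of the same input. This is a small sketch assuming GNU sort in the C locale; the contents written to the scratch file are made up for the comparison.

```shell
# Hypothetical input with leading blanks and leading punctuation.
unsorted=$(mktemp)
printf '%s\n' '  banana' 'apple' '-cherry' > "$unsorted"

# Plain byte order: the space and the hyphen sort before letters.
plain=$(LC_ALL=C sort "$unsorted")
printf '%s\n' "$plain"

# -b skips the leading blanks, -d ignores the hyphen.
dictionary=$(LC_ALL=C sort -bd "$unsorted")
printf '%s\n' "$dictionary"

rm -f "$unsorted"
```

The lines themselves are printed unchanged; only the comparison ignores the blanks and punctuation.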