Squeezing characters with tr

The tr command can perform many text-processing tasks. For example, it can squeeze each run of a repeated character in a string down to a single occurrence. The basic form is as follows:

tr -s '[set of characters to be squeezed]' 

If you habitually type two spaces after a period, you can squeeze the extra spaces without touching doubled letters, because only the characters in the set are affected:

$ echo "GNU is       not     UNIX.  Recursive   right ?" | tr -s ' '
GNU is not UNIX. Recursive right ?
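
As a small sketch of the point about duplicated letters: only characters listed in the set are squeezed, so the doubled letters inside a word should survive. The sample sentence and the variable name squeezed are illustrative and not part of the original recipe.

```shell
# Only the space is in the set, so "bookkeeper" keeps its doubled letters.
squeezed=$(echo "bookkeeper   is    a   good     word" | tr -s ' ')
echo "$squeezed"
```

Adding more characters to the set, such as ' o', would squeeze runs of those characters as well.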

The tr command can also be used to get rid of extra newlines:

$ cat multi_blanks.txt | tr -s '\n'
line 1
line 2
line 3
line 4
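
The contents of multi_blanks.txt are not shown, so the following is only a sketch that assumes a file of four lines separated by runs of blank lines. It builds that hypothetical input with printf instead of reading a file.

```shell
# Hypothetical input: four lines separated by one or more blank lines.
squeezed=$(printf 'line 1\n\n\nline 2\n\nline 3\n\n\n\nline 4\n' | tr -s '\n')
echo "$squeezed"
```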

In the preceding example, tr squeezes each run of '\n' characters into a single newline, which removes the blank lines. Let's use tr in a tricky way to add a given list of numbers from a file, as follows:

$ cat sum.txt
1
2
3
4
5

$ cat sum.txt | echo $[ $(tr '\n' '+' ) 0 ]
15

How does this hack work?

Here, the tr command replaces each '\n' with the '+' character, forming the string 1+2+3+4+5+. The string ends with an extra + operator, so 0 is appended to give that last + an operand without changing the sum.

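The $[ operation ] construct performs arithmetic expansion. It is an older bash form; $(( operation )) is the equivalent that POSIX shells support. The command therefore becomes: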

echo $[ 1+2+3+4+5+0 ]

If we used a loop to perform the addition by reading numbers from a file, it would take a few lines of code. With tr, a one-liner does the trick.
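
For comparison, here is a minimal sketch of both approaches side by side. The numbers are fed in with a here-document and printf rather than from sum.txt, the variable names loop_sum and tr_sum are illustrative, and the arithmetic is written with $(( )) in place of $[ ].

```shell
# Loop version: read one number per line and accumulate the total.
loop_sum=0
while read -r n; do
    loop_sum=$(( loop_sum + n ))
done <<EOF
1
2
3
4
5
EOF

# tr version: turn newlines into +, then append 0 for the trailing +.
tr_sum=$(( $(printf '1\n2\n3\n4\n5\n' | tr '\n' '+') 0 ))

echo "$loop_sum $tr_sum"
```

Both variables should hold the same total. The loop is easier to extend, for example to skip malformed lines, while the tr version stays on one line.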

Even trickier is when we have a file with letters and numbers and we want to sum the numbers:

$ cat test.txt
first 1
second 2
third 3

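We can use tr with the -d option to delete the letters, then replace the remaining spaces with +. Each line then begins with a +, which the arithmetic expansion reads as a unary plus on the first number and as addition on the rest: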

$ cat test.txt | tr -d 'a-z' | echo "total: $[$(tr ' ' '+')]"
total: 6
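
To see why this works, the pipeline can be split into stages. This is a sketch under the assumption that the input matches test.txt above. The variable names data, stage1, stage2, and total are illustrative, and $(( )) is used for the arithmetic.

```shell
# Sample data matching test.txt above.
data=$(printf 'first 1\nsecond 2\nthird 3\n')

# Stage 1: delete lowercase letters, leaving " 1", " 2", " 3" on separate lines.
stage1=$(echo "$data" | tr -d 'a-z')

# Stage 2: turn each leading space into a +, giving "+1", "+2", "+3".
stage2=$(echo "$stage1" | tr ' ' '+')

# The arithmetic expansion skips the newlines; the first + acts as a unary plus.
total=$(( $stage2 ))
echo "total: $total"
```

Quoting the set as 'a-z' keeps the shell from treating it as a filename pattern before tr sees it.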