sorting on a different field
Posted by peeterjoot on September 3, 2009
Here’s a simple Unix command line task that at least one co-worker here at IBM didn’t know how to do.
Suppose you have a set of log files, each with a timestamp in it, and want to find the earliest one. The timestamp is within the file, so we can grep it out, obtaining something like:
# grep timestamp *.stack.txt
19177.100.000.stack.txt:timestamp: 2009-09-01-220.127.116.112278
19177.1.000.stack.txt:timestamp: 2009-09-01-18.104.22.1681855
19177.101.000.stack.txt:timestamp: 2009-09-01-22.214.171.1241934
19177.102.000.stack.txt:timestamp: 2009-09-01-126.96.36.1996051
19177.103.000.stack.txt:timestamp: 2009-09-01-188.8.131.525604
19177.104.000.stack.txt:timestamp: 2009-09-01-184.108.40.2068246
19177.105.000.stack.txt:timestamp: 2009-09-01-220.127.116.115395
19177.106.000.stack.txt:timestamp: 2009-09-01-18.104.22.1681395
...
Piping this output through sort doesn’t do what’s desired, since by default sort compares whole lines, so the result comes out ordered by file name. Sorting on what follows the space would do the trick. This is a common requirement, and only requires one extra sort parameter:
# grep timestamp *txt | sort -k2 | head -3
19177.177.000.stack.txt:timestamp: 2009-09-01-22.214.171.1243622
19177.158.000.stack.txt:timestamp: 2009-09-01-126.96.36.1994201
19177.86.000.stack.txt:timestamp: 2009-09-01-188.8.131.525494
...
The -k2 option tells sort to use a key that starts at the second field and runs to the end of the line. Fields are delimited by blanks by default. You could change the delimiter if required, but it doesn’t really help here. An example of doing so and getting the same results would be:
# grep timestamp *txt | sort -k3 -t: | head -3
...
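To try the three variants side by side without the real log files, here is a small sketch. The file names and timestamps in it are invented for illustration, and LC_ALL=C is set only so that the ordering is the same on every machine:

```shell
# Hypothetical sample lines (made up, not taken from the real logs).
sample='b.stack.txt:timestamp: 2009-09-01-22.10.07.000003
a.stack.txt:timestamp: 2009-09-01-22.10.09.000001
c.stack.txt:timestamp: 2009-09-01-22.10.05.000002'

# Plain sort compares whole lines, so the file name decides the order.
by_name=$(printf '%s\n' "$sample" | LC_ALL=C sort | head -1)

# -k2: the key starts at the second blank-separated field (the timestamp).
by_time=$(printf '%s\n' "$sample" | LC_ALL=C sort -k2 | head -1)

# -t: -k3: the same text, reached as the third colon-separated field.
by_time_colon=$(printf '%s\n' "$sample" | LC_ALL=C sort -t: -k3 | head -1)

echo "plain sort: $by_name"
echo "sort -k2:   $by_time"
echo "sort -t:k3: $by_time_colon"
```

With this sample the plain sort puts a.stack.txt first, while both keyed sorts put c.stack.txt first, since it carries the earliest timestamp.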
This says to sort on the third field, and to delimit the fields by colons instead of spaces. You can get really fancy with the sort command line options, specifying secondary and higher sort keys, and different sort modifiers for different keys (numeric for some, increasing or decreasing, …), but knowing how to use just -k and -t will do the trick in many instances. If you want really fancy sorts you are probably better off using Perl anyhow, where you can write sort subroutines and have the ultimate control.
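As a rough illustration of the secondary keys and per-key modifiers mentioned above, the sketch below sorts some invented three-column data (a date, a name, and a count, all hypothetical) by count descending, breaking ties by date ascending. The -kN,N form limits each key to a single field:

```shell
# Hypothetical data: date, name, count (invented for this example).
data='2009-09-02 alpha 7
2009-09-01 beta 12
2009-09-01 alpha 3
2009-09-02 beta 12'

# Primary key:   field 3 only, numeric (n), reversed (r).
# Secondary key: field 1 only, default ascending, used to break ties.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -k3,3nr -k1,1)

printf '%s\n' "$sorted"
```

The two lines with a count of 12 come out first, ordered by date, followed by the 7 and then the 3.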