Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘grep’

using nm to track down the origin of an undefined symbol.

Posted by peeterjoot on March 22, 2010

The shared library for our product for historical reasons is built with the (evil) -Bsymbolic flag.

This has a number of unfortunate side effects, one of which is deferring link errors due to unresolved symbols to executable link time, instead of when we build the shared library itself.

Having screwed up in my own build over the weekend, I see a link error in my build summary this morning:

/view/peeterj_m20/vbs/engn/lib/ undefined reference to `XXYY'

I happen to know exactly what caused this (this time), but just the other day I was asked by somebody to help them figure out something similar. So the topic seems worthy of a quick blog post.

Suppose one collects all the archive names that contribute to the shared library in question (for DB2 builders one can relink the library having done an ‘export VERY_VERBOSE=true’ on the command line).

A good way to get this is to redirect that build output to a file, grab just the link line and run something like:

# cat filewithRawLinkLine.txt | tr ' ' '\n' | grep -F -e .a -e .o > archivesAndObjectNames.txt
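To make the token splitting concrete, here is a toy run of that pipeline against a fabricated link line (the file names are made up; the real input would be your captured build output):

```shell
# fabricated stand-in for the captured link line
echo 'ld -o libfoo.so main.o util.o libsqe.a -lm libos.a' > filewithRawLinkLine.txt

# one token per line, keeping only tokens that mention .a or .o
tr ' ' '\n' < filewithRawLinkLine.txt | grep -F -e .a -e .o > archivesAndObjectNames.txt

cat archivesAndObjectNames.txt
```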

In my case, this looked like:

# head -5 archivesAndObjectNames.txt    

This now leaves you in the position to search for the archive(s) that contributed the undefined symbol. This can be done with:

# for i in `cat archivesAndObjectNames.txt` ; do echo $i ; nm $i | grep XXYY ; done | tee nm.out | grep -B1 XXYY
 /view/peeterj_m20/vbs/engn/sqe/alibsqe.a        U XXYY

So, the archive supplying a reference to this symbol is ‘alibsqe.a’. This completes a first localization of where the link error is coming from. One can continue in the same brute force way to find the object file within the archive, or solve it by source inspection once one knows roughly where to look. In an extremely large source base, where nobody really knows any sizeable portion of it, narrowing down the problem like this is often an important first step.

Posted in Development environment | Tagged: , , , | 1 Comment »

Shell tips and tricks. The most basic stuff.

Posted by peeterjoot on February 28, 2010

A while back I assembled the following shell tips and tricks notes for an ad-hoc ‘lunch and learn’ session at work. For some reason (probably for colour) I had made these notes in Microsoft Word instead of plain text. That made them of limited use for reference, not being cut-and-pasteable (since Word mucks up the quote characters). Despite a few things that are work centric (references to clearcase and our source code repository directory structure), there’s enough here that is generally applicable that the converted-to-text version makes sense to have available as a blog post.


Variables

# a=foo
# b=goo
# echo $a $b

foo goo
# p=/view/joes_view/vbs/engn/sqo
# diff $p/sqlofmga.C .

You will have many predefined variables when you login. Examples could include

$HOME                            home dir.
$EDITOR                          preferred editor.
$VISUAL                          preferred editor.
$REPLYTO                         where mail should be addressed from.
$PS1                             What you want your shell prompt to look like.
$TMPDIR                          Good to set to avoid getting hit as badly when /tmp fills up.
$CDPATH                          good for build tree paths.

CDPATH Example:

# CDPATH=.:..:/home/hotelz/peeterj:/vbs/engn:/vbs/test/fvt/standalone:/vbs/common:/vbs/common/osse/core

With this set, one can run ‘cd sqo’ and go right to that component dir.

$1                                 First argument passed to a shell script (or shell function).
$*                                 All arguments that were passed to a shell script.
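A quick demo of the positional parameters (showargs is just a throwaway name for this sketch):

```shell
# showargs echoes its first argument and then all of them
showargs() {
    echo "first: $1"
    echo "all: $*"
}

showargs foo goo boo    # prints "first: foo" then "all: foo goo boo"
```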


Wildcards

All files starting with an 'a', and ending with a 'b':

# ls a*b

All files of the form 'a'{char}'b':

# ls a?b
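A small demonstration, run in a scratch directory:

```shell
cd "$(mktemp -d)"
touch ab axb axyb

echo a*b    # expands to: ab axb axyb
echo a?b    # expands to: axb
```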


Quotes

Three different kinds. This is one of the most important things to know for any "shell programming".

Single quotes

Variables and wildcards are NOT expanded when contained in a set of single quotes. Example:

# a=foo
# b='goo boo'
# echo '$a $b'

$a $b

Double quotes

Variables and wildcards (*, ?, $...) are expanded (sometimes called globbing and/or interpolation, depending on the context).

# a=foo
# b='goo boo'
# echo "$a $b"
foo goo boo

You don't have to double quote something to get this sort of wildcard and variable expansion, so you could write:

# echo $a $b

and the result will be the same:

foo goo boo

There is a difference though: echo will treat this as three arguments, because the expansion happens before the final execution. This can be important when you want something with spaces to be treated as a single argument. Example:

# gcc x.c | grep ': error'
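To see the splitting directly, count the arguments with $# (nargs is a made-up helper for the demo):

```shell
nargs() { echo $# ; }

b='goo boo'
nargs $b      # unquoted: word splitting yields 2 arguments
nargs "$b"    # quoted: a single argument, prints 1
```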

Back Quotes

Expression is executed as a command.

# cleartool checkin -c 'Please, let this compile and link this time.' `lsco`

Execution of a command in another one can also be done with a variable syntax (sometimes useful for nesting stuff). These would produce the same output:

# echo It once was: `date`
# echo It once was: $(date)

It once was: Mon Jun 18 16:20:28 EDT 2007

The alternate $( ) syntax nests cleanly, which makes it useful when you want to run a command inside of another command.

Other Special Shell Characters

~              your home dir.
;              command separator
\              backslash (escape).  When you want to use a special character as is, you either have to single quote it, or use an escape character to let the shell know what you want.

Redirect input and output

|                            pipe input from another command
<                            redirect input
>                            redirect output
2>&1                         redirect stderr to stdout
echo hi > hi.out
cat < hi.out
cat /tmp/something | grep ': error' | sort
something_that_may_fail >/tmp/blah 2>&1
something_that_may_fail >/tmp/blah 2>/tmp/blah.err
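One gotcha: redirections are processed left to right, so where you put 2>&1 matters. A sketch (noisy is a made-up function):

```shell
noisy() { echo to-stdout ; echo to-stderr 1>&2 ; }

noisy > both.out 2>&1    # stdout goes to the file first, then stderr follows it there
noisy 2>&1 > only.out    # stderr was pointed at the *old* stdout; only.out gets just to-stdout
```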

The for loop.

If you have the quotes and variables mastered, this is probably the next most useful construct for ad-hoc command line stuff. We use computers for repetitive stuff, but it's amazing how little people sometimes take advantage of this.

By example:

# for i in `grep : /tmp/something` ; do echo $i ; done

Here, i is the variable you name, and you can reference it in the loop as $i.

# for i in `cat my_list_of_files` ; do cleartool checkout -nc $i ; done

If the command you want to run is something that accepts multiple arguments, then you may not even need a for loop. The second example above could be written:

# cleartool checkout -nc `cat my_list_of_files`

It's good to know both ways of doing this, since the backquote method can sometimes hit shell command line length limits that force you to split up what has to be done, or to do parts individually.
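xargs (covered below) is one way around those limits; it batches standard input into as many command invocations as needed. With echo standing in for the real command:

```shell
# a stand-in file list
printf '%s\n' file1 file2 file3 > my_list_of_files

# each batch of names becomes arguments to one command invocation
xargs echo would-checkout: < my_list_of_files    # prints: would-checkout: file1 file2 file3
```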

Some common useful programs for command line use.

grep search for an expression or expressions

# gcc x.c | grep ': error'
# grep -n mystring `lsbranch -file` > o ; vim -q o
# grep -e something -e somethingelse
# grep -v '^ *$'                            # all non blank lines.

tr translate characters (one to another, to uppercase, ...)

# echo $PATH | tr : '\n'              # print out path elements on separate lines.
# tr '[a-z]' '[A-Z]'                  # upper case something.

cut Extract parts of a line

# cut -f1 -d:            # extract everything in the line(s) up until the first colon char.
# cut -f3-4              # extract fields 3 and 4 (tab delimited by default).
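Note that -f selects delimited fields while -c selects character positions; a quick contrast:

```shell
echo 'one:two:three' | cut -f1 -d:    # prints: one
echo 'abcdef' | cut -c3-4             # prints: cd
```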

sort sort stuff

# sort                 # plain old sort.              
# sort -u              # unique lines only
# sort -n              # numeric sort
# sort -t :            # sort using alternate field separator (default is whitespace).
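Combining -t, -k, and -n, here is a toy numeric sort on the second colon-delimited field:

```shell
printf 'b:20\na:3\nc:100\n' | sort -t: -k2 -n    # a:3, then b:20, then c:100
```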

xargs Run a command on all the items in standard input.

# find . -maxdepth 2 -type f | xargs ls -l

sed search and replace program

# sed 's/string1/string2/g'
# sed 's/#.*//'                                          # remove script "comments"
# sed 's!//.*!!'                                           # remove C++ comments.
# sed 's/[ \t]*$//'                            # strip trailing whitespace
# sed 's/\(.*\):\(.*\)/\2:\1/'              # swap stuff separated by colons.

perl Any of the above and pretty much anything else you can think of.

Explaining perl is a whole different game, and if you don't know it, it probably won't look much different than ASCII barf (much like some of the sed commands above).

Some examples (things done above with a mix of other commands) :

# g++ x.c | perl -ne 'print if /: error/'
# perl -pe 's/string1/string2/g'
# perl -e ' $p = $ENV{PATH}; $p =~ s/:/\n/g ; print "$p\n" '
# perl -pe '$_ = uc($_)'

What's notable here is not the perl itself, but the fact that to run some of these commands required passing a pile of shell special characters. In order to pass these all to perl unaltered, it was required to use single quotes, and not double quotes.

Common to grep, sed, and perl is a concept called a regular expression (or regex). This is an extremely valuable thing to get to know well if you do any programming, since there's often a lot of text manipulation required as a programmer. Going into detail on this topic will require its own time.

Shell Aliases

These are one liner "shortcuts". ksh/bash example:

alias la='ls -a'

Shell Functions

Multiline shortcuts. ksh/bash example:

function foo
{
   echo boo
   echo foo
}

This is similar to putting the commands in their own file and running that as a script, and can be used as helper functions in other scripts or as more complex "alias"es.

The example above could be written as:

alias foo='echo boo ; echo foo'

But functions also allow you to pass arguments. Example:

function debugattach {
    $1 ~/sqllib/adm/db2sysc --pid $(db2pd -edus -dbp $2 | perl -ne 'next unless s/db2sysc PID: // ; print')
}

alias ddda='debugattach ddd'
alias gdba='debugattach gdb'

Calling this with ddda 0 will attach the ddd debugger to the db2sysc process that db2pd reports to be node 0.

Except for the perl fragment, which is basically a combined 'grep' and 'sed', this example uses many of the things that have been explained above (variables, embedded command, single quotes to avoid expansion and for grouping arguments).

Posted in C/C++ development and debugging. | Tagged: , , , , | 1 Comment »

grepping a range of lines using perl.

Posted by peeterjoot on November 27, 2009

I was asked how to use grep to select everything in a file starting with a pattern, and ending with a different one. The file is our diagnostic log and if this has originated with one of our system testers could be massive (a few hundred thousand lines long). gnu-grep can be used for this. You could do something like:

grep -A 9999999 'some first expression' < wayToBigFile | grep -B 9999999 'some other expression'

Here 9999999 is some number of lines that is guessed big enough to contain all the lines of interest (not known ahead of time), so the command says “give me everything after the expression, and then give me everything before the other expression in that output”
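For completeness, sed can express the same range directly with a /start/,/end/ address, which avoids guessing a line count (toy patterns here):

```shell
printf '%s\n' before START middle END after |
    sed -n '/START/,/END/p'    # prints START, middle, END
```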

Perl gives you a nicer way of doing this:

#!/usr/bin/perl -n

$foundIt = 1 if ( /some first expression/ ) ;

next if ( !$foundIt ) ;

print ;

exit if ( /some other expression/ ) ;

# done.

The -n flag says to run the whole script as if it were in a ‘while (<>){ … }’ loop. Until the initial pattern is seen, $foundIt is false and nothing will be printed, and we bail once the second pattern is seen. Note that this relies on perl’s lazy variable initialization, since $foundIt is undef (false) until modified.

Observe also that this script is actually also its own test case.

myPrompt$ chmod 755 ./theScript
myPrompt$ ./theScript < ./theScript
$foundIt = 1 if ( /some first expression/ ) ;

next if ( !$foundIt ) ;

print ;

exit if ( /some other expression/ ) ;

Posted in perl and general scripting hackery | Tagged: , | 6 Comments »

Restricting pattern replacement to specific lines?

Posted by peeterjoot on September 29, 2009

We’ve had a marketing driven name change that impacts a lot of internal error strings in our code in a silly way, and I’ve got a lot of stuff like the following output

# grep -n '".* BA' *C
foo.C:1486:                       "Unexpected BA error - panic",
foo.C:1561:                       "Unexpected BA error - panic",
foo.C:1569:                       "Unexpected BA error",

All these ‘BA’s have to be changed to ‘BF’.

Ideally no customers would ever see these developer centric messages, but they will be in our logfiles, and potentially visible and confusing.

It’s not too hard to replace these, but there’s a lot of them. I’ve had this kind of task before, and have done it using hacky throw away command line “one-liners” like the following:

for i in `cut -f1,2 -d: grepoutput` ; do
   f=`echo $i | cut -f1 -d:`
   l=`echo $i | cut -f2 -d:`
   vim +$l +:'s/\<BA\>/BF/g' +wq $f
done
Okay, it’s not a one liner as above since I’ve formatted this with newlines instead of semicolons, but when tossing off throwaway bash/ksh for loop stuff like this I usually do it as a one liner. This bash loop is easy enough to write, but messy and also fairly easy to get wrong. I’m tired of doing this over and over again.

It seemed to me that it was time to code up something that I can tweak for automated tasks like this, so I wrote the perl script below that consumes grep -n output (i.e. file:lineno:stuff output), and makes the desired replacements, whatever they are. I’ve based this strictly on the grep output because unrestricted replacements could be dangerous, and I wanted to visually verify that all the replacement sites were appropriate.


my %lines ;

while (<>)
{
   chomp ;

   /^(.*?):(\d+?):/ or die "unexpected grep -n output on line '$_'\n" ;

   $lines{$1} .= ",$2" ;
}

foreach (keys %lines)
{
   process_file( $_, split(/,/, $lines{$_} ) ) ;
}

exit ;

sub process_file
{
   my ($filename, @n) = @_ ;

   my %thisLines = map { $_ => 1 } @n ;

   open my $fhIn, "<$filename" or die "could not open file for input '$filename'\n" ;
   open my $fhOut, ">$filename.2" or die "could not open file for output '$filename.2'\n" ;

   my $lineno = 0 ;
   while ( <$fhIn> )
   {
      $lineno++ ;

      if ( exists($thisLines{$lineno}) )
      {
         #print "$filename:$lineno: operating on: '$_'\n" ;
         # word delimiters to replace BA but not BASH nor ABAB, ...
         s/\bBA\b/BF/g ;
      }

      print $fhOut $_ ;
   }

   close $fhIn ;
   close $fhOut ;
}

This little script, while certainly longer than the one-liner method, is fairly straightforward and easy to modify for other similar ad-hoc replacement tasks later. However, I have to wonder if there’s an easier way?

Posted in perl and general scripting hackery | Tagged: , , | 2 Comments »

grep with a range using a perl one liner.

Posted by peeterjoot on September 23, 2009

Here’s a small shell scripting problem. I have grep output that I’d like filtered further:

# ./displayInfo | grep peeterj
Offline RG:blah_peeterj_0-rg
Offline RG:blah_peeterj_0-rg_MLN-rg
Offline RG:idle_peeterj_998_goo-rg
Offline RG:primary_peeterj_900-rg
Online Equivalency:blah_peeterj_0-rg_MLN-rg_group-equ
Online Equivalency:instancehost_peeterj-equ
        '- Online APP:instancehost_peeterj_goo:goo
Online Equivalency:primary_peeterj_900-rg_group-equ

peeterj is my userid, and the command in question that generates this has output for all the other userids on the system too. I’m only interested in the subset of the info that has my name, hence the grep.

However, once the text Equivalency is displayed I’m not interested in any more of it. I could filter out all the patterns that I’m not interested in by doing something like:

# ./displayInfo | grep peeterj | grep -v -e Equivalency -e instancehost -e ...

but this is a bit cumbersome. An alternative is filtering specifically on what I want

# ./displayInfo | grep -e 'peeterj.*blah' -e 'peeterj.*idle' -e 'primary_peeterj' -e ...

but that also means I have to know and enumerate all such expressions for what I’m interested in. Since I’m on Linux my grep is gnu-grep, so I considered using ‘grep -B N’ to show N lines of text preceding a match, but this also outputs the text I’m not interested in, so it doesn’t really work.

Here’s what I came up with (I’m sure there’s lot of ways, some perhaps easier, but I liked this one). It uses the perl filtering option -n once again, to convert the entire script into a filter (specified here inline with -e instead of in a file) :

# ./displayInfo | perl -n -e 'next unless (/$ENV{USER}/) ; last if ( /Equivalency/ ) ; print ;'

Basically, this one liner is as if I’d written the following perl script to read and process all of STDIN:


while (<>)
{
   my $line = $_ ;
   next unless ($line =~ /$ENV{USER}/) ; # "grep" for my userid (peeterj).

   # Won't get here until I've started seeing the peeterj text.
   last if ( $line =~ /Equivalency/ ) ;         # stop when I've seen enough.

   # If I get this far print the "matched" line as is:
   print $line ;
}

Posted in perl and general scripting hackery | Tagged: , , | Leave a Comment »

A combined application for grep -n ; vim -q ; and perl evaluated regex

Posted by peeterjoot on September 11, 2009

Now that I’ve learned how to use evaluated replacement expressions in perl, it’s become my new favorite tool. Here’s today’s application: using it as a query engine to figure out all the calls of a particular function that I want to look at in the editor and probably modify.

I’m interested in editing a subset of the function calls for the module in a given directory. I can find them and their line numbers with:

grep -n 'printIt.*BLAH' *.C

But there’s 90 of these function calls, and I know most don’t need alteration. If I grep with context, say grabbing 20 lines of context after the search expression, I can see which of these are of interest:

grep -nA20 'printIt.*BLAH' *.C | tee grep.out

I really want to weed out all the calls that contain certain additional expressions. Illustrating by example, a fragment of the grep output above had in it:

foo.C:6197:   printIt( BLAH,
foo.C-6198-          ...
foo.C-6200-          INFORMATIONAL,
foo.C-6205-          ...
foo.C-6210-          ) ;

Any of these calls that happen to have INFORMATIONAL or DUMPIT strings in them aren’t of interest, so I take my pre-canned evaluated regex perl script (see previous posts for an explanation) and modify it slightly.

This time I use:

# cat ./thisFilterScript

while (<>)
{
   $p .= $_ ;
}

$p =~ s/(printIt.*?;)/foo("$1")/smeg ;
print $p ;

exit ;

sub foo
{
   my $s = "@_" ;

   return "" if ( $s =~ /INFORMATIONAL/ or $s =~ /DUMPIT/ ) ;

   return "$s" ;
}
Run this on the grep output, and I’ve now reduced it to just a listing of the calls of interest:

# cat grep.out | ./thisFilterScript > grep.filtered

This is now just the filename:linenumber:output expressions for each of the function calls of interest.

# cat grep.filtered
foo.C:6303:         printIt( BLAH,
foo.C:6344:         printIt( BLAH,
foo.C:10298:   printIt( BLAH,
foo.C:10325:   printIt( BLAH,

I can now simply run ‘vim -q ./grep.filtered’, and I go straight to the line for the first hit (with :cn to get to the next when done with editing that call site).

Posted in perl and general scripting hackery | Tagged: , , , , | Leave a Comment »

sorting on a different field

Posted by peeterjoot on September 3, 2009

Here’s a simple Unix command line task that at least one co-worker here at IBM didn’t know how to do.

I have log files with a timestamp in them, and want to find the earliest one. The timestamp is within the file, so we can grep it out, obtaining something like:

# grep timestamp *.stack
19177.100.000.stack.txt:timestamp: 2009-09-01-
19177.1.000.stack.txt:timestamp: 2009-09-01-
19177.101.000.stack.txt:timestamp: 2009-09-01-
19177.102.000.stack.txt:timestamp: 2009-09-01-
19177.103.000.stack.txt:timestamp: 2009-09-01-
19177.104.000.stack.txt:timestamp: 2009-09-01-
19177.105.000.stack.txt:timestamp: 2009-09-01-
19177.106.000.stack.txt:timestamp: 2009-09-01-

Piping this output through sort doesn’t do what’s desired, since that sorts the whole line, alphabetically by filename. A sort on what follows the space would do the trick. This is a common requirement, and only requires one extra sort parameter:

# grep timestamp *txt | sort -k2  | head -3
19177.177.000.stack.txt:timestamp: 2009-09-01-
19177.158.000.stack.txt:timestamp: 2009-09-01-
19177.86.000.stack.txt:timestamp: 2009-09-01-

The -k2 option on sort says to sort on the second field. Whitespace delimits the sort fields by default. You could change the delimiter if required, but it doesn’t really help here. An example of doing so and getting the same results would be:

# grep timestamp *txt | sort -k3 -t:  | head -3

This says to sort on the third field, delimiting the fields by colons instead of whitespace. You can get really fancy with the sort command line options, specifying secondary and higher sort keys, and different sort modifiers for different keys (numeric for some, increasing or decreasing, …), but knowing how to use just -k and -t will do the trick in many instances. If you want really fancy sorts you are probably better off using perl anyhow, where you can write sort subroutines and have ultimate control.

Posted in perl and general scripting hackery | Tagged: , | Leave a Comment »

A vim -q Plus grep -n. An editor trick everybody should know.

Posted by peeterjoot on July 20, 2009

If you use vi as your editor (and by vi I assume vi == vim), then you want to know about the vim -q option, and grep -n to go with it.

This can be used to navigate through code (or other files) looking at matches to patterns of interest. Suppose you want to look at calls of strchr() that match some pattern. One way to do this is to find the subset of the files that are of interest. Say:

$ grep 'strchr.*ode' sqle*ca*C
sqlecatd.C:                              sres = strchr(SQLZ_IDENT, (Uint8)nodename[i]);
sqlecatn.C:                  if ((sres = strchr(SQLZ_DBNAME, (Uint8)mode[0])) != NULL)
sqlecatn.C:                        sres = strchr(SQLZ_IDENT_AID, (Uint8)mode[i]);

and edit all those files, searching again for the pattern of interest in each file. If there aren’t many such matches, your job is easy and can be done manually. Suppose however that there’s 20 such matches, and 3 or 4 are of interest for editing, but you won’t know till you’ve seen them with a bit more context. What’s an easy way to go from one to the next? The trick is grep -n plus vim. Example:

$ grep -n 'strchr.*ode' sqle*ca*C | tee grep.out
sqlecatd.C:710:                              sres = strchr(SQLZ_IDENT, (Uint8)nodename[i]);
sqlecatn.C:505:                  if ((sres = strchr(SQLZ_DBNAME, (Uint8)mode[0])) != NULL)
sqlecatn.C:518:                        sres = strchr(SQLZ_IDENT_AID, (Uint8)mode[i]);

$ vim -q grep.out

vim will bring you right to line 710 of sqlecatd.C in this case. To go to the next pattern, which will in this case also be in the next file, use the vim command:

:cn

You can move backwards with :cN, and see where you are and the associated pattern with :cc

vim -q understands a lot of common filename/linenumber formats (and can probably be taught more but I haven’t tried that). Of particular utility is compile error output. Redirect your compilation error output (from gcc/g++ for example) to a file, and when that file is stripped down to just the error lines, you can navigate from error to error with ease (until you muck up the line numbers too much).

A small note. If you are grepping only one file, then the grep -n output won’t have the filename and vim -q will get confused. Example:

grep -n 'strchr.*ode' sqlecatn.C
505:                  if ((sres = strchr(SQLZ_DBNAME, (Uint8)mode[0])) != NULL)
518:                        sres = strchr(SQLZ_IDENT_AID, (Uint8)mode[i]);

Here, just include a filename that doesn’t exist in the grep command line:

grep -n 'strchr.*ode' sqlecatn.C blah 2>/dev/null
sqlecatn.C:505:                  if ((sres = strchr(SQLZ_DBNAME, (Uint8)mode[0])) != NULL)
sqlecatn.C:518:                        sres = strchr(SQLZ_IDENT_AID, (Uint8)mode[i]);

I’m assuming here that you don’t have a file called blah in the current directory. The result is something that vim -q will still be happy about.
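Two equivalent tricks avoid relying on a phantom filename: pass /dev/null as the second file (it can never match anything), or, with gnu-grep, force filename output with -H. A toy run:

```shell
printf 'alpha\nbeta\n' > sample.txt

grep -n beta sample.txt /dev/null    # prints: sample.txt:2:beta
grep -nH beta sample.txt             # same output, via the gnu-grep -H flag
```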

Posted in perl and general scripting hackery | Tagged: , | Leave a Comment »