grepping a range of lines using perl.

Posted by peeterjoot on November 27, 2009

I was asked how to use grep to select everything in a file starting with one pattern and ending with a different one. The file is our diagnostic log and, if it originated with one of our system testers, it could be massive (a few hundred thousand lines long). gnu-grep can be used for this. You could do something like:

grep -A 9999999 'some first expression' < wayToBigFile | grep -B 9999999 'some other expression'

Here 9999999 is a number of lines guessed to be big enough to contain all the lines of interest (the real count is not known ahead of time), so the command says “give me everything after the first expression, and then give me everything before the other expression in that output”.
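A minimal sketch of that pipeline on a tiny sample file, to show what it selects. The file contents, the 'BEGIN marker' and 'END marker' patterns, and the grep_range helper name are all made up for illustration.

```shell
# Hypothetical sample log, written to a temp file that is removed on exit.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'noise 1' 'BEGIN marker' 'payload a' 'END marker' \
              'noise 2' 'END marker' 'noise 3' > "$sample"

# grep_range START END FILE (hypothetical helper wrapping the pipeline above):
# keep everything from the first START onward, then everything up to an END.
grep_range() {
    grep -A 9999999 "$1" < "$3" | grep -B 9999999 "$2"
}

grep_range 'BEGIN marker' 'END marker' "$sample"
```

On this sample the output runs from 'BEGIN marker' through the second 'END marker', with 'noise 2' included. Because -B gives leading context for every match, the pipeline keeps going to the last occurrence of the second pattern rather than stopping at the first.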

Perl gives you a nicer way of doing this:

#!/usr/bin/perl -n

$foundIt = 1 if ( /some first expression/ ) ;

next if ( !$foundIt ) ;

print ;

exit if ( /some other expression/ ) ;

# done.

The -n flag says to run the whole script as if it were inside a ‘while (<>){ … }’ loop. Until the initial pattern is seen, $foundIt is false and nothing is printed; once the second pattern is seen, that line is printed and we bail. Note that this relies on perl’s lazy variable initialization: $foundIt is undefined, and therefore false, until modified.

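Observe also that this script is its own test case: its source contains both patterns, so running it on itself prints everything from the line with the first pattern through the line with the second.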

myPrompt$ chmod 755 ./theScript
myPrompt$ ./theScript < ./theScript
$foundIt = 1 if ( /some first expression/ ) ;

next if ( !$foundIt ) ;

print ;

exit if ( /some other expression/ ) ;

This entry was posted on November 27, 2009 at 10:29 am and is filed under perl and general scripting hackery. Tagged: grep, perl.

6 Responses to “grepping a range of lines using perl.”

Garfield Lewis said

November 27, 2009 at 10:47 am
Tried the perl script and it works like a charm.

Thx, Peeter…

Robin Bate Boerop said

November 27, 2009 at 11:44 am
“The script is actually also its own testcase” – nice!

Jacobo de Vera said

December 1, 2009 at 9:31 am
You can also use sed for this:

sed '/some first expression/,/some other expression/!d' file

Even with line numbers (this shows lines from 12 to 20):

sed '12,20!d' file

- peeterjoot said
  
  December 1, 2009 at 10:15 am
  With the perl code you can do more complex stuff on what was matched if you want, but if plain old extraction is the goal this sed method is definitely superior.
  
  Thanks for the tip.
  

Jotr said

September 19, 2010 at 1:53 pm

Sounds like you really want the range operator:

http://perldoc.perl.org/perlop.html#Range-Operators

peeterjoot said

September 20, 2010 at 3:28 pm

Hey, thanks Jotr. I was hoping that by posting this, somebody would eventually supply a better and easier way. This works nicely for a one-liner range grep, as in the following:

# perl -n -e 'next unless ( /E449758E484/ .. /db2spcat/ ) ; print ;' db2diag.log
2010-09-20-16.14.21.168949-240 E449758E484           LEVEL: Event
PID     : 3133                 TID  : 47625558550848 KTID : 5843
PROC    : db2sysc 0
INSTANCE: peeterj              NODE : 000          DB   : CORAL
APPHDL  : 0-54                 APPID: *N0.peeterj.100920201417
AUTHID  : PEETERJ
EDUID   : 24                   EDUNAME: db2agent (CORAL) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FirstConnect, probe:1000
START   : DATABASE: CORAL    : ACTIVATED: NO

2010-09-20-16.14.21.748602-240 I450243E460           LEVEL: Warning
PID     : 3133                 TID  : 47625558550848 KTID : 5843
PROC    : db2sysc 0
INSTANCE: peeterj              NODE : 000          DB   : CORAL
APPHDL  : 0-54                 APPID: *N0.peeterj.100920201417
AUTHID  : PEETERJ
EDUID   : 24                   EDUNAME: db2agent (CORAL) 0
FUNCTION: DB2 UDB, base sys utilities, db2spcat, probe:173

Do you know if it is possible to combine a range boundary with a number, as in something like:

next unless ( /E449758E484/ .. /db2spcat/+10 ) ;

I was thinking along the lines of gnu-grep -A10 (in this case I wanted to grep for everything in a range delimited by a pair of patterns), but with a few extra lines of context. Trying that with /db2spcat/+10 doesn’t appear to work.


Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.
