I had a 2M-line file that contained, among other things, function identifier strings such as:
SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementGetServerRole
SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementHandleClose
SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementHandleOpen
I wanted to extract just these and sort them by name for something else. I’d first tried this in vim, but it was taking too long. Eventually I control-C’ed it and realized I had to be a bit smarter about it. I figured something like perl would do the trick, and I was able to extract those strings easily with:
cat flw.* | perl -p -e 's/.*?(\S+::\S+).*/$1/;'
(i.e., grab just the not-space::not-space text and spit it out). Passing this to ‘sort -u’ was also taking quite a while. Here’s a slightly smarter way to do it, still a one-liner:
cat flw.* | perl -n -e 's/.*?(\S+::\S+).*/$h{$1}=1/e; END{ foreach (sort keys %h) { print "$_\n" ; } } '
All the duplicates are automatically discarded by inserting each matched value into a hash instead of just spitting it out; a simple loop over the hash keys then gives the result directly. For the data in question, this reduced the time for the whole operation to just 12.5 seconds (I eventually ran the original ‘perl -… | sort -u’ in the background and found it would have taken 1.6 minutes). It took far less time to tweak the command line than the original command would have taken to finish, and it’s a nice example of where an evaluated expression (the /e modifier) in the substitution can be handy.
Of course, I then lost my time savings by writing up these notes for posterity ;)