grab: grep faster

Remember The Silver Searcher? grab is another grep alternative that aims to be faster by using multiple cores. Its author uses the following techniques:

  • Parallel processing
  • Uses mmap(2) with MAP_POPULATE and matches the whole file blob without counting newlines
  • If available, grab also uses the PCRE JIT feature
  • grab skips files which are too small to contain a match for the regular expression
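
To make the mmap and small-file ideas from the list above more concrete, here is a minimal Python sketch. It is not grab's code (grab is a native program using PCRE); the function name search_blob and the min_len parameter are made up for illustration, and the caller is assumed to know the minimum match length of the pattern. MAP_POPULATE is only used when the running Python exposes it (Linux, Python 3.10 or newer), and the flags/prot arguments are Unix-only.

```python
import mmap
import os
import re


def search_blob(path, pattern, min_len=1):
    """Return the byte offsets of all matches of a bytes regex in a file.

    min_len is a hypothetical, caller-supplied minimum match length;
    files shorter than that are skipped without being opened.
    """
    size = os.path.getsize(path)
    if size < max(min_len, 1):
        # Too small to contain a match. This also avoids mapping an
        # empty file, which mmap refuses to do.
        return []

    # Ask the kernel to pre-fault the pages where supported; fall back
    # to a plain private mapping elsewhere.
    flags = mmap.MAP_PRIVATE | getattr(mmap, "MAP_POPULATE", 0)

    with open(path, "rb") as f:
        blob = mmap.mmap(f.fileno(), 0, flags=flags, prot=mmap.PROT_READ)
        try:
            # Match against the whole mapping as one blob; no splitting
            # into lines and no newline counting.
            offsets = [m.start() for m in re.finditer(pattern, blob)]
        finally:
            blob.close()
    return offsets
```

Calling search_blob("/var/log/syslog", rb"error", min_len=5) would return a list of byte offsets rather than line numbers, which mirrors the "no newline counting" approach: offsets are cheap to report, line numbers are not.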

However, the speedup for a single file is negligible. The performance boost only becomes measurable on fast storage such as SSDs.

grab is designed to find string matches in large directory trees. However, it doesn’t support as many options as grep, cannot be used in a pipeline and doesn’t read from stdin (which cannot be mmapped).
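
The general idea of searching a directory tree with several workers can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses a thread pool and a plain substring count, which is not necessarily how grab distributes its work, and the names count_matches and search_tree are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_matches(path, needle):
    """Return (path, occurrences of needle), or (path, None) if unreadable."""
    try:
        with open(path, "rb") as f:
            return path, f.read().count(needle)
    except OSError:
        return path, None


def search_tree(root, needle, workers=4):
    """Search every regular file below root, using `workers` threads."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    ]
    # Anything <= 1 degenerates to a single worker.
    with ThreadPoolExecutor(max_workers=max(workers, 1)) as pool:
        results = list(pool.map(lambda p: count_matches(p, needle), paths))
    # Keep only files with at least one hit.
    return {path: n for path, n in results if n}
```

Because the files are independent of each other, the work splits naturally per file. Whether extra workers actually help depends on how fast the storage can deliver data.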

grab maps files in chunks of 1 GB. For larger files, the last 4096 bytes (one page) of each chunk overlap with the next chunk, so that matches on a 1 GB boundary can be found. For such boundary matches, the results will show two entries with the same offset.
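
The overlap scheme, and why it produces duplicate entries, can be illustrated with a small Python sketch. It works on an in-memory bytes object with tiny chunk sizes so the effect is easy to see. The function names chunk_windows and find_offsets are hypothetical, and the exact window arithmetic is an assumption based on the description above rather than grab's actual code.

```python
def chunk_windows(file_size, chunk=1 << 30, overlap=4096):
    """Yield (start, length) windows; neighbours share `overlap` bytes."""
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must be non-negative and smaller than chunk")
    start = 0
    while start < file_size:
        length = min(chunk, file_size - start)
        yield start, length
        if start + length >= file_size:
            break  # this window reached the end of the file
        # The next window begins `overlap` bytes before this one ended.
        start += chunk - overlap


def find_offsets(data, needle, chunk, overlap):
    """Search each window separately and report absolute offsets."""
    hits = []
    for start, length in chunk_windows(len(data), chunk, overlap):
        window = data[start:start + length]
        pos = window.find(needle)
        while pos != -1:
            hits.append(start + pos)
            pos = window.find(needle, pos + 1)
    return hits


# The match at offset 3 lies in the overlap of the first two windows,
# so it is reported twice with the same offset.
hits = find_offsets(b"aaaXYbbbbb", b"XY", chunk=6, overlap=3)
unique_hits = sorted(set(hits))
```

In this sketch, a match that lies entirely inside the shared region is seen by both windows, which is where the duplicate entries come from. Deduplicating by offset, as unique_hits does, is one simple way a caller could clean that up.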

Installation

Compile grab from source to use it:

$ git clone https://github.com/stealth/grab.git
$ cd grab
$ make

grab requires a recent PCRE library; on some older systems the build can fail due to missing PCRE_INFO_MINLENGTH and pcre_study().

Usage

Options:

-O     -- print file offset of match
-l     -- do not print the matching line (Useful if you want
          to see _all_ offsets; if you also print the line, only
          the first match in the line counts)
-I     -- enable highlighting of matches
-c <n> -- Use n cores in parallel (useless and even slower in most
          situations)
          n <= 1 uses single-core
-r     -- recurse on directory
-R     -- same as -r

On GitHub: grab
