Remember The Silver Searcher? grab is another grep alternative that aims to be faster by using multiple cores. The author uses the following techniques:
- Parallel processing
- Uses mmap(2) with MAP_POPULATE and matches the whole file blob without counting newlines
- If available, grab also uses the PCRE JIT feature
- grab skips files which are too small to contain the regular expression
However, the speedup for a single file is negligible. The performance boost becomes measurable on faster hardware such as SSDs.
grab is designed to find string matches in large directory trees. However, it doesn’t support as many options as grep, cannot be used in a pipe, and doesn’t work on stdin (which cannot be mmapped).
grab uses mmapped chunks of 1 GB. For larger files, the last 4096 bytes (1 page) of a chunk are overlapped, so that matches on a 1 GB boundary can be found. For these boundary matches, the results will show two entries with the same offset.
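To make the techniques above more concrete, here is a minimal Python sketch of the same ideas. It is not grab's actual code. The names `search_file` and `search_tree` and the `min_len` parameter are hypothetical, and the minimum match length is passed in by the caller rather than derived from the pattern. The mapping flags are Unix-only, and `MAP_POPULATE` is requested only where Python exposes it (Linux, Python 3.10+). Threads are used for brevity, so in CPython this shows the structure of the approach rather than a real multi-core speedup.

```python
import mmap
import os
import re
from concurrent.futures import ThreadPoolExecutor

# Assumption: fall back to 0 where the platform does not offer MAP_POPULATE.
MAP_FLAGS = mmap.MAP_PRIVATE | getattr(mmap, "MAP_POPULATE", 0)

def search_file(path, pattern, min_len):
    """Return (path, offsets) of all matches of a compiled bytes pattern."""
    try:
        size = os.path.getsize(path)
        if size == 0 or size < min_len:
            return path, []  # file is too small to contain a match
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, flags=MAP_FLAGS,
                           prot=mmap.PROT_READ) as blob:
                # Match against the whole mapped blob, no line splitting.
                offsets = [m.start() for m in pattern.finditer(blob)]
        return path, offsets
    except (OSError, ValueError):
        return path, []  # unreadable or unmappable file: skip it

def search_tree(root, regex, min_len=1, workers=4):
    """Search every file below root; regex must be a bytes pattern."""
    pattern = re.compile(regex)
    paths = [os.path.join(d, name)
             for d, _, names in os.walk(root) for name in names]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: search_file(p, pattern, min_len), paths)
        return {path: offs for path, offs in results if offs}
```

Calling `search_tree("/some/dir", rb"needle", min_len=6)` would return a dictionary that maps each matching file to the byte offsets of its matches.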
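The overlap arithmetic can be sketched in a few lines of Python. This assumes that each chunk starts 4096 bytes before the previous one ends, which is one reading of the description above and not taken from grab's source. The helper names `chunk_windows` and `find_offsets` are hypothetical, and the demo uses a plain bytes object with tiny chunk sizes instead of real 1 GB mappings.

```python
CHUNK = 1 << 30   # 1 GB per mapping, as described in the text
OVERLAP = 4096    # one page shared between neighbouring chunks

def chunk_windows(file_size, chunk=CHUNK, overlap=OVERLAP):
    """Yield (offset, length) windows that cover file_size bytes."""
    if file_size <= 0:
        return
    offset = 0
    while True:
        length = min(chunk, file_size - offset)
        yield offset, length
        if offset + length >= file_size:
            break
        # The next window begins `overlap` bytes before this one ends.
        offset += chunk - overlap

def find_offsets(data, needle, chunk, overlap):
    """Search each window separately and report absolute offsets."""
    hits = []
    for off, length in chunk_windows(len(data), chunk, overlap):
        window = data[off:off + length]
        pos = window.find(needle)
        while pos != -1:
            hits.append(off + pos)
            pos = window.find(needle, pos + 1)
    return hits

# A match inside the overlap region is seen by both windows:
data = b"a" * 10 + b"XY" + b"a" * 10
print(find_offsets(data, b"XY", chunk=16, overlap=8))  # [10, 10]
```

The duplicate entry in the output corresponds to the two results with the same offset mentioned above. In this sketch, a match longer than the overlap that straddles a chunk boundary would still be missed.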
Compile grab from source to use it:
$ git clone https://github.com/stealth/grab.git
$ cd grab
$ make
grab relies on a recent PCRE library; on some older systems the build can fail because of PCRE_INFO_MINLENGTH and pcre_study().
Options:
-O -- print file offset of match
-l -- do not print the matching line (useful if you want to see _all_ offsets; if you also print the line, only the first match in the line counts)
-I -- enable highlighting of matches
-c -- use n cores in parallel (useless and even slower in most situations); n <= 1 uses single-core
-r -- recurse on directory
-R -- same as -r
On GitHub: grab