I adore ripgrep and use it dozens of times a day, and have for years. It's extremely fast, does the right thing most of the time, and has a useful feature set.
Ack is also nice, I've used that quite a bit too. It has the advantage of being in Perl, so if you're on a "secure" computer (no compiler), you can still use a fast + featureful search tool.
I'm glad you appreciate that you can install ack anywhere. It's exactly for that use case that I've kept ack distributable as a single text-file download. Also, it only requires Perl 5.10.1, so it's OK if you're using an old Perl.
Hmm, there is a project that lets you compile a single binary that is cross-platform across Linux, Mac, and Windows. I wonder whether ripgrep can be compiled that way; that would bring it very close to the portability you have with ack.
That still might run afoul of locked down networks like I've seen at banks. Users couldn't install any binaries at all, but with something like ack it's just cut & paste some text into a text file. ack can literally be a select-all, Ctrl-C, switch windows, "cat > ack", Ctrl-V.
I feel the same way about ripgrep, and also fzf, especially the two in combination. I only started using them a few months ago, yet they already feel like a fundamental part of how I do computing.
fzf is wonderful. I feel like we're only scratching the surface of its utility. Using it with git add is so fast: just "ga" (an alias), hit Tab for each file, then Enter.
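The alias itself isn't shown here, but a minimal sketch of that kind of setup (assuming bash/zsh, fzf's multi-select mode, and GNU xargs; the "ga" name is the commenter's) might look like:

    # hypothetical "ga" function: pick modified/untracked files with fzf,
    # Tab to mark several, Enter to stage them
    ga() {
      git ls-files --modified --others --exclude-standard \
        | fzf --multi --preview 'git diff --color=always -- {}' \
        | xargs -r git add --
    }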
Figuring out how to integrate that with FZF would be really something. Being able to easily go up and down the list, and visualize the whole thing, would make things a lot smoother.
Some of these line items are really obtuse and in some cases just not right. "Don't search in binary files" and "Treat binary files as if they were text", for example, caught my eye: GNU grep has the `--binary-files` option, which supports both of these features. Others, like "can pipe output to a pager", seem like a half-hearted attempt to give a +1 to a specific tool while ignoring that you can... pipe the output of any of them using... a pipe.
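For reference, the two behaviors called out above map onto GNU grep like this:

    # "Don't search in binary files" (also available as -I)
    grep -r --binary-files=without-match pattern .

    # "Treat binary files as if they were text" (also available as -a)
    grep -r --binary-files=text pattern .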
If you follow the "If you have updates to the chart, please submit as a GitHub issue." link, you can see that there are a few dozen open issues and the page was last updated 2 years ago.
Ag (the_silver_searcher) has performance closer to ripgrep and a similar feature set. But it’s tough to beat ripgrep both for performance and reliability. Rust is great in this application.
I occasionally find myself wanting to search a data stream using a large-ish set (a few tens or hundreds of thousands) of regexes. This is very slow with a backtracking engine like PCRE, but ought to be pretty fast with a DFA-based engine like re2.
So far, I have been unsuccessful in finding a grep replacement that can read patterns from a file, and which also uses a DFA engine. Does one exist? From the table, it looks like ripgrep might be suitable. Is it?
Precisely speaking, no, I don't know of any grep tools that use a pure DFA engine. However, both ripgrep and GNU grep use a hybrid NFA/DFA engine (also known as a "lazy DFA") for some subset of regexes. I'm not too familiar with all of GNU grep's strategies, but for ripgrep, when a regex is too big for the lazy DFA, it will fall back to an NFA engine. (And I don't mean Friedl's bastardization of the term "NFA engine.") For ripgrep, see the --dfa-size-limit flag to try to let it use the hybrid NFA/DFA engine for bigger regexes. Whether it helps or not depends on your situation.
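For the "read patterns from a file" part of the question, the invocation would look something like this (the 1G limit is just an illustrative value to tune for your pattern set):

    # search with a large set of patterns, raising the lazy-DFA cache limit
    rg --dfa-size-limit 1G -f patterns.txt some-big-file.log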
Now, this will do much better than a backtracking engine, but if you get up into the tens of thousands or hundreds of thousands of regexes, it's going to get pretty painful. Finite automata just don't scale that well. At that point, you really start wanting a more specialized solution. Probably the best answer to that that I know of is Hyperscan. And you're in luck; someone maintains a fork of ripgrep with support for Hyperscan: https://sr.ht/~pierrenn/ripgrep/
(A special case is tens of thousands of literal patterns. ripgrep will notice that and should use Aho-Corasick. It doesn't help so much with search time since it's just an NFA or a DFA like with regexes, but the machine itself is constructed much more quickly.)
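If your patterns really are plain strings, you can make that explicit rather than relying on literal detection (a small sketch; words.txt is a hypothetical one-pattern-per-line file):

    # force fixed-string matching so the whole set is handled as literals
    rg -F -f words.txt some-big-file.log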
It sounds like either plain ripgrep, or ripgrep+hyperscan, is pretty much exactly what I'm looking for. Next time I have this problem, I'll certainly be reaching for it.
Too much of a weighty dependency and too much of a niche IMO. For example, the last time I tried to build Hyperscan, I failed and gave up after 15 minutes of trying.
I would have thought you just need to include the rust-hyperscan crate[1] which would take care of that for you (but that crate probably didn't exist when you looked at it). I don't have a sense on the impact it has on overall binary size.
I don't think the existence of a crate or not really impacts anything I said. More to the point, it would put a reliance on someone else to maintain a crate for critical functionality in ripgrep. (And if that fell through, I would invariably need to pick up that burden. Removing functionality is a lot harder than adding it.)
It makes a lot more sense to me for something like Hyperscan to be maintained out of tree. I did work with the patch author a bit, and in particular, made some changes to ripgrep to make maintaining such a fork easier: https://github.com/BurntSushi/ripgrep/issues/1488
Bottom line is, a lot of people think that adding a dependency has nearly zero cost. But it doesn't. Not by a long shot.
I've written this code (in C++) for an employer. RE2 scaled fine to hundreds of thousands of regexes. You'll want to use RE2::Set, which compiles multiple regexes into a single DFA, and probably the "Filter" functionality (whose name I don't precisely remember and am too lazy to look up) which uses an Aho-Corasick tree to subset the potential matches. One thing you'll have to watch out for is RE2's maximum DFA size; if compilation of your RE2::Set fails, just split your set of regexes in half and compile again.
You could probably do some fun optimizations by grouping the regexes which depend on the same literals into their own sets, but I never needed to.
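A minimal sketch of the RE2::Set approach described above (C++, linking against re2; the pattern strings and input are just placeholders):

    #include <re2/re2.h>
    #include <re2/set.h>
    #include <string>
    #include <vector>

    int main() {
      // compile many patterns into a single set (one underlying automaton)
      RE2::Set set(RE2::DefaultOptions, RE2::UNANCHORED);
      std::string err;
      set.Add("foo\\d+", &err);
      set.Add("bar(baz)?", &err);
      if (!set.Compile()) {
        // too big: split the pattern list in half and compile two sets
        return 1;
      }
      std::vector<int> matched;
      if (set.Match("a line with foo123 in it", &matched)) {
        // "matched" holds the indices of the patterns that matched
      }
      return 0;
    }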
This is basically what ripgrep will do for you automatically. (ripgrep uses Rust's regex engine, which is a descendant of RE2.) But when you get up into hundreds of thousands of regexes, the NFA (and the resulting DFA) get really big. And things generally don't scale that well. Here's a good example: http://web.archive.org/web/20210302010420/https://01.org/hyp...
The problem is that for a big enough NFA, you'll wind up spending most of your search doing powerset construction to build the DFA.
> One thing you'll have to watch out for is RE2's maximum DFA size
You can configure this in ripgrep with the --dfa-size-limit flag. (See also --regex-size-limit.)
I originally had it as a "phrasebook" of how to do the same thing in the different tools, but it was really ugly and took up a lot of horizontal space, and I figured it was more useful as a chart of yes/no. Also, there were cases where two tools had pretty much the same feature, but not exactly, so just listing flags didn't make sense.
I've still got a lot of the data about the switches in the JSON file that I build the chart from. https://github.com/beyondgrep/website/blob/dev/features.json If you've got ideas on how to bring back the phrasebook format, either integrated into this page, or as a separate standalone page, I'd love to hear them. Maybe the phrasebook isn't best done as a table like this, for example. Open a ticket in GitHub and let me know your thoughts.
I use ripgrep, most typically from within Emacs thanks to "counsel-rg". I configured counsel-rg (as suggested) not to display very long matching lines (Emacs doesn't like lines that are too long).
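The long-line suppression ultimately happens in ripgrep itself; a sketch of the underlying flags that such a configuration typically passes (the exact counsel-rg settings vary):

    # omit matching lines longer than 150 columns, showing a preview instead
    rg --max-columns 150 --max-columns-preview pattern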
I wonder how long the documentation for GNU grep will continue to say this:
>PCRE support is here to stay, but consider this option experimental when combined with the -z (--null-data) option, and note that ‘grep -P’ may warn of unimplemented features.
I did come across a few issues mentioned with -z on unix.stackexchange a few years back but they have been fixed as far as I know.
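For anyone wondering what the combination looks like in practice, this is the kind of multi-line PCRE search the warning is about (a small sketch):

    # -z reads NUL-separated records (so the whole input is one "line"),
    # -P enables PCRE, -o prints just the match
    printf 'foo\nbar\n' | grep -Pzo 'foo\nbar'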
"Print lines by number" is a vague thing to say, particularly since the comparison later includes, "Print specific lines by number".
However, the grep -Hn feature is described in this comparison as, "Prefix the line number to matching lines"
One thing that can help people to compare this sort of tool is to pair technical descriptions like command line parameters with the natural language explanation. If tool foo has a feature you're describing as "Prevent cheesecake" then I have no idea if my tool bar can do that, whereas if you say this is -Xqm then I can read the documentation and discover that I call this "disable refrigerated dessert" and it's -VQb so yes, my tool does this too.
I spent some time recently reading P2214, one of the proposals to fix/extend C++ ranges, and because this general idea is so common it often discusses Haskell, Rust, or even Python. If you're experienced in a language, you already know whether it would spell something FlatMap, flat_map, or flatMap, but you might not guess that C++ people would call your filter_map by the name transform_maybe. Likewise, as a C++ programmer who has barely dipped their toe in Haskell, you wouldn't know that Haskell doesn't use the word "transform" in this context, and without being told what it's called you won't find the relevant documentation, let alone be able to try it for yourself and appreciate what it's for.
The "print lines by number" was there because earlier versions of ack had a `--line=N` feature, where you could say "ack --line=15-18" and print those four lines. I dropped it because it was hardly any better than using sed.
If you've got suggestions on improvements, please submit an issue. I'd love to hear them.
I agree. I use grep in conjunction with find and parallel to achieve a number of features not native to grep that are built-in to these other tools.
I prefer tools designed with the "do one thing well" philosophy. It lets me scale my knowledge. I can solve many problems with find and parallel that these grep clones don't support.
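A small sketch of that kind of composition (assuming GNU find and GNU parallel; the glob is just an example):

    # grep many files in parallel, letting find do the traversal and filtering
    find . -type f -name '*.log' -print0 \
      | parallel -0 grep -Hn 'pattern' {}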
ripgrep degrades just fine to a normal grep tool. And you can use it in `find` pipelines.
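For example (a minimal sketch; rg accepts explicit file arguments just like grep does):

    # let find pick the files and hand them to ripgrep
    find . -name '*.toml' -exec rg -n 'pattern' {} +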
> I prefer tools designed with the "do one thing well" philosophy
This is pretty unlikely. For example, you probably use a grep tool that will also do recursive directory traversal for you. It probably even has flags for defining filters on that traversal. Why use such a tool when `find` already does recursive directory traversal for you?
"grep | head" doesn't limit the length of output lines, but "grep | cut" would.
However, ack's and ripgrep's default unpiped output is grouped by file, and when you pipe the output, that grouping goes away.
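If you want the grouped output even through a pipe, ripgrep can be told to keep it (a sketch; ack has a similar --group flag):

    # force the grouped-by-file output (and colors) even through a pipe
    rg --heading --color=always pattern | less -R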
The idea of "supposed to be used" is also different for ack than it is for grep. ack is specifically less of a general-use tool than grep. It's meant for searching source code. This is also why I have never said that ack is a replacement for grep.