How can I use git grep with regular expressions?

Time:02-17

I have used git grep for years to search for fixed strings, but I haven't used it much for regular expression searches.

I have places in the code with non-localized strings. For example:

   JLabel label =  buildLabel("Alphabet");

In this case buildLabel() is an inherited utility method. There are also buildBoldLabel(), buildMultiLineLabel(), and buildTextArea().

So I would like to search my code for calls to these methods that pass a string literal directly instead of looking up the localized string. The correct call should be:

   JLabel label =  buildLabel(getString("Alphabet"));

I am very familiar with regular expressions and I see that git grep supports Perl character classes. So I figured that it would be very easy:

$ git grep -P "buildLabel(\"\w+\")"

This returns no results. So I tried it without the Perl extension.

$ git grep "buildLabel(\"[a-zA-Z_]+\")"

Still ... no results. I verified that I could search with a fixed string.

$ git grep "buildLabel(\"Alphabet\")"

That returned the instance in the code that I already knew existed. However ...

$ git grep -P "buildLabel(\"Alphabet\")"

Returns no results.

I also tried changing the quote characters and got the same results.

$ git grep -P 'buildLabel("\w+")' ... no results

$ git grep -P 'buildLabel("Alphabet")' ... no results

$ git grep 'buildLabel("Alphabet")' ... 1 expected result

I tried on Linux with the same results.

UPDATE:

Thanks to @wiktor-stribiżew for commenting that with PCRE the parens need to be escaped (I am always confused by that).

$ git grep -P 'buildLabel\("\w+"\)' ... returns 1 expected result.

However, why don't these work?

$ git grep 'buildLabel("[a-zA-Z_]+")'

$ git grep 'buildLabel\("[a-zA-Z_]+"\)'

$ git grep 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)' (in case + isn't implemented)


So what am I doing wrong with git grep? Or is it broken?

FYI: I am using git version 2.35.1 from Homebrew on macOS Big Sur.

Answer:

Regex vs. fixed string search

Please refer to the git grep help:

-E
--extended-regexp
-G
--basic-regexp
Use POSIX extended/basic regexp for patterns. Default is to use basic regexp.

So, by default, git grep treats the pattern string as a POSIX BRE regex, not as a fixed string.

To make git grep treat the pattern as a fixed string you need -F:

-F
--fixed-strings
Use fixed strings for patterns (don’t interpret pattern as a regex).
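
To see the difference between the default mode and -F, here is a minimal sketch. It assumes git is installed; the throwaway repository, the file name Demo.java, and the check helper are made up for illustration. The pattern replaces the "L" with a dot, which only matches when the dot is treated as a regex wildcard.

```shell
# Throwaway repository with one line to search (all names are illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf '%s\n' 'JLabel label = buildLabel("Alphabet");' > Demo.java
git add Demo.java

# Hypothetical helper: prints "match" or "no match" for the given git grep arguments.
check() {
  if git grep -q "$@"; then echo "match"; else echo "no match"; fi
}

# Default (POSIX BRE): "." is a wildcard, so it matches the "L" in buildLabel.
bre_result=$(check 'build.abel("Alphabet")')

# -F: "." is a literal dot, which the line does not contain.
fixed_result=$(check -F 'build.abel("Alphabet")')

echo "default: $bre_result"
echo "-F:      $fixed_result"
```

The default search should report a match and the -F search should not, which shows that the unqualified git grep was already doing a regex search.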

Regex issues

You can enable PCRE regex syntax with the -P option, and in that case you should refer to the PCRE documentation.

In your git grep -P "buildLabel(\"\w+\")", the parentheses must be escaped in order to be matched as literal parentheses, i.e. it should be git grep -P "buildLabel\(\"\w+\"\)".

In git grep 'buildLabel("[a-zA-Z_]+")', you are using POSIX BRE, where + is parsed as a literal character, not as a one-or-more quantifier. You can use git grep 'buildLabel("[a-zA-Z_]\{1,\}")' in POSIX BRE instead. If it is a GNU grep, you could use git grep 'buildLabel("[a-zA-Z_]\+")' (not sure it works with git).

The git grep 'buildLabel\("[a-zA-Z_]+"\)' does not work because in POSIX BRE \(...\) (an escaped pair of parentheses) defines a capturing group and thus does not match literal parentheses.
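
The BRE behaviour described above can be checked with a small sketch. It assumes git is installed; the temporary repository, Demo.java, and the check helper are hypothetical names used only for this demonstration.

```shell
# Throwaway repository with one line to search (all names are illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf '%s\n' 'JLabel label = buildLabel("Alphabet");' > Demo.java
git add Demo.java

# Hypothetical helper: prints "match" or "no match" for the given git grep arguments.
check() {
  if git grep -q "$@"; then echo "match"; else echo "no match"; fi
}

# Unescaped + is a literal plus sign in BRE, and the line contains none.
bre_plus=$(check 'buildLabel("[a-zA-Z_]+")')

# \{1,\} is the BRE spelling of "one or more".
bre_interval=$(check 'buildLabel("[a-zA-Z_]\{1,\}")')

# \( ... \) is a group in BRE, so the pattern no longer contains
# literal parentheses and cannot match the call.
bre_group=$(check 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)')

echo "literal +:  $bre_plus"
echo "\\{1,\\}:     $bre_interval"
echo "\\( \\) group: $bre_group"
```

Only the \{1,\} variant should match here; the other two fail for the reasons given above.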

The git grep -e 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)' is still POSIX BRE. To make it POSIX ERE, you need the -E option: git grep -E 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)'. Or use git grep -E 'buildLabel\("[a-zA-Z_]+"\)', since the unescaped + is a quantifier in POSIX ERE.
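
As a sketch of how -E could be applied to the original goal, the example below searches for all four helper methods at once. It assumes git is installed; the repository, Demo.java, and the sample lines are invented for illustration, and the alternation pattern is one possible approach rather than something the question confirmed.

```shell
# Throwaway repository: two non-localized calls and one correct call
# (file name and sample lines are illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf '%s\n' \
  'JLabel a = buildLabel("Alphabet");' \
  'JLabel b = buildLabel(getString("Alphabet"));' \
  'buildTextArea("Notes");' > Demo.java
git add Demo.java

# ERE: unescaped + is a quantifier and \( \) are literal parentheses.
single=$(git grep -c -E 'buildLabel\("[a-zA-Z_]+"\)')

# Alternation covers all four helpers; requiring a quote right after the
# opening parenthesis skips calls wrapped in getString(...).
all=$(git grep -c -E 'build(Label|BoldLabel|MultiLineLabel|TextArea)\("[^"]*"\)')

echo "buildLabel only: $single"
echo "all helpers:     $all"
```

With -c, git grep prints the file name and the number of matching lines, so the first search should count one line and the second should count two, leaving the getString call out of both.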

Also, see What special characters must be escaped in regular expressions?
