I am using grep to count the occurrences of a particular string in my code, but grep is not counting the occurrences which span more than one line.
I am trying to find occurrences of (`, including the ones which look like
(
`
Basically, the backtick is on the next line.
So far I have tried:
grep -roh -E "\(\s*\`" . | wc -l
But it doesn't count them. Even this gives 0:
grep -roh -E "\(\n" . | wc -l
What would be the solution to this?
CodePudding user response:
find . -type f -exec cat {} + | tr -d '[:space:]' | grep -oF '(`' | wc -l
find concatenates the contents of all files into a stream
tr reads the stream and strips all whitespace
grep outputs each occurrence of the string (-o is a GNU extension)
wc counts them
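As a quick sanity check, the pipeline can be tried on a throwaway directory. This is only a sketch: the scratch directory and the file names a.txt and b.txt are made up for illustration. Note that because all whitespace is deleted, this approach also counts a parenthesis and a backtick separated by several newlines, and could in principle match across the boundary between two files.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
dir=$(mktemp -d)
printf 'abc\ntext (\n`\ndef\n' > "$dir/a.txt"   # one match spanning two lines
printf 'ghi ( ` (` jkl\n'      > "$dir/b.txt"   # two matches on one line

# Concatenate every file, delete all whitespace, then count literal "(`".
count=$(find "$dir" -type f -exec cat {} + | tr -d '[:space:]' | grep -oF '(`' | wc -l)
echo "$count"

rm -rf "$dir"
```

With these two files the count should be 3.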
CodePudding user response:
The following assumes that the strings you want to count start with an opening parenthesis, followed by optional blanks, and end with a backtick, with at most one newline among the blanks. We can use sed (tested with GNU sed) to remove the newlines before passing everything to grep and wc:
$ s='abc
text (
`
def
text (
`
ghi ( ` (` jkl
'
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s"
abc
text (`
def
text (`
ghi ( ` (` jkl
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s" |
grep -Eo '\(\s*`'
(`
(`
( `
(`
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s" |
grep -Eo '\(\s*`' | wc -l
4
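The sed script uses the -z option, which makes sed separate records by NUL characters instead of newlines, so text without NUL characters is processed as a single record and \n can be matched. The script replaces every opening parenthesis that is followed by optional blanks, a newline, optional blanks and a backtick with just an opening parenthesis followed by a backtick, and loops (ta jumps back to the label a) as long as a substitution was made.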
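To see what -z changes, here is a minimal comparison, assuming GNU sed. It uses a reduced pattern without the (.*) captures and without the loop, purely to isolate the effect of the option; it is not the script above.

```shell
s='text (
`'

# Without -z, sed reads one line at a time, so \n never appears in the
# pattern space and nothing is joined: the output still has two lines.
without_z=$(printf '%s\n' "$s" | sed -E 's/\([[:blank:]]*\n[[:blank:]]*`/(`/' | wc -l)

# With -z (GNU sed), the whole input is one record, the newline can be
# matched, and the two lines collapse into one.
with_z=$(printf '%s\n' "$s" | sed -Ez 's/\([[:blank:]]*\n[[:blank:]]*`/(`/' | wc -l)

echo "$without_z $with_z"
```

The first count stays at 2 lines, the second drops to 1.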
To apply this to all files under the current directory, use find to concatenate them and pipe the result to sed:
$ find . -type f -exec cat {} \; |
sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' |
grep -Eo '\(\s*`' | wc -l
1257