Why is grep not finding words starting with an underscore?

Time:10-18

I have words like

MEdIa
media
MEDIA
mEdIa
_media_
_media
media_
ICP_MEDIA

in a file. I am trying to grep for the keyword media with the command below:

grep -irwE "media|*_media"

But grep finds only

MEdIa
media
MEDIA
mEdIa
_media

It does not find _media_, media_, or ICP_MEDIA.

CodePudding user response:

I'm pretty sure someone with better regex-fu can provide a nicer solution, but this works for me for a selected set of values (see below):

grep -iwE "media|.*[\b_]media[\b_]*" file.txt
_media_
media
ICP_MEDIA

Values:

_media_
media
ICP_MEDIA
XXX_media_YYY
NOTMEDIA
NOT_MEDIAXX
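
One caveat about the pattern above, offered as a hedged note rather than a correction: inside a bracket expression POSIX treats the backslash as an ordinary character, so [\b_] is not a word boundary. It matches a backslash, the letter b, or an underscore. The sketch below uses a made-up input word, xbmedia, to show the resulting false positive, and compares it with an underscore-only variant of the pattern (my assumption of what was intended, not the answerer's code).

```shell
# "xbmedia" is a hypothetical test word, not one of the values from the thread.
# [\b_] accepts the letter "b", so the whole line matches as one word.
false_positive=$(printf '%s\n' xbmedia | grep -iwE 'media|.*[\b_]media[\b_]*')
printf 'bracket version matched: %s\n' "$false_positive"

# Underscore-only variant: no match for the same input.
# "|| true" keeps grep's "no match" exit status (1) from aborting scripts run with set -e.
underscore_only=$(printf '%s\n' xbmedia | grep -iwE 'media|.*_media_*' || true)
printf 'underscore version matched: "%s"\n' "$underscore_only"
```

For the values listed above, both patterns select the same three lines; they differ only on inputs that contain a b or a backslash directly before media.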

CodePudding user response:

I tried this on the example you gave:

   cat find | grep 'media'

and the result was this:

media
_media_
_media
media_

P.S. find is the name of the file I put your examples in.

CodePudding user response:

To answer your question: Why is grep not finding all matches?

-w, --word-regexp: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore. This option has no effect if -x is also specified.

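So the entry _media_ is matched by neither media nor *_media, for the following reasons:

  • _media_ is not a whole-word match for media: the underscores on either side are word-constituent characters, so media does not stand alone as a word.
  • _media_ is not a whole-word match for *_media either. In a regular expression, unlike in a shell glob, an asterisk at the beginning of a pattern has nothing to repeat, so it loses its special meaning and does not stand for "any characters". POSIX leaves a leading * undefined in extended regular expressions, and an implementation may treat it as a literal asterisk or ignore it. A literal * does not occur in _media_, and if the * is ignored the remaining _media is still followed by an underscore, so there is no whole-word match either way.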
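
To make the -w behaviour described above concrete, here is a minimal sketch. It assumes the goal is to treat the underscore as a separator rather than as part of a word, and it replaces -w with explicit boundaries built from the POSIX class [:alnum:]. The temporary file and the two extra words NOTMEDIA and NOT_MEDIAXX (borrowed from the first answer) are only there for illustration.

```shell
# Sample data: the words from the question plus two that should not match.
words=$(mktemp)
printf '%s\n' MEdIa media MEDIA mEdIa _media_ _media media_ ICP_MEDIA \
    NOTMEDIA NOT_MEDIAXX > "$words"

# -w counts "_" as a word character, so only the bare word "media" matches.
whole_word=$(grep -iw 'media' "$words")

# Explicit boundaries: "media" must be preceded by the start of the line or a
# character that is not a letter or digit, and followed by the end of the line
# or such a character. "_" is neither a letter nor a digit, so it counts.
delimited=$(grep -iE '(^|[^[:alnum:]])media([^[:alnum:]]|$)' "$words")

printf 'with -w:\n%s\n\n' "$whole_word"
printf 'with explicit boundaries:\n%s\n' "$delimited"

rm -f "$words"
```

With this input, the -w search returns the four case variants of media, while the second search also returns _media_, _media, media_ and ICP_MEDIA and still skips NOTMEDIA and NOT_MEDIAXX. Whether digits should also act as separators depends on your data; adjust the bracket expression accordingly.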