grep remove exact matches from line without removing the whole line

Time:10-27

I have a patterns.txt file and I would like to remove every exact match of its patterns from FILE.txt. FILE.txt contains the following:

word1 word2
word3 word4
word5 word6

The pattern file contains:

word1
word6

The expected output is:

word2
word3 word4
word5

The command below removes every line that contains an exact match. How can I remove only the matching word from a line and keep the rest of the line? I don't want to use for-loops to achieve this.

grep -wvf patterns.txt FILE.txt

CodePudding user response:

You may try this awk:

awk 'FNR == NR {pats[$1]; next} {more=0; for (i=1; i<=NF; ++i) if (!($i in pats)) printf "%s", (more++ ? OFS : "") $i; print ""}' patterns.txt file

word2
word3 word4
word5

A more readable version:

awk '
FNR == NR {
   pats[$1]
   next
}
{
   more = 0
   for (i=1; i<=NF; ++i)
      if (!($i in pats))
         printf "%s", (more++ ? OFS : "") $i
   print ""
}' patterns.txt file
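
To try the awk approach end to end, here is a minimal sketch that recreates the sample files from the question in a temporary directory and wraps the filter in a shell function. The function name strip_words and the temporary file layout are assumptions made for illustration; they are not part of the answer above.

```shell
#!/bin/sh
# strip_words PATTERNS FILE: print FILE with every field that exactly
# matches a line of PATTERNS removed (hypothetical helper name).
strip_words() {
    awk '
    FNR == NR { pats[$1]; next }   # first file: remember each pattern
    {
        more = 0                   # number of fields printed on this line
        for (i = 1; i <= NF; ++i)
            if (!($i in pats))
                printf "%s", (more++ ? OFS : "") $i
        print ""                   # end the output line
    }' "$1" "$2"
}

# Recreate the sample data from the question and run the filter.
tmp=$(mktemp -d)
printf '%s\n' 'word1 word2' 'word3 word4' 'word5 word6' > "$tmp/FILE.txt"
printf '%s\n' word1 word6 > "$tmp/patterns.txt"
strip_words "$tmp/patterns.txt" "$tmp/FILE.txt"
rm -rf "$tmp"
```

On the sample data this prints word2, word3 word4 and word5 on separate lines. Two caveats: a line whose words are all removed is printed as an empty line, and if patterns.txt is empty, FNR == NR is also true for the data file, so nothing is printed.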

CodePudding user response:

With sed:

re=$(tr '\n' '|' < patterns.txt)
sed -r "s/$re//; s/^[[:space:]]*//" file
word2
word3 word4
word5 

Note: Make sure patterns.txt has no trailing newline and no blank lines. Otherwise tr puts a | in each of those positions, which creates an empty alternative in the regex.
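
A somewhat more forgiving variant of the sed idea is sketched below. It assumes GNU sed (for -E together with the \< and \> word boundaries), at least one pattern in patterns.txt, and patterns made only of plain word characters, so no regex escaping is attempted. The function name remove_words is hypothetical. Compared with the command above, it skips blank pattern lines, matches whole words only, removes every match on a line, and tidies the leftover whitespace.

```shell
#!/bin/sh
# remove_words PATTERNS FILE (hypothetical helper name).
# Assumes GNU sed and patterns that contain only word characters.
remove_words() {
    # Drop blank lines, then join the remaining patterns with "|".
    re=$(grep -v '^[[:space:]]*$' "$1" | paste -sd '|' -)
    # Delete whole-word matches everywhere on the line, collapse runs of
    # whitespace to one space, then trim a leading and a trailing space.
    sed -E "s/\<($re)\>//g; s/[[:space:]]+/ /g; s/^ //; s/ \$//" "$2"
}

# Sample data from the question, plus a blank line in the pattern file.
tmp=$(mktemp -d)
printf '%s\n' 'word1 word2' 'word3 word4' 'word5 word6' > "$tmp/FILE.txt"
printf 'word1\n\nword6\n' > "$tmp/patterns.txt"
remove_words "$tmp/patterns.txt" "$tmp/FILE.txt"
rm -rf "$tmp"
```

Collapsing whitespace changes the original spacing of the line, so drop the second substitution if the spacing has to be preserved.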

CodePudding user response:

To split each line using a word (a string) as the delimiter, you can use awk:

awk -F 'word' '{print $1;print $2}' file.txt

To display only what comes after the delimiter, use:

awk -F 'word' '{print $2}' file.txt

To apply several different patterns in turn, you would probably have to write a loop.

CodePudding user response:

My first thought was to do just what @anubhava did. Then I thought that Perl might be a good fit, since it has good capabilities for filtering lists. The problem is that Perl doesn't have an FNR variable. But I played around with a2p and came up with this:

perl -lane '
    $FNR = $. - $FNRbase;
    if ($. == $FNR) {
      $ignore{$F[0]} = 1;
    } else {
      print join " ", grep {not exists $ignore{$_}} @F;
    }
  } continue {
    $FNRbase = $. if eof
' patterns.txt FILE.txt