I have a patterns.txt file and I would like to remove all exact matches of its patterns from FILE.txt. FILE.txt contains the following:
word1 word2
word3 word4
word5 word6
The pattern file contains:
word1
word6
The expected output is:
word2
word3 word4
word5
The command below removes the whole row where there is an exact match. How can I only remove the exact match from a line without removing the whole line? I don't want to use for-loops to achieve this.
grep -wvf patterns.txt FILE.txt
CodePudding user response:
You may try this awk:
awk 'FNR == NR {pats[$1]; next} {more=0; for (i=1; i<=NF; ++i) if (!($i in pats)) printf "%s", (more++ ? OFS : "") $i; print ""}' patterns.txt file
word2
word3 word4
word5
A more readable version:
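awk '
FNR == NR {                 # first file: store each pattern as an array key
    pats[$1]
    next
}
{
    more = 0                # becomes non-zero once a field has been printed
    for (i=1; i<=NF; ++i)
        if (!($i in pats))
            printf "%s", (more++ ? OFS : "") $i
    print ""
}' patterns.txt file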
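To try this against the sample data from the question, here is a minimal self-contained sketch. The function name remove_words and the temporary directory are assumptions for illustration, and the line is built in a variable instead of printed field by field, which is only a stylistic variation on the same idea.

```shell
# Hypothetical helper: remove_words PATTERN_FILE INPUT_FILE
remove_words() {
    awk '
        FNR == NR { pats[$1]; next }   # first file: remember each pattern
        {
            out = ""
            for (i = 1; i <= NF; ++i)
                if (!($i in pats))     # keep only fields that are not patterns
                    out = (out == "" ? "" : out OFS) $i
            print out
        }
    ' "$1" "$2"
}

# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' word1 word6 > "$tmp/patterns.txt"
printf '%s\n' 'word1 word2' 'word3 word4' 'word5 word6' > "$tmp/FILE.txt"

result=$(remove_words "$tmp/patterns.txt" "$tmp/FILE.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that the FNR == NR test assumes the pattern file is not empty; with an empty pattern file the second file would be read as patterns instead.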
CodePudding user response:
With sed:
re=$(tr '\n' '|' < patterns.txt)
sed -E "s/\b($re)\b//g; s/^[[:space:]]*//; s/[[:space:]]*\$//" file
word2
word3 word4
word5
Note: Make sure patterns.txt does not have a trailing newline or blank lines, since a | will end up in each of those positions.
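One way to sidestep that caveat is to filter out blank lines and let paste do the joining. The sketch below is only an illustration under a few assumptions: GNU sed (for \b word boundaries), patterns that contain no regex metacharacters or slashes, and a scratch directory with made-up sample files.

```shell
tmp=$(mktemp -d)
# Pattern file with a blank line in the middle and a trailing newline.
printf 'word1\n\nword6\n' > "$tmp/patterns.txt"
printf '%s\n' 'word1 word2' 'word3 word4' 'word5 word6' > "$tmp/FILE.txt"

# Drop empty lines, then join the remaining patterns with "|".
re=$(grep -v '^[[:space:]]*$' "$tmp/patterns.txt" | paste -sd '|' -)

# Remove whole-word matches, squeeze runs of whitespace, trim both ends.
sed_out=$(sed -E "s/\\b($re)\\b//g; s/[[:space:]]+/ /g; s/^ //; s/ \$//" "$tmp/FILE.txt")
printf '%s\n' "$sed_out"
rm -rf "$tmp"
```

Because the patterns are interpolated into a regular expression, a pattern such as a.b would also match axb; the awk approach compares strings literally and does not have that issue.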
CodePudding user response:
To split each line using a word (string) as the delimiter, you can use awk:
awk -F 'word' '{print $1;print $2}' file.txt
If you only want to display what comes after the first delimiter, it would be:
awk -F 'word' '{print $2}' file.txt
If the delimiter needs to change from one pattern to the next, you would probably have to write a loop.
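To see what such a separator does to one line of the sample data, here is a small sketch. The bracket formatting and the variable name split_demo are only for illustration.

```shell
# With -F "word", every literal occurrence of "word" separates two fields.
# The line "word1 word2" therefore splits into "", "1 " and "2".
split_demo=$(printf 'word1 word2\n' |
    awk -F 'word' '{ printf "[%s][%s][%s]\n", $1, $2, $3 }')
printf '%s\n' "$split_demo"    # prints [][1 ][2]
```

So $2 is the text between the first and second occurrence of the separator, not the whole remainder of the line.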
CodePudding user response:
My first thought was to do just what @anubhava did. Then I thought that perl might be good for that: perl has good capabilities for filtering lists. The problem is that perl doesn't have an FNR variable. But I played around with a2p and came up with this:
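perl -lane '
    # Emulate the awk FNR variable: line number within the current file.
    $FNR = $. - $FNRbase;
    if ($. == $FNR) {
        # Still reading the first file: remember the pattern.
        $ignore{$F[0]} = 1;
    } else {
        print join " ", grep {not exists $ignore{$_}} @F;
    }
} continue {
    # -n wraps the code in a while loop; the brace above closes it
    # so that a continue block can record where each file ended.
    $FNRbase = $. if eof
' patterns.txt FILE.txt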