How to check for data from one file in another and remove the row with that data in second file in l-CodePudding

I have two files:

file1:

file2:

ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|

Now I have to search for the data from file1 in file2 and remove the rows that contain it in column 2.

For example, here the output should be:

fg|19|576|ik|

CodePudding user response：

awk -F '|' 'NR==FNR { a[$0]; next } !($2 in a)' file1 file2

Explanation:

-F '|' sets the input field separator to |
NR==FNR is true only while reading the first file (the total record number equals the per-file record number)
a[$0] creates an element in array a indexed by the whole input line
next skips further processing of the input line, so everything after it runs only for the second and any later files
!($2 in a) condition: the second field is not an index in array a; the implicit action is to print the line
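
To try the command end to end, here is a minimal sketch that recreates both files in a scratch directory. The contents of file1 are an assumption (the values 1 through 6, as in the sample shown in the other answer); the scratch directory and the variable name result are only for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1: the keys to remove, one per line (assumed contents).
printf '%s\n' 1 2 3 4 5 6 > file1

# file2: the pipe-delimited rows from the question.
cat > file2 <<'EOF'
ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|
EOF

# Keep only the rows of file2 whose second field is not a line of file1.
result=$(awk -F '|' 'NR==FNR { a[$0]; next } !($2 in a)' file1 file2)
printf '%s\n' "$result"
```

Because the lookup is an exact match on the whole line of file1, the key 1 does not remove the row whose second field is 19.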

CodePudding user response：

You can use rquery if file1 is not too large.

[ rquery]$ cat file1
1
2
3
4
5
6
[ rquery]$ cat file2
ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|
[ rquery]$ arr=`echo $(cat file1)|sed 's/ /,/g'`
[ rquery]$ ./rq -q "parse /([^|]*)\|(?P<key>[^|]*)\|([^\n]*)*/ |  filter key noin (${arr})" file2
fg|19|576|ik|

rquery can be downloaded from https://github.com/fuyuncat/rquery/releases
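
For comparison, the same idea (turn file1 into one list, then filter file2 on its second field) can be sketched with standard tools only. This is not part of the answer above; it assumes the keys in file1 contain no regular-expression metacharacters, and it reuses the assumed file1 contents 1 through 6.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample inputs (file1 contents are assumed).
printf '%s\n' 1 2 3 4 5 6 > file1
cat > file2 <<'EOF'
ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|
EOF

# Join the keys with "|" so they form a regex alternation: 1|2|3|4|5|6
keys=$(paste -sd'|' file1)

# Drop every row whose second field is exactly one of the keys.
# ^[^|]*\|  skips the first field and its delimiter;
# the trailing \| anchors the end of the second field.
result=$(grep -v -E "^[^|]*\|(${keys})\|" file2)
printf '%s\n' "$result"
```

As with the comma-separated list built for rquery, the pattern grows with file1, so this suits a small key file; the awk array approach scales better for large ones.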