I have two files:
file1:
1
2
3
4
5
6
file2:
ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|
Now I have to look up the values from file1 in column 2 of file2 and remove the rows where a match is found.
For example, with the data above the output should be:
fg|19|576|ik|
CodePudding user response:
awk -F '|' 'NR==FNR { a[$0]; next } !($2 in a)' file1 file2
Explanation:
-F '|'
sets the field separator for input
NR==FNR
matches the first file only (total record number equals per-file record number)
a[$0]
adds the input line as a valid index to array a
next
skips further processing of the input line, so everything after this is only done for the second (through last) file(s)
!($2 in a)
condition: the second field is not a valid index in array a, with the implicit action print
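To try the command end to end, here is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the same awk one-liner on them. The scratch directory and the variable name result are only there for illustration; they are not part of the original answer.

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the sample inputs from the question.
printf '%s\n' 1 2 3 4 5 6 > file1
printf '%s\n' 'ab|1|234|ks|' 'fg|6|567|fg|' 'fg|19|576|ik|' 'sd|3|879|jh|' > file2

# First pass (file1): store every line as a key of array a.
# Second pass (file2): print only rows whose 2nd field is not a key of a.
result=$(awk -F '|' 'NR==FNR { a[$0]; next } !($2 in a)' file1 file2)

echo "$result"   # prints: fg|19|576|ik|
```

The rows with 1, 6 and 3 in column 2 are dropped because those values appear in file1; 19 does not, so that row is kept.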
CodePudding user response:
You can use rquery if file1 is not too large.
[ rquery]$ cat file1
1
2
3
4
5
6
[ rquery]$ cat file2
ab|1|234|ks|
fg|6|567|fg|
fg|19|576|ik|
sd|3|879|jh|
[ rquery]$ arr=`echo $(cat file1)|sed 's/ /,/g'`
[ rquery]$ ./rq -q "parse /([^|]*)\|(?P<key>[^|]*)\|([^\n]*)*/ | filter key noin (${arr})" file2
fg|19|576|ik|
rquery can be downloaded from https://github.com/fuyuncat/rquery/releases
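The arr assignment in the transcript turns file1 into a comma-separated list before it is handed to the filter. The sketch below shows only that step, using standard tools and no rquery, so the value can be inspected on its own. The second variable, arr2, is an illustrative alternative built with paste and is not part of the original answer.

```shell
# Work in a scratch directory and recreate file1 from the question.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 1 2 3 4 5 6 > file1

# Same construction as in the transcript: the unquoted $(cat file1) is
# word-split, echo joins the words with spaces, sed turns spaces into commas.
arr=$(echo $(cat file1) | sed 's/ /,/g')

# Alternative (an assumption, not from the answer): paste -s joins all
# lines of file1 into one line, using the delimiter given with -d.
arr2=$(paste -sd, file1)

echo "$arr"    # prints: 1,2,3,4,5,6
echo "$arr2"   # prints: 1,2,3,4,5,6
```

Because the whole list is expanded onto the command line, this approach is practical only while file1 stays reasonably small, which matches the caveat given in the answer.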