Find patterns from one file in another file

Time:05-03

I'm trying to find patterns from one file in another file.

The patterns in file one look something like this:

ENSG00000203875.13
ENSG00000262691.1
ENSG00000254911.3

File two contains:

ENSG00000203875.13 aa aaa bbb cc
ENSG00000227782.2
ENSG00000229582.3
ENSG00000241769.7
ENSG00000245904.4
ENSG00000254823.2
ENSG00000254911.3 cc ccc ccc
ENSG00000260213.6
ENSG00000260997.1
ENSG00000261799.1
ENSG00000262691.1 bbb bbb bbb
ENSG00000267249.1
ENSG00000270012.1
ENSG00000270091.1
ENSG00000270361.1
ENSG00000271533.1
ENSG00000271833.1
ENSG00000271870.1
ENSG00000272379.1
ENSG00000272631.1
ENSG00000273066.5
ENSG00000273855.1
ENSG00000278966.2
ENSG00000279332.1
ENSG00000279407.1
ENSG00000279616.1
ENSG00000279684.1
ENSG00000279835.1
ENSG00000286181.1
ENSG00000286986.1
ENSG00000287817.1

I'm trying to find only

ENSG00000203875.13 aa aaa bbb cc
ENSG00000254911.3 cc ccc ccc
ENSG00000262691.1 bbb bbb bbb

as output. I'm pretty sure grep -f file_one.txt file_two.txt should do the job, but instead I just get the content of file_two as output. I don't know what mistake I'm making. Can anyone point it out?

CodePudding user response:

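I'd do something like the loop below. Reading the file with while read avoids word splitting, skipping empty lines keeps an empty pattern from matching everything, and -F matches each ID literally instead of as a regular expression:

while IFS= read -r id; do [ -n "$id" ] && grep -iF -- "$id" file_two.txt; done < file_one.txt

Output
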
ENSG00000203875.13 aa aaa bbb cc
ENSG00000262691.1 bbb bbb bbb
ENSG00000254911.3 cc ccc ccc

CodePudding user response:

You might consider an awk approach: store each line of file_one.txt as a key in array a, then check whether the first column of each line in file_two.txt is present among the keys of the array:

awk 'NR==FNR {a[$0]; next} $1 in a' file_one.txt file_two.txt

Output

ENSG00000203875.13 aa aaa bbb cc
ENSG00000254911.3 cc ccc ccc
ENSG00000262691.1 bbb bbb bbb
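The awk one-liner above compares whole lines of file_one.txt with the first column of file_two.txt, so a stray carriage return or blank line in the ID list would silently break the match. The following is only a sketch of a more tolerant variant. It assumes Windows-style line endings and blank lines are possible, which the question does not confirm. The function name match_ids and the temporary sample files are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file two ($2) whose first column
# is listed in file one ($1).
match_ids() {
    awk '
        { sub(/\r$/, "") }                    # drop a trailing carriage return, if any
        NR == FNR { if (NF) ids[$1]; next }   # first file: remember each non-blank ID
        $1 in ids                             # second file: print when column 1 is a known ID
    ' "$1" "$2"
}

# Small demonstration with sample data taken from the question.
# The ID list deliberately contains CRLF endings and a blank line.
tmpdir=$(mktemp -d)
printf 'ENSG00000203875.13\r\n\nENSG00000254911.3\r\n' > "$tmpdir/file_one.txt"
printf 'ENSG00000203875.13 aa aaa bbb cc\nENSG00000227782.2\nENSG00000254911.3 cc ccc ccc\n' > "$tmpdir/file_two.txt"

result=$(match_ids "$tmpdir/file_one.txt" "$tmpdir/file_two.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Because the lookup is an exact comparison on the first column, an ID such as ENSG00000262691.1 cannot accidentally match ENSG00000262691.12, which a substring search could do.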

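Another option using grep. Adding -F treats each ID as a fixed string (otherwise the . matches any character), and -w keeps ENSG00000262691.1 from matching inside a longer ID such as ENSG00000262691.12:

grep -Fwf file_one.txt file_two.txt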
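If a grep -f command prints all of file_two.txt, as described in the question, one common cause is a blank line in the pattern file: an empty pattern matches every line. The thread does not confirm that this is what happened, so the following is only a sketch under that assumption. It cleans the pattern list first (also removing carriage returns) and then matches with fixed strings. The function name grep_ids and the sample files are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: match the IDs in file one ($1) against file two ($2)
# after removing carriage returns and blank lines from the ID list.
grep_ids() {
    patterns=$(mktemp)
    # tr strips carriage returns; grep -v drops empty or whitespace-only lines
    tr -d '\r' < "$1" | grep -v '^[[:space:]]*$' > "$patterns"
    # -F: fixed strings, -w: whole-word matches only, -f: read patterns from file
    grep -Fw -f "$patterns" "$2"
    rc=$?
    rm -f "$patterns"
    return "$rc"
}

# Small demonstration: the ID list contains a blank line on purpose.
tmpdir=$(mktemp -d)
printf 'ENSG00000203875.13\n\nENSG00000262691.1\n' > "$tmpdir/file_one.txt"
printf 'ENSG00000203875.13 aa aaa bbb cc\nENSG00000227782.2\nENSG00000262691.1 bbb bbb bbb\nENSG00000262691.12\n' > "$tmpdir/file_two.txt"

cleaned=$(grep_ids "$tmpdir/file_one.txt" "$tmpdir/file_two.txt")
printf '%s\n' "$cleaned"

rm -rf "$tmpdir"
```

To check whether a pattern file has this problem before cleaning it, grep -c '^$' file_one.txt reports the number of empty lines it contains.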