I have two text files with different contents. I would like to find the values they have in common by comparing the two files. This is just an example of the data. Note that file2.txt contains two columns.
file1.txt contains:
12XVGAS4RJQ3wZopCc7bvjRSjHBrRR9bmw
12XVGsHANa9s4Szkmk73nTC5vTJHdFfx7w
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu
file2.txt contains:
125jHr5Gu4frTE3vqqf7w826wAGbvwUbo2 300
125JHs2AGKNuiSe7LGhVXEe4p6pasXiVme 100
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93
This command does not work, because comm compares whole lines and the lines in file2.txt have a second column:
comm -12 <(sort file1.txt) <(sort file2.txt)
I would like to extract the common values and get this output:
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93
How can I get this output?
CodePudding user response:
Use join:
join <(sort file1.txt) <(sort file2.txt)
Output:
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93
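One thing to watch with join: it expects both inputs to be sorted on the join field in the same collation order that join itself uses, so mixed-case keys like these can trip it up in some locales. A minimal sketch that pins the locale to C for both sort and join, and avoids process substitution so it also runs in a plain POSIX sh. The scratch directory and the names file1.sorted, file2.sorted and common.txt are made up for this illustration:

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
cat > file1.txt <<'EOF'
12XVGAS4RJQ3wZopCc7bvjRSjHBrRR9bmw
12XVGsHANa9s4Szkmk73nTC5vTJHdFfx7w
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu
EOF
cat > file2.txt <<'EOF'
125jHr5Gu4frTE3vqqf7w826wAGbvwUbo2 300
125JHs2AGKNuiSe7LGhVXEe4p6pasXiVme 100
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93
EOF

# Sort both files on the first field using byte-wise (C locale) ordering.
LC_ALL=C sort -k1,1 file1.txt > file1.sorted
LC_ALL=C sort -k1,1 file2.txt > file2.sorted

# Join on the first field (the default) under the same locale.
LC_ALL=C join file1.sorted file2.sorted > common.txt
cat common.txt
```

With the sample data this should print the same three lines as above. Note that the output comes out in sorted key order, not in the original order of file2.txt.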
CodePudding user response:
Assuming that, as in the example you provided, neither file has duplicate first-field values, then using any awk:
$ awk 'NR==FNR{a[$1]; next} $1 in a' file1.txt file2.txt
12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900
12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000
19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93
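In case the one-liner is hard to read, here is the same logic spread over several lines with comments, as a sketch run against the sample data from the question. The array name seen, the variable common and the scratch directory are just names picked for this illustration:

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' \
    12XVGAS4RJQ3wZopCc7bvjRSjHBrRR9bmw \
    12XVGsHANa9s4Szkmk73nTC5vTJHdFfx7w \
    12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X \
    12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR \
    19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu > file1.txt
printf '%s\n' \
    '125jHr5Gu4frTE3vqqf7w826wAGbvwUbo2 300' \
    '125JHs2AGKNuiSe7LGhVXEe4p6pasXiVme 100' \
    '12XVGwB6c72mmQCqEwCQtbuKmStw5RqW3X 900' \
    '12XVHEx5yorWhjxzFHMBW1ynPVCNwWfiDR 1000' \
    '19vLAtK2PivKYB1ZT1J7dykw3rYga4SoVu 0.93' > file2.txt

common=$(awk '
    # NR == FNR is true only while awk is reading the first file:
    # remember each key from file1.txt, then skip to the next line.
    NR == FNR { seen[$1]; next }

    # For file2.txt, print the whole line if its first field was seen.
    $1 in seen
' file1.txt file2.txt)

printf '%s\n' "$common"
```

Unlike join, this needs no sorting and keeps the lines in the order they appear in file2.txt. One caveat: if file1.txt is empty, NR == FNR is also true for file2.txt, so nothing is printed.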