Home > Software design >  Can I grep to find matching rows in 2 files using grep cut but then print back the rest of the row a
Can I grep to find matching rows in 2 files using grep cut but then print back the rest of the row a

Time:10-11

I am looking to find matches anywhere between 2 csv files, email addresses specifically. If I run this command via bash (I would like to figure out something via bash for now):

grep -w --file <(sort -u file1.csv) <(sort -u file2.csv| cut d',' -f2)

it works. But there is surrounding character data in the same row that I would like to print out either to stdin or to a new file eventually. How do I do this diff comparison and still preserve the row data for printing?

file1.csv consists of:
[email protected]

file2.csv consists of:
,[email protected],123456,"""ABC, Blah Blah."""

I would like to do my comparison but join the ABC, Blah Blah data back to the final result.

Thank you!

CodePudding user response:

rquery (https://github.com/fuyuncat/rquery/releases) can handle the content of multiple files like SQL query.

[ rquery]$ cat samples/email1.csv
[email protected]
[email protected]
[ rquery]$ cat samples/email2.csv
,[email protected],123456,"""ABC, Blah Blah."""
,[email protected],123456,"""123, Blah Blah."""
,[email protected],123456,"""CBA, Blah Blah."""
,[email protected],123456,"""BBB, Blah Blah."""
,[email protected],123456,"""CCC, Blah Blah."""
[ rquery]$ ./rq -q "p d/,/\"\"/|m @1 where @fileid=1|s @raw | f @fileid=2 and @2 = @r[1][1]" samples/email1.csv samples/email2.csv
,[email protected],123456,"""ABC, Blah Blah."""
,[email protected],123456,"""CBA, Blah Blah."""

CodePudding user response:

When trying to join table data on indexes, SQLite3 is a nice tool to have as it is capable of directly importing CSV datafiles.

Here is an example:

file1.csv:

[email protected]
[email protected]

file2.csv:

[email protected],123456,"""ABC, Blah Blah."""
[email protected],7684652,"Foo, bar, baz."
[email protected],424255,"Hello kitten"
[email protected],6578218,"This to be output"
#! /bin/sh

sqlite3 :memory: << 'EOF'
CREATE TABLE emails (
    email text NOT NULL PRIMARY KEY
);

CREATE TABLE contacts (
  email text NOT NULL PRIMARY KEY,
  phone text,
  description text
);

.mode csv
.import file1.csv emails
.import file2.csv contacts

SELECT c.*
FROM emails AS e
LEFT JOIN contacts AS c
  ON e.email = c.email;
EOF

Output from this program above:

[email protected],6578218,"This to be output"
[email protected],7684652,"Foo, bar, baz."

If your CSV files have column headers, then you can let SQLite3 handle tables creation for you:

#! /bin/sh

sqlite3 :memory: << 'EOF'
.mode csv
.headers on
.import file1.csv emails
.import file2.csv contacts

SELECT c.*
FROM emails AS e
LEFT JOIN contacts AS c
  ON e.email = c.email;
EOF
  • Related