I want to compare two CSV files, file 1 and file 2, on the basis of a column: if a value in that column of file 1 also appears in the same column of file 2, the entire row should be removed from file 1.

Example of file 1

sr. no.,username,id
101,Berlin240,835070687
102,X_PSYCH_X,1271001789
103,xenoo369,570078204
104,xarat581,1665916522
105,xandy88,639040049

Example of file 2

sr. no.,username,id
101,Berlin240,835070687
103,xenoo369,570078204
105,xandy88,639040049

Now, comparing against file 2, I want to remove every row of file 1 whose column value matches a row in file 2.

File 1 should then look like this:

sr. no.,username,id
102,X_PSYCH_X,1271001789
104,xarat581,1665916522

CodePudding user response：

Check out the VLOOKUP function, there are some examples here: https://support.microsoft.com/en-us/office/vlookup-function-0bbc8083-26fe-4963-8ab8-93a18ad188a1

CodePudding user response：

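Solution:

import pandas as pd

# Load both files (saved here as df1.csv and df2.csv)
df1 = pd.read_csv("df1.csv")
df2 = pd.read_csv("df2.csv")

print(df1)
print(df2)

# Stack both frames, then drop every row that occurs more than once.
# What remains are the rows found in only one of the two files, so this
# gives the wanted result when every row of file 2 also exists in file 1.
df_diff = pd.concat([df1, df2]).drop_duplicates(keep=False)

print(df_diff)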

df1

   sr. no.   username          id
0      101  Berlin240   835070687
1      102  X_PSYCH_X  1271001789
2      103   xenoo369   570078204
3      104   xarat581  1665916522
4      105    xandy88   639040049

df2

   sr. no.   username         id
0      101  Berlin240  835070687
1      103   xenoo369  570078204
2      105    xandy88  639040049

df_diff

   sr. no.   username          id
1      102  X_PSYCH_X  1271001789
3      104   xarat581  1665916522
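
Note that the concat/drop_duplicates approach compares whole rows and would also keep rows that exist only in file 2. If the match should be on a single column, as the question describes, a filter with isin may be closer to what is wanted. This is only a minimal sketch: the helper name remove_matching_rows is made up for illustration, the choice of "id" as the key column is an assumption, and the sample data is read from strings instead of real files so the snippet runs on its own.

```python
import io

import pandas as pd

# Sample data copied from the question; in practice these would be
# pd.read_csv("file1.csv") and pd.read_csv("file2.csv").
FILE1 = """sr. no.,username,id
101,Berlin240,835070687
102,X_PSYCH_X,1271001789
103,xenoo369,570078204
104,xarat581,1665916522
105,xandy88,639040049
"""

FILE2 = """sr. no.,username,id
101,Berlin240,835070687
103,xenoo369,570078204
105,xandy88,639040049
"""


def remove_matching_rows(df1, df2, column):
    """Return the rows of df1 whose value in `column` is absent from df2[column].

    Hypothetical helper, not part of pandas.
    """
    return df1[~df1[column].isin(df2[column])]


df1 = pd.read_csv(io.StringIO(FILE1))
df2 = pd.read_csv(io.StringIO(FILE2))

# "id" is assumed to be the column to compare on.
result = remove_matching_rows(df1, df2, "id")
print(result)
```

To update file 1 on disk, the result could then be written back with result.to_csv("file1.csv", index=False), where the file name is again just a placeholder.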