I want to compare two CSV files, file 1 and file 2, on the basis of a column. If a value in that column of file 1 matches a value in the same column of file 2, the entire row should be removed from file 1.
Example of file 1:
sr. no.,username,id
101,Berlin240,835070687
102,X_PSYCH_X,1271001789
103,xenoo369,570078204
104,xarat581,1665916522
105,xandy88,639040049
Example of file 2:
sr. no.,username,id
101,Berlin240,835070687
103,xenoo369,570078204
105,xandy88,639040049
After comparing against file 2 and removing every row of file 1 whose column value matches one in file 2,
file 1 should look like this:
sr. no.,username,id
102,X_PSYCH_X,1271001789
104,xarat581,1665916522
CodePudding user response:
Excel's VLOOKUP function can handle this kind of lookup. The documentation has some examples: https://support.microsoft.com/en-us/office/vlookup-function-0bbc8083-26fe-4963-8ab8-93a18ad188a1
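If you would rather script the same lookup idea than do it in a spreadsheet, here is a rough pandas sketch. It assumes "id" is the column to match on (the question does not say which one) and uses inline copies of the sample data instead of real files. The names flagged and unmatched are only illustrative.

```python
import io
import pandas as pd

# Inline copies of the sample files from the question (stand-ins for real CSV paths).
df1 = pd.read_csv(io.StringIO(
    "sr. no.,username,id\n"
    "101,Berlin240,835070687\n"
    "102,X_PSYCH_X,1271001789\n"
    "103,xenoo369,570078204\n"
    "104,xarat581,1665916522\n"
    "105,xandy88,639040049\n"
))
df2 = pd.read_csv(io.StringIO(
    "sr. no.,username,id\n"
    "101,Berlin240,835070687\n"
    "103,xenoo369,570078204\n"
    "105,xandy88,639040049\n"
))

# Assumption: "id" is the lookup column.
# A left join with indicator=True adds a "_merge" column that says whether
# each row of df1 found a partner in df2 ("both") or not ("left_only").
flagged = df1.merge(
    df2[["id"]].drop_duplicates(), on="id", how="left", indicator=True
)

# Keep only the rows of df1 that had no match in df2.
unmatched = flagged[flagged["_merge"] == "left_only"].drop(columns="_merge")
print(unmatched)
```

The "_merge" column plays the role of the VLOOKUP result: it marks which rows of file 1 were found in file 2.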
CodePudding user response:
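Solution:
import pandas as pd

# Load both CSV files
df1 = pd.read_csv("df1.csv")
df2 = pd.read_csv("df2.csv")
print(df1)
print(df2)

# Stack both frames, then drop every row that occurs more than once.
# Note: this compares whole rows, not a single column, and it assumes
# every row of df2 also appears in df1 and df1 has no repeated rows.
df_diff = pd.concat([df1, df2]).drop_duplicates(keep=False)
print(df_diff)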
df1
   sr. no.   username          id
0      101  Berlin240   835070687
1      102  X_PSYCH_X  1271001789
2      103   xenoo369   570078204
3      104   xarat581  1665916522
4      105    xandy88   639040049
df2
   sr. no.   username          id
0      101  Berlin240   835070687
1      103   xenoo369   570078204
2      105    xandy88   639040049
df_diff
   sr. no.   username          id
1      102  X_PSYCH_X  1271001789
3      104   xarat581  1665916522
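Since the question asks for matching on a column and not on whole rows, a variant using Series.isin might be closer to what is wanted. This is only a sketch: it assumes "id" is the column to compare (pick whichever applies), and it reads the sample data from inline strings so it runs as-is. The names key and remaining are made up for the example.

```python
import io
import pandas as pd

# Inline copies of the question's sample files; replace with real paths,
# e.g. pd.read_csv("df1.csv"), when working with actual files.
df1 = pd.read_csv(io.StringIO(
    "sr. no.,username,id\n"
    "101,Berlin240,835070687\n"
    "102,X_PSYCH_X,1271001789\n"
    "103,xenoo369,570078204\n"
    "104,xarat581,1665916522\n"
    "105,xandy88,639040049\n"
))
df2 = pd.read_csv(io.StringIO(
    "sr. no.,username,id\n"
    "101,Berlin240,835070687\n"
    "103,xenoo369,570078204\n"
    "105,xandy88,639040049\n"
))

key = "id"  # assumption: the column to match on

# isin() builds a boolean mask that is True where df1's key appears in df2;
# "~" inverts it, so only the rows without a match are kept.
remaining = df1[~df1[key].isin(df2[key])]
print(remaining)

# To write the result back over file 1 (left commented out on purpose):
# remaining.to_csv("df1.csv", index=False)
```

Unlike concat plus drop_duplicates, this never pulls rows from file 2 into the result, and rows that differ in other columns but share the same key are still removed.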