I have 2 data frames that look like this
Df1
City Code ColA Col..Z
LA LAA
LA LAB
LA LAC
Df2
Code ColA Col..Z
LA LAA
NY NYA
CH CH1
What I'm trying to do have the result of
df3
Code ColA Col..Z
NY NYA
CH CH1
Normally I would loop through each row in df2 and say:
Df3 = If df2.row['Code'] in df1 then drop it.
But I want to find a pythonic pandas way to do it instead of looping through the dataframe. I was looking at examples using joins or merging but I cant seem to work it out.
CodePudding user response:
To keep only the different items in df2
based on the code
column, you can do something like this, using drop_duplicates
:
df2[df2.code.isin(
# the different values in df2's 'code' column
pd.concat([df1.code, df2.code]).drop_duplicates(keep=False)
)]
CodePudding user response:
This Df3 = If df2.row['Code'] in df1 then drop it. translates to
df3 = df2[~df2['Code'].isin(df1['City'] ]
CodePudding user response:
df2 = pd.read_clipboard()
df2
df1.compare(df2).drop('self', axis=1, level=1).droplevel(1, axis=1)
(And I'm making an assumption you had a typo in your dataframes with the City col missing from df2?)