I have two different DataFrames and I want to get a new DataFrame containing the rows that appear in only one of them.
DataFrame A:
Name Color
Jony Blue
Mike Red
Joanna Green
DataFrame B:
Name Color
Jony Blue
Mike Red
DataFrame Output:
Name Color
Joanna Green
How can I get this output DataFrame?
CodePudding user response:
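You could take the symmetric difference of the two sets of rows and build a new DataFrame from the result (row order is not guaranteed, since sets are unordered):
# Treat each row as a plain tuple so that Name and Color are compared together
diff = set(dfa.itertuples(index=False, name=None)).symmetric_difference(dfb.itertuples(index=False, name=None))
dfc = pd.DataFrame(list(diff), columns=dfa.columns)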
CodePudding user response:
One option is to do an outer merge with the indicator parameter set to True. The common rows are then flagged "both" in the _merge column, and since you don't want the common rows, you filter them out:
out = df1.merge(df2, how='outer', indicator=True).query('_merge!="both"').drop(columns='_merge')
Output:
Name Color
2 Joanna Green
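For reference, here is a self-contained sketch of the merge approach using the data from the question. The names df1 and df2 follow the answer above; merged and only_one_side are illustrative names. Keeping the _merge column instead of dropping it shows which frame each unmatched row came from. The index label of the remaining row may differ between pandas versions, so no output is shown.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Jony', 'Mike', 'Joanna'],
                    'Color': ['Blue', 'Red', 'Green']})
df2 = pd.DataFrame({'Name': ['Jony', 'Mike'],
                    'Color': ['Blue', 'Red']})

# Outer merge on all shared columns; _merge records where each row was found
merged = df1.merge(df2, how='outer', indicator=True)

# Keep only the rows found in exactly one of the two frames
only_one_side = merged[merged['_merge'] != 'both']
print(only_one_side)
```

Rows that exist only in df1 are flagged "left_only" and rows that exist only in df2 are flagged "right_only".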
CodePudding user response:
Using concat followed by drop_duplicates with keep=False, which drops every row that appears more than once:
import pandas as pd
dataA = {'Name':['Jony', 'Mike', 'Joanna'], 'Color':['Blue', 'Red', 'Green']}
dataB = {'Name':['Jony', 'Mike'], 'Color':['Blue', 'Red']}
dfA = pd.DataFrame(dataA)
dfB = pd.DataFrame(dataB)
df = pd.concat([dfA, dfB]).drop_duplicates(keep=False, ignore_index=True)
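One caveat with this approach may be worth checking: keep=False also drops a row that is repeated within a single frame, even when the other frame does not contain it. The sketch below reuses the question's data and adds a made-up 'Anna' row purely for illustration; dfA_dup, diff and diff_dup are hypothetical names.

```python
import pandas as pd

dfA = pd.DataFrame({'Name': ['Jony', 'Mike', 'Joanna'],
                    'Color': ['Blue', 'Red', 'Green']})
dfB = pd.DataFrame({'Name': ['Jony', 'Mike'],
                    'Color': ['Blue', 'Red']})

# keep=False removes every copy of a duplicated row
diff = pd.concat([dfA, dfB]).drop_duplicates(keep=False, ignore_index=True)
print(diff)

# Hypothetical frame where 'Anna' is listed twice; dfB has no 'Anna' at all
dfA_dup = pd.DataFrame({'Name': ['Anna', 'Anna'], 'Color': ['Pink', 'Pink']})
diff_dup = pd.concat([dfA_dup, dfB]).drop_duplicates(keep=False, ignore_index=True)
print(diff_dup)  # both 'Anna' rows are dropped, although dfB does not contain them
```

If the frames can contain internal duplicates, calling drop_duplicates() on each frame before concatenating is one way to avoid this.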
CodePudding user response:
Assuming the Name column is the index in both DataFrames (note that this compares index labels only, not the Color values):
df_a = pd.DataFrame({'Color': {'Jony': 'Blue', 'Mike': 'Red', 'Joanna': 'Green'}})
df_a = df_a.rename_axis('Name')
df_b = pd.DataFrame({'Color': {'Jony': 'Blue', 'Mike': 'Red'}})
df_b = df_b.rename_axis('Name')
df = pd.concat([df_a[~df_a.index.isin(df_b.index)], df_b[~df_b.index.isin(df_a.index)]])
print(df)
Output:
Color
Name
Joanna Green
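As a possible variation on the same idea, the two isin filters could be replaced by Index.symmetric_difference, which returns the labels present in exactly one index. This is a sketch under the same assumption that Name is the index and that labels are unique within each frame; labels is an illustrative name.

```python
import pandas as pd

df_a = pd.DataFrame({'Color': {'Jony': 'Blue', 'Mike': 'Red', 'Joanna': 'Green'}}).rename_axis('Name')
df_b = pd.DataFrame({'Color': {'Jony': 'Blue', 'Mike': 'Red'}}).rename_axis('Name')

# Labels that occur in exactly one of the two indexes
labels = df_a.index.symmetric_difference(df_b.index)

# Stack both frames and select only those labels
df = pd.concat([df_a, df_b]).loc[labels]
print(df)
```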