I have a dataframe (G) whose columns are “Client” and “TIV”.
I have another dataframe (B) whose columns are “Client”, “TIV”, “A”, “B”, “C”.
I want to select all rows from B whose clients are not in G. In other words, if there is a row in B whose Client also exists in G, then I want to delete it.
I did this:
x = B[B['Client'] != G['Client']]
But it raised an error saying “Can only compare identically-labeled Series objects”.
I appreciate your help.
CodePudding user response:
You can use isin combined with the ~ (negation) operator:
B[~B.Client.isin(G.Client)]
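To make this concrete, here is a small self-contained sketch. The sample values in G and B are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
G = pd.DataFrame({"Client": ["a", "b", "c"], "TIV": [10, 20, 30]})
B = pd.DataFrame({
    "Client": ["a", "d", "b", "e"],
    "TIV": [1, 2, 3, 4],
    "A": [0, 0, 0, 0],
    "B": [1, 1, 1, 1],
    "C": [2, 2, 2, 2],
})

# isin tests each value in B["Client"] for membership in G["Client"],
# so the two frames do not need matching indexes or lengths.
mask = B["Client"].isin(G["Client"])

# ~ inverts the boolean mask: keep only rows whose Client is NOT in G.
x = B[~mask]
print(x)
```

The original != comparison fails because it compares the two Series element by element, and pandas requires both sides to have identical index labels. isin performs a membership test instead, so the shapes of G and B do not matter.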
CodePudding user response:
Maybe the following code snippet helps:
import pandas as pd

df1 = pd.DataFrame(data={'Client': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame(data={'Client': [1, 2, 3, 6, 7]})
# Identify which clients are in df1 but not in df2
clients_diff = set(df1.Client).difference(df2.Client)
# Keep only the rows of df1 whose Client is in that difference
df1.loc[df1.Client.isin(clients_diff)]
The idea is to filter df1 down to the clients that are not in df2.
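The same filter can also be written as an anti-join with merge. This is only an alternative sketch: it assumes a left merge with indicator=True, which adds a _merge column marking where each row came from, and the name only_in_df1 is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"Client": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Client": [1, 2, 3, 6, 7]})

# Left-merge on Client; drop duplicate keys from df2 first so that
# rows of df1 are not repeated. indicator=True adds a "_merge" column.
merged = df1.merge(
    df2[["Client"]].drop_duplicates(),
    on="Client",
    how="left",
    indicator=True,
)

# "left_only" marks rows whose Client appears in df1 but not in df2.
only_in_df1 = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_df1)
```

For a single key column the isin approach is shorter; the merge form may be handier when rows have to match on several columns at once.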