Subset of columns from another data frame

I have a dataframe (G) whose columns are “Client” and “TIV”.

I have another dataframe (B) whose columns are “Client”, “TIV”, “A”, “B”, “C”.

I want to select all rows from B whose clients are not in G. In other words, if there is a row in B whose Client also exists in G, then I want to delete it.

I did this:

x = B[B['Client'] != G['Client']]

But it raised an error saying "Can only compare identically-labeled Series objects".

I appreciate your help.

CodePudding user response：

You can use Series.isin combined with the ~ operator, which inverts the boolean mask:

B[~B.Client.isin(G.Client)]
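As a minimal sketch of how this behaves, the one-liner above can be split into two steps. The values below are made-up toy data; only the frame names (B, G) and column names come from the question, and the intermediate name in_g is just for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
G = pd.DataFrame({"Client": ["a", "b"], "TIV": [100, 200]})
B = pd.DataFrame({
    "Client": ["a", "c", "b", "d"],
    "TIV": [100, 300, 200, 400],
    "A": [1, 2, 3, 4],
    "B": [5, 6, 7, 8],
    "C": [9, 10, 11, 12],
})

# True where the row's Client also appears somewhere in G
in_g = B["Client"].isin(G["Client"])

# ~ inverts the mask, keeping only rows whose Client is absent from G
x = B[~in_g]
print(x)
```

Unlike the != comparison, isin tests membership by value, so B and G do not need the same length or index.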

CodePudding user response：

Maybe the following code snippet helps:

import pandas as pd

df1 = pd.DataFrame(data={'Client': [1,2,3,4,5]})
df2 = pd.DataFrame(data={'Client': [1,2,3,6,7]})
# Identify which clients are in df1 and not in df2
clients_diff = set(df1.Client).difference(df2.Client)
df1.loc[df1.Client.isin(clients_diff)]

The idea is to filter df1 down to the clients that are not in df2.
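If this is needed in more than one place, the same set-difference idea could be wrapped in a small helper. This is only a sketch: the function name rows_not_in and its key parameter are hypothetical, not part of pandas.

```python
import pandas as pd

def rows_not_in(left, right, key="Client"):
    """Return the rows of `left` whose `key` value does not occur in `right`."""
    # Values present in left[key] but absent from right[key]
    missing = set(left[key]).difference(right[key])
    return left.loc[left[key].isin(missing)]

df1 = pd.DataFrame({"Client": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Client": [1, 2, 3, 6, 7]})

# Keeps the rows of df1 for clients 4 and 5
result = rows_not_in(df1, df2)
print(result)
```

For the frames in the question, the equivalent call would be rows_not_in(B, G).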