I have a pandas data frame with data like this...
df0
Customer | Product | Price |
---|---|---|
A | x | $12 |
B | y | $15 |
B | y | $10 |
C | x | $19 |
C | y | $9 |
D | x | $16 |
D | y | $8 |
D | y | $10 |
I need to separate this data set into two data frames:
One data frames where customers only have one product. Another data frame where customers have multiple products like this...
df1
Customer | Product | Price |
---|---|---|
A | x | $12 |
B | y | $15 |
B | y | $10 |
df2
Customer | Product | Price |
---|---|---|
C | x | $19 |
C | y | $9 |
D | x | $16 |
D | y | $8 |
D | y | $10 |
I then will do some analysis on prices as a follow-on step. I know how to filter on one column but I can't figure out a filter which needs to use both columns like this. Would anyone please help with the code which would split the data into two data frames? Many thanks
CodePudding user response:
If you use
df0.Product.groupby(df0.Customer).transform('nunique')
Will return the number of unique products for the customer in the row. So
df0[df0.Product.groupby(df0.Customer).transform('nunique') > 1]
will create a DataFrame with all those with more than one product, for example.