Selecting Rows Based On Specific Condition In Python Pandas Dataframe

Time:10-20

So I am new to using Python Pandas dataframes.

I have a dataframe with one column holding customer IDs and the other holding flavors and satisfaction scores. It looks something like this:

[Image: data frame]

Although each customer should have 6 rows, Customer 1 only has 5. How do I create a new dataframe that contains only the customers who have exactly 6 rows?

I tried df['Customer No'].value_counts() == 6, but it does not work.

CodePudding user response:

Here is one way to do it.

If you post the data as code (preferably) or as text, I will be able to share the result.

# create a temporary column 'c' by grouping on 'Customer No'
# and assigning each row its group's count using transform;
# finally, use loc to select the rows whose count equals 6

(df.loc[df.assign(
    c=df.groupby(['Customer No'])['Customer No']
    .transform('count'))['c'].eq(6)]
)
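
Since the original data was only posted as an image, here is a minimal sketch of the same idea on a small made-up frame. The 'Flavor' and 'Score' column names and all the values are assumptions for illustration, not the asker's data. It also shows why the attempt in the question does not filter rows: value_counts() returns one entry per customer, indexed by customer ID, so comparing it to 6 gives a per-customer mask rather than a per-row one. That mask can still be used through isin.

```python
import pandas as pd

# Hypothetical sample data: customer 1 has 5 rows, customer 2 has 6.
df = pd.DataFrame({
    "Customer No": [1] * 5 + [2] * 6,
    "Flavor": list("ABCDE") + list("ABCDEF"),   # assumed column name
    "Score": [3, 4, 5, 2, 1, 5, 4, 3, 2, 1, 5],  # assumed column name
})

# transform('count') returns one value per row (the size of that row's group),
# aligned with df's index, so it can be used directly as a row mask.
sizes = df.groupby("Customer No")["Customer No"].transform("count")
complete = df[sizes.eq(6)]

# Alternative built on the attempt from the question:
# value_counts() is indexed by customer ID, so pick the IDs with
# exactly 6 rows and keep the rows whose ID is in that set.
counts = df["Customer No"].value_counts()
complete_alt = df[df["Customer No"].isin(counts[counts == 6].index)]

print(complete)
```

Both versions keep the original row order and index; use reset_index(drop=True) on the result if a fresh index is wanted.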