Unable to apply a where clause properly to a Python pandas DataFrame


Hi, I have the following data frame:

S.No   Description    amount
1      a, b, c        100
2      a, c           50
3      b, c           80
4      b, d           90
5      a              150

I want to extract only the values of 'a', for example.

expected answer:

Description   amount
a             100
a             50
a             150

and sum them up as

Description   amount
a             300

But I am getting this answer:

       Description   amount
1      a             100
2      a             50
3      nan           nan
4      nan           nan
5      a             150

Please guide me on how to properly use the where clause on pandas DataFrames.


filter = new_df["Description"] == "a"
new_df.where(filter, inplace=True)
print(new_df)

CodePudding user response:

Use df.assign, Series.str.split, df.explode, df.query and Groupby.sum:

In [703]: df_a = df.assign(Description=df.Description.str.split(', ')).explode('Description').query('Description == "a"')

In [704]: df_a
   S.No Description  amount
0     1           a     100
1     2           a      50
4     5           a     150

In [706]: df_a.groupby('Description')['amount'].sum().reset_index()
  Description  amount
0           a     300

Or as a one-liner:

df.assign(letters=df['Description'].str.split(', ')).explode('letters')\
  .query('letters == "a"')\
  .groupby('letters', as_index=False)['amount'].sum()
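
For reference, here is a minimal, self-contained sketch of why the attempt in the question produces NaN rows and how the split/explode approach avoids that. The frame is rebuilt from the table in the question, and the variable names (exact, masked, letters, df_a, total_a) are only illustrative. DataFrame.where keeps the original shape and replaces non-matching rows with NaN, whereas boolean indexing drops them.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# An equality test only matches cells that are exactly "a" (the last row here).
exact = df["Description"] == "a"

# where() keeps all five rows and fills the non-matching ones with NaN.
masked = df.where(exact)

# Boolean indexing drops the non-matching rows instead.
only_exact = df[exact]

# Split each description into letters, one row per letter, then filter.
letters = df.assign(
    Description=df["Description"].str.split(", ")
).explode("Description")
df_a = letters[letters["Description"] == "a"]
total_a = df_a["amount"].sum()

print(masked)
print(df_a)
print(total_a)
```

Note that where() is mainly useful when the shape of the frame has to be preserved. For plain row filtering, a boolean mask or query is usually the simpler choice.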

CodePudding user response:

Here you go:

In [3]: df["Description"] = df["Description"].str.split(", ")

In [4]: df.explode("Description").groupby("Description", as_index=False).sum()[["Description", "amount"]]
  Description  amount
0           a     300
1           b     270
2           c     230
3           d      90

This gives you the sums for every description, not just the 'a' group.
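
A possible variation on the above, sketched under the assumption that the original Description column should stay untouched: it uses assign instead of overwriting the column and selects amount before summing, so S.No is never summed along the way. The names exploded, totals and a_total are illustrative only.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# Split into lists on a copy (df itself is not modified), one row per letter.
exploded = df.assign(
    Description=df["Description"].str.split(", ")
).explode("Description")

# Select 'amount' before aggregating so only that column is summed.
totals = exploded.groupby("Description", as_index=False)["amount"].sum()
print(totals)

# Pull a single group's total out of the result.
a_total = totals.loc[totals["Description"] == "a", "amount"].iloc[0]
print(a_total)
```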

CodePudding user response:

The answers above calculate the total amount for a description. The same approach can be used to count the total number of 'a' descriptions.
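
A minimal sketch of the counting variant, assuming "count" means the number of rows whose description list contains 'a'. The frame is rebuilt from the question, and count_a and counts are illustrative names.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# One row per letter, as in the answers above.
exploded = df.assign(
    Description=df["Description"].str.split(", ")
).explode("Description")

# Count the rows where the letter is 'a' (True counts as 1 when summed).
count_a = (exploded["Description"] == "a").sum()

# Or count every letter at once.
counts = exploded["Description"].value_counts()

print(count_a)
print(counts)
```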
