Hi, I have the following data frame:
S.No Description amount
1 a, b, c 100
2 a, c 50
3 b, c 80
4 b, d 90
5 a 150
I want to extract only the values of 'a', for example.
Expected answer:
Description amount
a 100
a 50
a 150
and sum them up as
Description amount
a 300
But I am getting this answer:
Description amount
1 a 100
2 a 50
3 nan nan
4 nan nan
5 a 150
Please guide me on how to properly use the where clause on pandas DataFrames.
Code:
mask = new_df["Description"] == "a"
new_df.where(mask, inplace=True)
print(new_df)
CodePudding user response:
Use df.assign, Series.str.split, df.explode, df.query and GroupBy.sum:
In [703]: df_a = df.assign(Description=df.Description.str.split(', ')).explode('Description').query('Description == "a"')
In [704]: df_a
Out[704]:
S.No Description amount
0 1 a 100
1 2 a 50
4 5 a 150
In [706]: df_a.groupby('Description')['amount'].sum().reset_index()
Out[706]:
Description amount
0 a 300
Or as a one-liner:
df.assign(letters=df['Description'].str.split(r',\s*'))\
.explode('letters')\
.query('letters == "a"')\
.groupby('letters', as_index=False)['amount'].sum()
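As for why the original attempt shows NaN rows: DataFrame.where keeps the shape of the frame and fills the rows where the condition is False with NaN, while boolean indexing (or query) drops those rows. Below is a minimal sketch of the difference, assuming the sample data from the question, rebuilt here by hand. The names `exploded`, `mask`, `kept_shape` and `only_a` are only for illustration.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (assumed to match the real data).
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# One row per letter; splitting on ", " avoids leading spaces in the pieces.
exploded = df.assign(Description=df["Description"].str.split(", ")).explode(
    "Description", ignore_index=True
)

mask = exploded["Description"] == "a"

# where() keeps every row and fills the non-matching ones with NaN ...
kept_shape = exploded.where(mask)

# ... while boolean indexing keeps only the matching rows.
only_a = exploded[mask]

total = only_a.groupby("Description", as_index=False)["amount"].sum()
print(total)
```

If where is preferred anyway, following it with dropna() should give the same rows as the boolean indexing above.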
CodePudding user response:
Here you go:
In [3]: df["Description"] = df["Description"].str.split(", ")
In [4]: df.explode("Description").groupby("Description", as_index=False).sum()[["Description", "amount"]]
Out[4]:
Description amount
0 a 300
1 b 270
2 c 230
3 d 90
This gives you the sums for every description, not just the 'a' group.
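If only one letter is needed afterwards, the grouped result can be looked up by key. A small sketch, assuming the sample data from the question (rebuilt by hand); it leaves `df` unmodified and selects `amount` before summing so that `S.No` is not summed along the way. The name `sums` is just illustrative.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (assumed to match the real data).
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# Split into lists, explode to one row per letter, then sum amount per letter.
sums = (
    df.assign(Description=df["Description"].str.split(", "))
    .explode("Description")
    .groupby("Description")["amount"]
    .sum()
)

print(sums)       # Series indexed by letter
print(sums["a"])  # total for a single letter
```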
CodePudding user response:
The answers above calculate the total amount for a description. You can also count the total number of 'a' descriptions.
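Counting instead of summing could be sketched like this, again assuming the sample data from the question (rebuilt by hand). The names `letters`, `count_a` and `counts` are only for illustration.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (assumed to match the real data).
df = pd.DataFrame({
    "S.No": [1, 2, 3, 4, 5],
    "Description": ["a, b, c", "a, c", "b, c", "b, d", "a"],
    "amount": [100, 50, 80, 90, 150],
})

# One entry per letter across all rows.
letters = df["Description"].str.split(", ").explode()

# Number of occurrences of "a" across all descriptions.
count_a = int((letters == "a").sum())

# Occurrence counts for every letter at once.
counts = letters.value_counts()

print(count_a)
print(counts)
```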