Group by calculation pandas

Time:09-16

I have a dataframe after applying groupby:

category | item
------------------
A        | a_item1
         | a_item2
         | a_item3
------------------
B        | b_item1
         | b_item2
------------------

To this, I want to add a new column computed as 10 / (number of items per category). For the example data, this would be:

category | item   |  value
----------------------------
A        | a_item1|   3.33
         | a_item2|   3.33
         | a_item3|   3.33
----------------------------
B        | b_item1|   5
         | b_item2|   5
-----------------------------

How can this be done?

CodePudding user response:

Use Series.value_counts with Series.map:

df['value'] = 10 / df['category'].map(df['category'].value_counts())  

Or:

df['value'] = df['category'].map(df['category'].value_counts()).rdiv(10)
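To try this end to end, here is a minimal sketch. The construction of `df` is an assumption (the question does not show how the frame was built), and `counts` is just an illustrative name for the intermediate result.

```python
import pandas as pd

# Assumed reconstruction of the example data as a flat DataFrame
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B"],
    "item": ["a_item1", "a_item2", "a_item3", "b_item1", "b_item2"],
})

# Number of rows per category, indexed by category label
counts = df["category"].value_counts()

# Look up each row's category count, then divide 10 by it
df["value"] = 10 / df["category"].map(counts)
print(df)
```

Because the counts are computed once and mapped onto the column, this stays vectorized and avoids any per-row Python function call.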

CodePudding user response:

You can use groupby together with transform:

df['value'] = 10 / df.groupby('category')['item'].transform('count')
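One thing to watch: transform('count') only counts non-missing items, while transform('size') counts every row in the group. The sketch below shows the difference on assumed sample data; the missing item and the column names `value_count` and `value_size` are made up here for illustration and are not part of the question.

```python
import pandas as pd

# Assumed sample data; the None in "item" is added only to show the difference
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B"],
    "item": ["a_item1", "a_item2", None, "b_item1", "b_item2"],
})

grouped = df.groupby("category")["item"]

# 'count' ignores missing items, so category A is counted as 2
df["value_count"] = 10 / grouped.transform("count")

# 'size' counts every row, so category A is counted as 3
df["value_size"] = 10 / grouped.transform("size")
print(df)
```

If the item column never contains missing values, both give the same result.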

CodePudding user response:

You can use the pandas apply function for dataframes.

Define the function that you want to apply on each row:

import pandas as pd

def get_value(s: pd.Series):
    # Note: value_counts() is recomputed on every call, i.e. once per row
    vc = df['category'].value_counts()
    return 10 / vc[s['category']]

Use apply on each row:

df['value'] = df.apply(get_value, axis=1)
df

#   category    item     value
# 0        A    a_item1  3.333333
# 1        A    a_item2  3.333333
# 2        A    a_item3  3.333333
# 3        B    b_item1  5.000000
# 4        B    b_item2  5.000000

You can also pre-compute the value counts once and pass them as an argument to your apply function, which avoids recomputing them for every row.
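A possible version of that pre-computed variant, as a sketch: the sample `df` and the parameter name `counts` are assumptions for illustration, and the extra argument is forwarded through the `args` parameter of DataFrame.apply.

```python
import pandas as pd

# Assumed reconstruction of the example data
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B"],
    "item": ["a_item1", "a_item2", "a_item3", "b_item1", "b_item2"],
})

# Compute the counts once, outside the row function
vc = df["category"].value_counts()

def get_value(row: pd.Series, counts: pd.Series) -> float:
    # Look up the pre-computed count for this row's category
    return 10 / counts[row["category"]]

# Extra positional arguments are passed to the function via args
df["value"] = df.apply(get_value, axis=1, args=(vc,))
print(df)
```

This still calls a Python function once per row, so on large frames the map or transform approaches are likely to be faster.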
