Calculate column average row by row using pandas

Time:11-08

I have the following pandas DF:

    val
1   10
2   20
3   30
4   40
5   30

I want to get two output columns: avg and avg_sep

avg should be the average calculated row by row.

avg_sep should be the average calculated row by row, restarting when a certain condition is met (i.e. up to and including row 3 I calculate one average, and after row 3 I start calculating another average). My expected output is:

    val  avg  avg_sep
1   10   10       10
2   20   15       15
3   30   20       20
4   40   25       40
5   30   26       35

I know I can use df.mean(axis=0) to get the average of the column. But how can I get the expected output?

CodePudding user response:

From the discussion in the comments:

import pandas as pd
import numpy as np

# Building frame:
df = pd.DataFrame(
    data={"val": [10, 20, 30, 40, 30]},
    index=[1, 2, 3, 4, 5]
)

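# Solution:
# Running average: cumulative sum divided by the number of rows seen so far.
# (`/ df.index` also works here, but only because the index happens to be 1..5.)
df["avg"] = df["val"].cumsum() / np.arange(1, len(df) + 1)
# Same idea, restarted after row 3. Note that .loc slicing is label-based and inclusive.
df.loc[:3, "avg_sep"] = df.loc[:3, "val"].cumsum() / np.arange(1, 4)
df.loc[4:, "avg_sep"] = df.loc[4:, "val"].cumsum() / np.arange(1, 3)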
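
If the split point or the frame length changes, the hard-coded ranges above have to be updated by hand. A possible alternative, sketched here and not taken from the original discussion, is to use pandas' expanding window for the running mean and a group label for the restart. The `segment` name and the "index label greater than 3" rule are assumptions made for illustration; substitute whatever condition defines the restart in your data.

```python
import pandas as pd

df = pd.DataFrame({"val": [10, 20, 30, 40, 30]}, index=[1, 2, 3, 4, 5])

# Running mean over all rows seen so far (no hard-coded length needed).
df["avg"] = df["val"].expanding().mean()

# Hypothetical split rule: index labels 1-3 form segment 0, labels 4-5 form segment 1.
segment = pd.Series((df.index > 3).astype(int), index=df.index)

# Running mean that restarts at the beginning of each segment.
df["avg_sep"] = df.groupby(segment)["val"].transform(lambda s: s.expanding().mean())

print(df)
```

With more than one restart point, the same pattern should still apply as long as `segment` assigns a distinct label to each stretch of rows, for example by taking the cumulative sum of a boolean "restart here" column.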