Add a column that is the average of another column in pandas

Time:10-21

I have a df that looks something like this:

{'time': {0: 0.00772,
  1: 0.0210220000000669,
  2: 0.0342939999999823,
  3: 0.0525870000000674,
  4: 0.0649490000000066},
 'time_rounded': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0},
 'x': {0: -0.66, 1: -0.65, 2: -0.69, 3: -0.68, 4: -0.65}}

The column time_rounded is the column time rounded up, and it comes from the formula:

df["time_rounded"] = df["time"].apply(np.ceil)

What I'm trying to add is a column called x_avg that holds, for each row, the average of all x values whose time_rounded matches that row's time_rounded.

In simpler terms, if time_rounded is 1, x_avg should be the average of all x values where time_rounded is 1.

I have tried this several ways, and I have figured out that

df["x"][df["time_rounded"] == 1].mean()

returns the value I was expecting, but I want this for every value of time_rounded, not just 1. My time_rounded column goes from 1 to 5683; the example I gave is just the head of the df.

My closest try was:

df["x_avg"] = df["x"][df["time_rounded"]].mean()

but this fills the whole column with the same number (I don't know why).

Thank you in advance for the help!

CodePudding user response:

Your sample data has only one group, which makes it hard to demonstrate the solution, so allow me to change the sample input:

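import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "time": [1.1, 1.1, 2.1, 2.2, 3.1],
        "x": [1, 2, 3, 4, 5],
    }
)
# Round each time up to the next whole number; this is the grouping key.
df["time_rounded"] = np.ceil(df["time"])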

Code:

df["x_avg"] = df.groupby("time_rounded")["x"].transform("mean")

# df
   time  x  time_rounded  x_avg
0   1.1  1           2.0    1.5
1   1.1  2           2.0    1.5
2   2.1  3           3.0    3.5
3   2.2  4           3.0    3.5
4   3.1  5           4.0    5.0
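
As a rough cross-check, the sketch below rebuilds the sample frame from this answer and compares the transform result with the filter-and-mean expression from the question, one group at a time. It also hints at why the attempt in the question produced a constant column: an expression ending in .mean() collapses a Series to a single scalar, and assigning a scalar to a column repeats it on every row. The column name x_overall is hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"time": [1.1, 1.1, 2.1, 2.2, 3.1], "x": [1, 2, 3, 4, 5]})
df["time_rounded"] = np.ceil(df["time"])

# transform("mean") returns one value per row, aligned with df's index,
# so it can be assigned straight back as a new column.
df["x_avg"] = df.groupby("time_rounded")["x"].transform("mean")

# Cross-check each group against the filter-and-mean expression from the question.
for value in df["time_rounded"].unique():
    mask = df["time_rounded"] == value
    expected = df.loc[mask, "x"].mean()
    assert np.isclose(df.loc[mask, "x_avg"], expected).all()

# By contrast, Series.mean() yields a single scalar; assigning a scalar to a
# column repeats it on every row (x_overall is a hypothetical name).
df["x_overall"] = df["x"].mean()
```

If the per-group means are needed as a small lookup table rather than as a column, df.groupby("time_rounded")["x"].mean() gives one row per group instead of one row per original row.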