I have a df that looks something like this:
{'time': {0: 0.00772,
1: 0.0210220000000669,
2: 0.0342939999999823,
3: 0.0525870000000674,
4: 0.0649490000000066},
'time_rounded': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0},
'x': {0: -0.66, 1: -0.65, 2: -0.69, 3: -0.68, 4: -0.65}}
The time_rounded column is the time column rounded up to the next integer, computed with:
df["time_rounded"] = df["time"].apply(np.ceil)
What I'm trying to do is add a column called x_avg that holds, for each row, the average of all x values that share that row's time_rounded value.
In simpler terms, if time_rounded is 1, x_avg should be the average of all x values where time_rounded is 1.
I have tried this several ways, and I have figured out that
df["x"][df["time_rounded"] == 1].mean()
returns the value I was expecting, but I want to do this for every possible value, not just 1. My time_rounded column goes from 1 to 5683; the example I gave is just the head of the df.
My closest try was:
df["x_avg"] = df["x"][df["time_rounded"]].mean()
but this returns the same number throughout the whole column, and I don't know why.
Thank you in advance for the help!
CodePudding user response:
Your data makes it a bit hard to demonstrate the solution, so allow me to change the sample input:
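import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "time": [1.1, 1.1, 2.1, 2.2, 3.1],
        "x": [1, 2, 3, 4, 5],
    }
)
# Round time up to the next integer; rows sharing a value form one group
df["time_rounded"] = np.ceil(df["time"])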
Code:
df["x_avg"] = df.groupby("time_rounded")["x"].transform("mean")
# df
   time  x  time_rounded  x_avg
0   1.1  1           2.0    1.5
1   1.1  2           2.0    1.5
2   2.1  3           3.0    3.5
3   2.2  4           3.0    3.5
4   3.1  5           4.0    5.0
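If you want to convince yourself that the result matches the boolean-mask approach from the question, here is a minimal sketch that checks each group by hand. It reuses the sample input above; the names `expected`, `overall` and `x_scalar` are only there for illustration. The last lines also hint at why the original attempt filled the column with a single number: calling `.mean()` on a Series collapses it to one scalar, and assigning a scalar to a column repeats it on every row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"time": [1.1, 1.1, 2.1, 2.2, 3.1], "x": [1, 2, 3, 4, 5]})
df["time_rounded"] = np.ceil(df["time"])

# transform("mean") returns one value per original row, aligned to df's index
df["x_avg"] = df.groupby("time_rounded")["x"].transform("mean")

# Cross-check against the mask-and-mean approach, one group at a time
for value in df["time_rounded"].unique():
    mask = df["time_rounded"] == value
    expected = df.loc[mask, "x"].mean()
    assert np.isclose(df.loc[mask, "x_avg"], expected).all()

# .mean() on a Series yields a single scalar (hypothetical names below);
# assigning that scalar to a column broadcasts it to every row
overall = df["x"].mean()
df["x_scalar"] = overall
```

The difference is that `transform` keeps the original row count, so it can be assigned straight back as a column, whereas a plain `.mean()` reduces everything it was given to one value.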