I'd like to divide each group in a polars dataframe by its 50% quantile.
My attempt, which does not work:
df.select(pl.col('Value')) / df.groupby('Group').quantile(.5, 'linear')
With the following dataframe:
df = pl.DataFrame(
    [
        ["A", "A", "A", "A", "B", "B", "B", "B"],
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    ],
    columns=["Group", "Value"],
)
I'd expect the following result:
Group | Value |
---|---|
A | 0.4 |
A | 0.8 |
A | 1.2 |
A | 1.6 |
B | 0.769 |
B | 0.923 |
B | 1.077 |
B | 1.231 |
A series would also be fine as a result, as long as I can concatenate it back onto the original dataframe.
CodePudding user response:
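You can use a window function with over("Group") instead of groupby. The quantile is computed per group and broadcast back to every row of that group, so the result keeps the original shape and row order: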
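# Per-group 50% quantile, aligned row-by-row with the original frame
quantile = pl.col("Value").quantile(.5, 'linear').over("Group")
# with_columns replaces the older with_column, which newer polars releases no longer provide
df.with_columns(
    pl.col('Value') / quantile
)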
┌───────┬──────────┐
│ Group ┆ Value │
│ --- ┆ --- │
│ str ┆ f64 │
╞═══════╪══════════╡
│ A ┆ 0.4 │
│ A ┆ 0.8 │
│ A ┆ 1.2 │
│ A ┆ 1.6 │
│ B ┆ 0.769231 │
│ B ┆ 0.923077 │
│ B ┆ 1.076923 │
│ B ┆ 1.230769 │
└───────┴──────────┘
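As a sanity check on the numbers above, here is a minimal plain-Python sketch of what the window expression computes. It does not use the polars API; it assumes that the 0.5 quantile with linear interpolation is the ordinary median, which `statistics.median` returns. The variable names are only for illustration.

```python
from collections import defaultdict
from statistics import median

groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Collect the values belonging to each group.
by_group = defaultdict(list)
for g, v in zip(groups, values):
    by_group[g].append(v)

# For an even-sized group, statistics.median averages the two middle values,
# which is the 0.5 quantile under linear interpolation.
medians = {g: median(vs) for g, vs in by_group.items()}

# Divide each value by the median of its own group, keeping the original row order.
scaled = [v / medians[g] for g, v in zip(groups, values)]

for g, s in zip(groups, scaled):
    print(g, round(s, 6))
```

The group medians are 2.5 for A and 6.5 for B, so the first row becomes 1.0 / 2.5 = 0.4 and the fifth becomes 5.0 / 6.5 ≈ 0.769231, in line with the polars output.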