I have a datafram with interval days and count number of days:
Day | Count |
---|---|
88 days | 51 |
54 days | 48 |
94 days | 47 |
292 days | 43 |
291 days | 43 |
428 days | 1 |
406 days | 1 |
419 days | 1 |
397 days | 1 |
So I want create a new table in range days like example:
450 days - 350 days | 350 days - 250 days | 250 - 150 days | 150 days - 50 days | 50 days-0 days |
---|---|---|---|---|
4 | 86 | 0 | 146 | 0 |
i tried this code but error:
df.index = pd.to_timedelta(df.index.astype(str))
df = df.resample('100 days').sum()
ValueError: Invalid frequency: 100 days
CodePudding user response:
Try:
df["Day"] = df["Day"].str.split().str[0].astype(int)
tmp = pd.cut(df["Day"], [0, 50, 150, 250, 350, 450])
x = df.groupby(tmp)["Count"].sum().to_frame().T.reset_index(drop=True)
x.columns = [f"{c.right} days - {c.left} days" for c in x.columns]
print(x[x.columns[::-1]])
Prints:
450 days - 350 days 350 days - 250 days 250 days - 150 days 150 days - 50 days 50 days - 0 days
0 4 86 0 146 0
CodePudding user response:
The frequency that you specify is invalid. You should use '30D'
instead of '30 days'
.
For instance:
import pandas as pd
mydata = {'Day':['88 days','54 days','94 days','292 days','291 days','428 days','406 days','419 days','397 days'],'Count': [51, 48, 47, 43, 43, 1, 1, 1, 1] }
df=pd.DataFrame(data=mydata)
df.index = pd.to_timedelta(df['Day'].astype(str))
df.resample('30D', axis=0).sum()