On specific dates, a metric starting at 0 increases by a value. Given a set of non-continuous dates and values, is it possible to produce a column with metric?
Input - metric changes per day
date value
02-03-2022 00:00:00 10
03-03-2022 00:00:00 0
06-03-2022 00:00:00 2
10-03-2022 00:00:00 18
Output - metric calculated for continuous range of days (starting value = 0 unless change applies already on first day)
0 metric
0 2022-02-28 0
1 2022-03-01 0
2 2022-03-02 10
3 2022-03-03 10
4 2022-03-04 10
5 2022-03-05 10
6 2022-03-06 12
7 2022-03-07 12
8 2022-03-08 12
9 2022-03-09 12
10 2022-03-10 30
11 2022-03-11 30
12 2022-03-12 30
13 2022-03-13 30
Code example
import pandas as pd
df = pd.DataFrame({'date': ['02-03-2022 00:00:00',
'03-03-2022 00:00:00',
'06-03-2022 00:00:00',
'10-03-2022 00:00:00'],
'value': [10, 0, 2, 18]},
index=[0,1,2,3])
df2 = pd.DataFrame(pd.date_range(start='28-02-2022', end='13-03-2022'))
df2['metric'] = 0 # TODO
CodePudding user response:
Replace values in df2 from df by date, fill missing values with 0 and then cumsum
:
df['date'] = pd.to_datetime(df.date, format='%d-%m-%Y %H:%M:%S')
df2['metric'] = df2[0].map(df.set_index('date')['value']).fillna(0).cumsum()
df2
0 metric
0 2022-02-28 0.0
1 2022-03-01 0.0
2 2022-03-02 10.0
3 2022-03-03 10.0
4 2022-03-04 10.0
5 2022-03-05 10.0
6 2022-03-06 12.0
7 2022-03-07 12.0
8 2022-03-08 12.0
9 2022-03-09 12.0
10 2022-03-10 30.0
11 2022-03-11 30.0
12 2022-03-12 30.0
13 2022-03-13 30.0
CodePudding user response:
df.reindex
is useful for this. Then add df.fillna
and apply df.cumsum
.
import pandas as pd
df = pd.DataFrame({'date': ['02-03-2022 00:00:00',
'03-03-2022 00:00:00',
'06-03-2022 00:00:00',
'10-03-2022 00:00:00'],
'value': [10, 0, 2, 18]},
index=[0,1,2,3])
df['date'] = pd.to_datetime(df.date, format='%d-%m-%Y %H:%M:%S')
res = df.set_index('date').reindex(pd.date_range(
start='2022-02-28', end='2022-03-13')).fillna(0).cumsum()\
.reset_index(drop=False).rename(columns={'index':'date',
'value':'metric'})
print(res)
date metric
0 2022-02-28 0.0
1 2022-03-01 0.0
2 2022-03-02 10.0
3 2022-03-03 10.0
4 2022-03-04 10.0
5 2022-03-05 10.0
6 2022-03-06 12.0
7 2022-03-07 12.0
8 2022-03-08 12.0
9 2022-03-09 12.0
10 2022-03-10 30.0
11 2022-03-11 30.0
12 2022-03-12 30.0
13 2022-03-13 30.0