Home > OS >  Dataframe Resample date value keeping 'prices'
Dataframe Resample date value keeping 'prices'

Time:10-13

here a sample my dataframe :

    date début  € / mois enerc  € / mois edf
0   2021-04-01  40.86   8.46
1   2021-04-10  40.86   8.46
2   2021-04-16  33.69   8.46
3   2021-06-10  33.69   8.46
4   2021-08-01  37.71   9.35
5   2021-08-10  37.74   9.35

I want to resample date for daily rule, but keeping the price from when they change, until they change again, for example :

2021-04-10  40.86   8.46
2021-04-11  40.86   8.46
2021-04-12  40.86   8.46
2021-04-13  40.86   8.46
2021-04-14  40.86   8.46
2021-04-15  40.86   8.46
2021-04-16  33.69   8.46
2021-04-17  33.69   8.46
2021-04-18  33.69   8.46

etc. I don't want to interpolate, just to copy and change at the good period.

CodePudding user response:

you can do the following:

  • find minimum and maximum dates
  • create an array of dates from min to max increasing day by day and convert it to a dataframe
  • merge with the original data frame how='letf'
  • fillna values using forward fill
from datetime import datetime , timedelta
import pandas as pd
cols = ["date début  €", "mois enerc  €", "mois edf"]
data = [["2021-04-01",  40.86,   8.46],
        ["2021-04-10",  40.86,   8.46],
        ["2021-04-16",  33.69,   8.46],
        ["2021-06-10",  33.69,   8.46],
        ["2021-08-01",  37.71,   9.35],
        ["2021-08-10",  37.74,   9.35]]

df = pd.DataFrame(data, columns=cols)
df["date début  €"] = pd.to_datetime(df["date début  €"])

# find  min and max values of date
start_date = df["date début  €"].min()
end_date = df["date début  €"].max()
number_of_days = (end_date - start_date).days

# create array of dates
date_list=[(start_date   timedelta(days=days)).strftime('%Y-%m-%d') for days in range(number_of_days)]
# convert array to dataframe
df2 = pd.DataFrame(date_list, columns=["date début  €"])
df2["date début  €"] = pd.to_datetime(df2["date début  €"])

# merge and forwardfill nans
df2.merge(df, on="date début  €", how='left').fillna(method='ffill')

output

10  2021-04-11  40.86   8.46
11  2021-04-12  40.86   8.46
12  2021-04-13  40.86   8.46
13  2021-04-14  40.86   8.46
14  2021-04-15  40.86   8.46
15  2021-04-16  33.69   8.46
16  2021-04-17  33.69   8.46
17  2021-04-18  33.69   8.46
18  2021-04-19  33.69   8.46
  • Related