I have a dataset / CSV with hourly prices in columns for each day in the rows. It looks something like this:
Date | 01:00:00 | 02:00:00 | ... | 00:00:00 |
---|---|---|---|---|
01.01.2019 | 348,87 | 340,83 | ... | 343,38 |
02.01.2019 | ... | ... | ... | ... |
I would need the dataset to be like this:
Date | Price |
---|---|
01.01.2019 01:00:00 | 348,87 |
01.01.2019 02:00:00 | 343,38 |
... | ... |
02.01.2019 00:00:00 | ... |
And all the way to 01.01.2022.
I'm using pandas dataframes in Python. Could anyone help me with this?
RE:
df1 = pd.read_csv('Hourly_prices1.csv', delimiter = ';', index_col = ['Date'])
So basically, I want the index row to contain the price each hour on each day, going chronologically from 01.01.2019 01.00.00 all until 01.01.2022 00.00.00.
I need this to create a time series analysis, and amongst other things plot daily changes in the price of each hour of each day.
CodePudding user response:
Assuming your dataset is df, Please try this:
hours_ls = df.columns[1:]
df['Date'] = df['Date'].astype(str)
df_new = pd.DataFrame()
for date in df['Date'].values:
price_ls = []
date_hr = []
for hour in hours_ls:
date_hr.append(date ' ' str(hour))
price_ls.append(df[df['Date']==date][hour].iloc[0])
df_new = df_new.append(pd.DataFrame(data={'Date':date_hr,'Price':price_ls}))
df_new will be the formatted dataframe required