I have a list of dataframe like this of 90 heat devices
data_list = [df0,df2, ... ,df89]
All of these dataframes in data_list
have the same features (= column names):
("timestamp", "outside_temperature", "heating_generation", "power_consumption", "hot_water_storage", etc..)
All dataframes have the index : timestamp
for the same period of time
I want to create now new dataframes which are having one feature but for all 90 heat devices
e.g for outside_temperature:
timestamp device_0, device_2, device_3 ,..., device_89
01.05.2022 00:10 15.03 14.39 15.69 ... 15.30
01.05.2022 00:15 14.94 14.20 15.30 ... 15.29
01.05.2022 00:20 14.94 14.05 15.29 ... 15.20
.
.
.
etc.
and that for all my features.
Any idea whats the best way to do so ? I was thinking about merging but couldn't find good advice, or do it by a for loops.
CodePudding user response:
If I followed your question correctly you could concat the selected columns by first using a comprehension then setting the columns names in two steps. This does assume that the timestamps are the same in all data frames stored in data_list.
For example the following would concatenate all outside_temperature columns, leaving the columns name the same:
feature_df = pd.concat([x['outside_temperture'] for x in data_list], axis=1)
and then you could rename the columns with something like the following:
feature_df.columns = [f'device_{i}' for i in range(len(data_list))]
CodePudding user response:
You can concat your dataframes and then transpose them using pd.pivot
function like in this small example
import pandas as pd
df1 = pd.DataFrame({
'timestamp': [pd.Timestamp(2022, 1,1), pd.Timestamp(2022, 1,2)],
'value':[1,2],
})
df2 = pd.DataFrame({
'timestamp': [pd.Timestamp(2022, 1,1), pd.Timestamp(2022, 1,4)],
'value':[3,4]
})
dfs = [df1, df2]
# add df tag
for (idx, df) in enumerate(dfs):
df['device'] = f'device_{idx}'
final_df = pd.concat([df1, df2])
pd.pivot(index = 'timestamp' , columns = 'device', values= 'value', data= final_df )