Home > database >  Transpose and Reorder
Transpose and Reorder

Time:01-04

This is a sample of my df, consisting of temperature and rain (mm) per city:

Datetime Berlin_temperature Dublin_temperature London_temperature Paris_temperature Berlin_rain Dublin_rain London_rain Paris_rain
2022-01-01 10:00:00 24 24 24 24 10 10 10 10
2022-01-01 11:00:00 24 24 24 24 10 10 10 10
2022-01-01 12:00:00 24 24 24 24 10 10 10 10
2022-01-01 13:00:00 24 24 24 24 10 10 10 10

I want to achieve following output as a dataframe:

Datetime City Temperature Rainfall
2022-01-01 10:00:00 Berlin 24 10
2022-01-01 10:00:00 Dublin 24 10
2022-01-01 10:00:00 London 24 10
2022-01-01 10:00:00 Paris 24 10
2022-01-01 11:00:00 Berlin 24 10
2022-01-01 11:00:00 Dublin 24 10
2022-01-01 11:00:00 London 24 10
2022-01-01 11:00:00 Paris 24 10
2022-01-01 12:00:00 ... ... ...

At the moment I don't know how to achieve this by transposing or something similar. How would this be possible?

CodePudding user response:

Use DataFrame.stack with MultiIndex created by splitted columns with _ - but first convert Datetime to index by DataFrame.set_index:

df1 = df.set_index('Datetime')
df1.columns = df1.columns.str.split('_', expand=True)
df1 = df1.stack(0).rename_axis(['Datetime','City']).reset_index()
print (df1)
               Datetime    City  rain  temperature
0   2022-01-01 10:00:00  Berlin    10           24
1   2022-01-01 10:00:00  Dublin    10           24
2   2022-01-01 10:00:00  London    10           24
3   2022-01-01 10:00:00   Paris    10           24
4   2022-01-01 11:00:00  Berlin    10           24
5   2022-01-01 11:00:00  Dublin    10           24
6   2022-01-01 11:00:00  London    10           24
7   2022-01-01 11:00:00   Paris    10           24
8   2022-01-01 12:00:00  Berlin    10           24
9   2022-01-01 12:00:00  Dublin    10           24
10  2022-01-01 12:00:00  London    10           24
11  2022-01-01 12:00:00   Paris    10           24
12  2022-01-01 13:00:00  Berlin    10           24
13  2022-01-01 13:00:00  Dublin    10           24
14  2022-01-01 13:00:00  London    10           24
15  2022-01-01 13:00:00   Paris    10           24

CodePudding user response:

Using janitor.pivot_longer:

import janitor

out = df.pivot_longer(index='Datetime', names_to=('City', '.value'),
                      names_sep='_', sort_by_appearance=True)

Output:

               Datetime    City  temperature  rain
0   2022-01-01 10:00:00  Berlin           24    10
1   2022-01-01 10:00:00  Dublin           24    10
2   2022-01-01 10:00:00  London           24    10
3   2022-01-01 10:00:00   Paris           24    10
4   2022-01-01 11:00:00  Berlin           24    10
5   2022-01-01 11:00:00  Dublin           24    10
6   2022-01-01 11:00:00  London           24    10
7   2022-01-01 11:00:00   Paris           24    10
8   2022-01-01 12:00:00  Berlin           24    10
9   2022-01-01 12:00:00  Dublin           24    10
10  2022-01-01 12:00:00  London           24    10
11  2022-01-01 12:00:00   Paris           24    10
12  2022-01-01 13:00:00  Berlin           24    10
13  2022-01-01 13:00:00  Dublin           24    10
14  2022-01-01 13:00:00  London           24    10
15  2022-01-01 13:00:00   Paris           24    10
  • Related