I have a raw dataframe shown like below, it's not a normalized table form due to the Min, Max and Unit row.
Name FIELD_A Field_B FIELD_C FIELD_D
Min NONE 100 -1.0 0
Max NONE 500 0.75 500
Unit NONE ms % cm
1 2020/01/21 150 0.8 200
2 2020/01/22 160 0.7 210
3 2020/01/23 170 0.6 220
4 2020/01/24 180 0.5 230
5 2020/01/25 190 0.4 240
The dataframe could be create with following code:
df_raw = pd.DataFrame({
'Name':['FIELD_A', 'FIELD_B', 'FIELD_C', 'FIELD_D'],
'Min':[None, 100, -1.0, 0],
'Max':[None, 500, 0.75, 500],
'Unit':[None, 'ms', '%', 'cm'],
'1':['2020/01/21',150,0.8,200],
'2':['2020/01/22',160,0.7,210],
'3':['2020/01/23',170,0.6,220],
'4':['2020/01/24',180,0.5,230],
'5':['2020/01/25',190,0.4,240]
}).transpose()
Then I'm going to flatten this grid-like dataframe into a normalized table form shown like below:
No. FIELD VALUE MIN MAX UNIT
1 A 2020/01/21 None NONE NONE
2 A 2020/01/22 None NONE NONE
3 A 2020/01/23 None NONE NONE
4 A 2020/01/24 None NONE NONE
5 A 2020/01/25 None NONE NONE
1 B 150 100 500 ms
2 B 160 100 500 ms
3 B 170 100 500 ms
4 B 180 100 500 ms
5 B 190 100 500 ms
1 C 0.8 -1.0 0.75 %
2 C 0.7 -1.0 0.75 %
3 C 0.6 -1.0 0.75 %
4 C 0.5 -1.0 0.75 %
5 C 0.4 -1.0 0.75 %
1 D 200 0 500 cm
2 D 210 0 500 cm
3 D 220 0 500 cm
4 D 230 0 500 cm
5 D 240 0 500 cm
Currently, I done this with for idx, value in enumerate(raw_df[FIELD])
inside for FIELD in raw_df.columns[1:]
. But it seems the operation is not effective exactly...
CodePudding user response:
You can transpose, set_index
on the first lines to create a MultiIndex, then melt
to reshape. The rest is just reworking a bit the output,
(df_raw.T
.rename(columns={'Name': 'Field'})
.set_index(['Field','Min','Max','Unit'])
.T
.melt()
.assign(Field=lambda d: d['Field'].str.split('_').str[-1])
)
Output:
Field Min Max Unit value
0 A NaN NaN NaN 2020/01/21
1 A NaN NaN NaN 2020/01/22
2 A NaN NaN NaN 2020/01/23
3 A NaN NaN NaN 2020/01/24
4 A NaN NaN NaN 2020/01/25
5 B 100.0 500.00 ms 150
6 B 100.0 500.00 ms 160
7 B 100.0 500.00 ms 170
8 B 100.0 500.00 ms 180
9 B 100.0 500.00 ms 190
10 C -1.0 0.75 % 0.8
11 C -1.0 0.75 % 0.7
12 C -1.0 0.75 % 0.6
13 C -1.0 0.75 % 0.5
14 C -1.0 0.75 % 0.4
15 D 0.0 500.00 cm 200
16 D 0.0 500.00 cm 210
17 D 0.0 500.00 cm 220
18 D 0.0 500.00 cm 230
19 D 0.0 500.00 cm 240