So I have a dataframe on Python that has lots of variables happening at once and getting pulled off at the same time and I want to make them into a new dataframe thats grouped by time so for example its like:
Value | ID | Unit | Key | Date |
---|---|---|---|---|
10 | 10 | m/s^2 | Acceleration | 01/01/2000 12:00:01 |
On | 11 | String | Machine | 01/01/2000 12:00:01 |
2.0 | 12 | bar | Pressure | 01/01/2000 12:00:01 |
12 | 13 | m/s^2 | Acceleration | 01/01/2000 12:00:02 |
Off | 14 | String | Machine | 01/01/2000 12:00:02 |
1.5 | 15 | bar | Pressure | 01/01/2000 12:00:02 |
but I want to have it as:
Date | Acceleration | Pressure | Lid |
---|---|---|---|
01/01/2000 12:00:01 | 10 m/s^2 | 2.0 bar | On |
01/01/2000 12:00:02 | 12 m/s^2 | 1.5 bar | Off |
I can link the values togther but how do I go about grouping like that over multiple lines?
CodePudding user response:
IIUC, you need to rework your columns, then pivot
:
# merge value/unit except for "Machine"
df['value'] = np.where(df['Key'].eq('Machine'),
df['Value'],
df['Value'].astype(str) ' ' df['Unit'])
# replace "Machine" with "Lid"
df['col'] = df['Key'].replace({'Machine': 'Lid'})
# pivot using the new columns
df2 = (df
.pivot(index='Date', columns='col', values='value')
.rename_axis(columns=None).reset_index()
)
output:
Date Acceleration Lid Pressure
0 01/01/2000 12:00:01 10 m/s^2 On 2.0 bar
1 01/01/2000 12:00:02 12 m/s^2 Off 1.5 bar