I would like to merge 2 or 3 row into 1 row like below.
Original chart:
CaseNr | AccType | Vehicle | Model | VehicleMass |
---|---|---|---|---|
A | 1 | PassengerCar | A1 | 1217 |
A | 1 | Train | Train | 99999 |
B | 2 | PassengerCar | B7 | 1400 |
B | 2 | Train | Train | 99999 |
C | 3 | PassengerCar | C2 | 1295 |
Modified chart that I want:
CaseNr | AccType | Vehicle_1 | Vehicle_2 | Model_1 | Model_2 | VehicleMass_1 | Vehicle Mass_2 | MainFac |
---|---|---|---|---|---|---|---|---|
A | 1 | PassengerCar | Train | A1 | Train | 1217 | 99999 | 5 |
B | 2 | PassengerCar | Train | B7 | Train | 1400 | 99999 | 6 |
C | 3 | PassengerCar | NaN | C2 | NaN | 1295 | NaN | 2 |
As you could notice the chart, the data field include vehicle crash information. Therefore, 1, 2, or 3 vehicles could be involved in one accident case.
I searched all merge
options but I could not find how to do this.
CodePudding user response:
Use pivot
to reshape your data. This is called reshaping to wide and not merging.
df1 = df.assign(name = df.groupby('CaseNr').cumcount() 1).pivot(['CaseNr', 'AccType'], 'name')
df1.columns = df1.columns.map(lambda x:f'{x[0]}_{x[1]}')
df1.reset_index()
CaseNr AccType Vehicle_1 ... Model_2 VehicleMass_1 VehicleMass_2
0 A 1 PassengerCar ... Train 1217.0 99999.0
1 B 2 PassengerCar ... Train 1400.0 99999.0
2 C 3 PassengerCar ... NaN 1295.0 NaN
If you have janitor
installed you can do:
import janitor
df.assign(name = df.groupby('CaseNr').cumcount() 1).pivot_wider(['CaseNr', 'AccType'], 'name')
CaseNr AccType Vehicle_1 ... Model_2 VehicleMass_1 VehicleMass_2
0 A 1 PassengerCar ... Train 1217.0 99999.0
1 B 2 PassengerCar ... Train 1400.0 99999.0
2 C 3 PassengerCar ... NaN 1295.0 NaN
CodePudding user response:
using cumcount
and .unstack
s = df.groupby(["CaseNr", "AccType"]).cumcount()
df.set_index([s 1, "CaseNr", "AccType"]).unstack(0)
# if you must flatten your multi index - better to have it imo.
df1.columns = [f"{x}_{y}" for x, y in df1.columns.to_flat_index()]
print(df1)
Vehicle_1 Vehicle_2 Model_1 Model_2 VehicleMass_1 VehicleMass_2
CaseNr AccType
A 1 PassengerCar Train A1 Train 1217.0 99999.0
B 2 PassengerCar Train B7 Train 1400.0 99999.0
C 3 PassengerCar NaN C2 NaN 1295.0 NaN