Here's the dataframe that I'm working with:
| Name | Vehicle Types Owned |
| -------- | -------------- |
| Bob | [Car, Bike] |
| Billy | [Car, Truck, Train] |
| Rob | [Plane, Train, Boat] |
| Sally | [Bike, Boat] |
I am looking for something like this
| Name | Car | Bike | Truck | Train | Plane | Boat |
| ---- | ----| --- | --- | --- | --- | --- |
| Bob | 1 | 1 | 0 | 0 | 0 | 0 |
| Billy| 1 | 0 | 1 | 1 | 0 | 0 |
| Rob | 0 | 0 | 0 | 1 | 1 | 1 |
| Sally| 0 | 1 | 0 | 0 | 0 | 1 |
The original dataframe looked like this, in case that'd be more useful to work with:
| Name | Vehicle Type Owned | Num Wheels | Top Speed |
| -------- | -------------- | --------- | --------- |
| Bob | Car | 4 | 200 mph |
| Bob | Bike | 2 | 20 mph |
| Billy | Car | 4 | 220 mph |
| Billy | Truck | 8 | 100 mph |
| Billy | Train | 80 | 86 mph |
| Rob | Plane | 3 | 600 mph |
| Rob | Train | 80 | 98 mph |
| Rob | Boat | 3 | 128 mph |
| Sally | Bike | 2 | 34 mph |
| Sally | Boat | 3 | 78 mph |
I'm using pandas.
CodePudding user response:
Try explode
then crosstab
df = df.explode('Vehicle Types Owned')
df_ = pd.crosstab([df['Name']], df['Vehicle Types Owned']).reset_index().rename_axis(None, axis=1)
print(df_)
Name Bike Boat Car Plane Train Truck
0 Billy 0 0 1 0 1 1
1 Bob 1 0 1 0 0 0
2 Rob 0 1 0 1 1 0
3 Sally 1 1 0 0 0 0
CodePudding user response:
From the original dataframe, you can use pivot_table
:
df.assign(count=1).pivot_table(index='Name', columns='Vehicle Type Owned', values='count', fill_value=0)
Result:
Vehicle Type Owned Bike Boat Car Plane Train Truck
Name
Billy 0 0 1 0 1 1
Bob 1 0 1 0 0 0
Rob 0 1 0 1 1 0
Sally 1 1 0 0 0 0