I am trying to use column values of Type in the CSV table below as Rows of a new table.
Old Table
Fruits | Type |
---|---|
Apple | High |
Orange | Medium |
Grape | Low |
Mango | High |
Apple | Medium |
Grape | Info |
I tried few options with pivot_table, groupby like below but couldn't get like the new table I would like to get.
df_tab= df.pivot_table(values=df[Type].value_counts(), index=df.set_index('Fruits'), columns='Type', aggfunc='sum')
New Table:
Fruit | High | Medium | Low | Info |
---|---|---|---|---|
Apple | 1 | 1 | 0 | 0 |
Orange | 0 | 1 | 0 | 0 |
Grape | 0 | 0 | 1 | 1 |
Mango | 1 | 0 | 0 | 0 |
CodePudding user response:
You can create 3 empty lists and loop through the type column and fill these lists, then add them to the data frame.
Medium = []
High = []
Low = []
info = []
for row in df["Type"]:
if row == "Medium":
Medium.append(1)
High.append(0)
Info.append(0)
Low.append(0)
elif row == "High":
Medium.append(0)
High.append(1)
Info.append(0)
Low.append(0)
elif row == "info":
Medium.append(0)
High.append(0)
Info.append(1)
Low.append(0)
elif row == "Low":
Medium.append(0)
High.append(0)
Info.append(0)
Low.append(1)
df["Medium"] = Medium
df["High"] = High
df["Info"] = Info
df["Low"] = Low
CodePudding user response:
I used groupby
df_grp df.groupby(['Fruits', 'Type']).aggregate('size').unstack().fillna(0).reset_index()
output is
Type Fruits High Info Low Medium
0 Apple 1.0 0.0 0.0 1.0
1 Grape 0.0 1.0 1.0 0.0
2 Mango 1.0 0.0 0.0 0.0
3 Orange 0.0 0.0 0.0 1.0
you can delete columns name (Type) like this if you want
df_grp.columns = list(df_grp.columns)
output is
Fruits High Info Low Medium
0 Apple 1.0 0.0 0.0 1.0
1 Grape 0.0 1.0 1.0 0.0
2 Mango 1.0 0.0 0.0 0.0
3 Orange 0.0 0.0 0.0 1.0