I have a dataframe with two columns, and for each distinct value in the first column I want a list of all the corresponding values in the second column.
For example, my dataframe looks like this:
| Type | Item |
|---|---|
| Cars | Toyota |
| Cars | Honda |
| Cars | Tesla |
| Fruits | Apple |
| Fruits | Orange |
| Countries | USA |
| Countries | Mexico |
I want to either split this dataframe into three separate dataframes for Cars, Fruits, and Countries, or have a list for each of Cars, Fruits, and Countries that would look like this:
Cars = ['Toyota', 'Honda', 'Tesla']
Fruits = ['Apple', 'Orange']
Countries = ['USA', 'Mexico']
This is just an example; my dataframe is huge, so I want a function that does this without my having to type in each Type manually. I looked into the pandas groupby function but couldn't figure out how to use it for what I need.
Any help is appreciated.
CodePudding user response:
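You can try this:
# Build {'Cars': {'Item': [...]}, 'Fruits': {'Item': [...]}, ...}
dict_ = df.groupby('Type').agg(list).T.to_dict()
for key in dict_:
    li_ = dict_.get(key).get("Item")
    # Create a module-level variable named after each Type
    globals()[key] = li_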
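You can also use locals(), depending on your scope. At module level locals() is the same dictionary as globals(); inside a function, assigning into locals() does not reliably create local variables, so it is safest at module level:
dict_ = df.groupby('Type').agg(list).T.to_dict()
for key in dict_:
    li_ = dict_.get(key).get("Item")
    locals()[key] = li_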
You can then look up each list by name:
locals()["Cars"]
Out[1]: ['Toyota', 'Honda', 'Tesla']
globals()["Cars"]
Out[2]: ['Toyota', 'Honda', 'Tesla']
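If you don't strictly need separate variable names, a plain dictionary keyed by Type is usually easier to work with than writing into globals(), especially with many types. Here is a minimal sketch, assuming the column names Type and Item from the question (the name items_by_type is just illustrative):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Type": ["Cars", "Cars", "Cars", "Fruits", "Fruits", "Countries", "Countries"],
    "Item": ["Toyota", "Honda", "Tesla", "Apple", "Orange", "USA", "Mexico"],
})

# Collect the Item values of each Type into a list, then turn the
# resulting Series (indexed by Type) into a regular dict.
items_by_type = df.groupby("Type")["Item"].agg(list).to_dict()

print(items_by_type["Cars"])  # ['Toyota', 'Honda', 'Tesla']
```

You would then use items_by_type["Fruits"] wherever you would have used a variable called Fruits.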
CodePudding user response:
Here is my attempt :) Note that it gives one comma-separated string per Type rather than a Python list.
import pandas as pd
df = pd.DataFrame({'Type': ['Cars', 'Cars', 'Cars', 'Fruits', 'Fruits', 'Countries', 'Countries'],
                   'Item': ['Toyota', 'Honda', 'Tesla', 'Apple', 'Orange', 'USA', 'Mexico']})
# Join each group's items into a single comma-separated string
grouped = df.groupby('Type')['Item'].apply(lambda tags: ', '.join(tags)).to_frame()
print(grouped)
Output:
                           Item
Type
Cars       Toyota, Honda, Tesla
Countries           USA, Mexico
Fruits            Apple, Orange
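The question also asked about splitting the dataframe into one dataframe per Type. One possible sketch, again assuming the sample data from the question (frames is a hypothetical name): iterating over a groupby yields pairs of group key and sub-dataframe, so a dict comprehension is enough.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Type": ["Cars", "Cars", "Cars", "Fruits", "Fruits", "Countries", "Countries"],
    "Item": ["Toyota", "Honda", "Tesla", "Apple", "Orange", "USA", "Mexico"],
})

# One sub-dataframe per distinct Type; reset_index gives each piece
# its own 0-based index instead of the original row numbers.
frames = {
    name: group.reset_index(drop=True)
    for name, group in df.groupby("Type")
}

print(frames["Fruits"])
```

Each value in frames is an ordinary DataFrame, so frames["Cars"]["Item"].tolist() gives the same list as the other approaches.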