I have a dataframe of 11527 rows, loaded from a CSV file, that lists makes and models of boats:
Ultramar Ultra 570 Open
Ultramar Ultra 600
Ultramar Ultra 660 Open
Ultramar Ultramar 440 Open
Ultramar Week End 550
Ultramar Week End 600
Ultramar Week End 650
Ultramar Week End 700
Ultramar Winner 650
Ultramar Winner 800
Ultramar Orque 70
Gobbi Atlantis 47
Gobbi Atlantis 55
Gobbi Gobbi 19
Gobbi Gobbi 21 Sport
Gobbi Gobbi 225 Cabin
Gobbi Gobbi 225 Sport
Gobbi Gobbi 23 Cabin
Gobbi Gobbi 23 Offshore
Gobbi Gobbi 23 Sport
Gobbi Gobbi 245 Cabin
I want to get a sublist from this dataframe: when I ask for a particular make, it should return all the models for that make.
So far I only have the start:
import pandas as pd
df = pd.read_csv("marque_modele_ref.csv", sep=";")
But I don't know how to build separate lists from the makes and models and loop over them to get every model for each distinct make, a bit like a GROUP BY in SQL.
Any ideas?
Regards,
Answer:
Given the following DataFrame:
>>> df
        make    model1  model2    model3
0   Ultramar     Ultra     600      None
1   Ultramar     Ultra     660      Open
2   Ultramar  Ultramar     440      Open
3   Ultramar      Week     End       550
4   Ultramar      Week     End       600
5   Ultramar      Week     End       650
6   Ultramar      Week     End       700
7   Ultramar    Winner     650      None
8   Ultramar    Winner     800      None
9   Ultramar     Orque      70      None
10     Gobbi  Atlantis      47      None
11     Gobbi  Atlantis      55      None
12     Gobbi     Gobbi      19      None
13     Gobbi     Gobbi      21     Sport
14     Gobbi     Gobbi     225     Cabin
15     Gobbi     Gobbi     225     Sport
16     Gobbi     Gobbi      23     Cabin
17     Gobbi     Gobbi      23  Offshore
18     Gobbi     Gobbi      23     Sport
19     Gobbi     Gobbi     245     Cabin
We can select a specific 'make' and collect the corresponding models into a list using .loc and tolist():
df.loc[df['make'] == 'Ultramar','model1'].tolist()
gives us
['Ultra', 'Ultra', 'Ultramar', 'Week', 'Week', 'Week', 'Week', 'Winner', 'Winner', 'Orque']
You probably need all three model columns, not just the first, in which case you can do
df['model3'] = df['model3'].fillna('')  # replace missing values with an empty string
model = df['model1'] + ' ' + df['model2'] + ' ' + df['model3']  # concatenate the values with a space
df['model'] = model  # assign as a new column (df.model = ... would not create one)
df.loc[df['make'] == 'Ultramar', 'model'].tolist()
which gives
['Ultra 600 ', 'Ultra 660 Open', 'Ultramar 440 Open', 'Week End 550', 'Week End 600', 'Week End 650', 'Week End 700', 'Winner 650 ', 'Winner 800 ', 'Orque 70 ']
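If you want the models for every make at once, closer to the GROUP BY you mention, a groupby can build a dictionary of lists. This is only a minimal sketch: the small hand-built DataFrame and the column names 'make' and 'model' are assumptions, not taken from your CSV.

```python
import pandas as pd

# Hand-built sample; the column names 'make' and 'model' are assumed.
df = pd.DataFrame({
    "make": ["Ultramar", "Ultramar", "Gobbi", "Gobbi"],
    "model": ["Ultra 600", "Winner 650", "Atlantis 47", "Gobbi 19"],
})

# Group the rows by make and collect each group's models into a list,
# then turn the result into a plain dict: {make: [models...]}.
models_by_make = df.groupby("make")["model"].apply(list).to_dict()

# Look up one make, or loop over all of them.
print(models_by_make["Ultramar"])
for make, models in models_by_make.items():
    print(make, models)
```

The dictionary is built once, so looking up a make afterwards does not filter the DataFrame again.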