I have a data frame as below:
import pandas as pd

data = {'Name':['Mathew', 'Mathew', 'Mathew', 'Mathew','Mathew','John','John','John'],
        'Age':[12,12, 12,13, 13,12,13,13],
        'Colour':['Yellow','Blue','Yellow','green','blue','pink','black','brown']}
df = pd.DataFrame(data)
df
I tried the loop below. For each combination of Name and Age visited by the loop, I need the unique Colour values. For example, when Name is Mathew and Age is 12, my_list should contain Yellow and Blue.
for j in set(df.Name):
    for i in set(df.Age):
        # my_list = list of unique colours for this combination of name and age
        print(i, j)
The output of the above code is as follows:
12 John
13 John
12 Mathew
13 Mathew
But what I want as output is this: when df.Name is Mathew and df.Age is 12, my_list should be as follows: my_list=['Yellow','Blue']
CodePudding user response:
Here is your solution in a for loop:
for name in set(df['Name']):
    for age in set(df['Age']):
        print(name, age)
        my_list = df.loc[(df['Name']==name) & (df['Age']==age),'Colour'].unique().tolist()
        print(my_list,'\n')
Output:
Mathew 12
['Yellow', 'Blue']
Mathew 13
['green', 'blue']
John 12
['pink']
John 13
['black', 'brown']
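Note that the nested loop visits every pairing of a name with an age, including pairings that never occur in the data (those would print an empty list). Here is a minimal sketch of a variant that iterates only over the combinations actually present, assuming the same df as in the question. The name results is hypothetical and only used for illustration:

```python
import pandas as pd

data = {'Name': ['Mathew', 'Mathew', 'Mathew', 'Mathew', 'Mathew', 'John', 'John', 'John'],
        'Age': [12, 12, 12, 13, 13, 12, 13, 13],
        'Colour': ['Yellow', 'Blue', 'Yellow', 'green', 'blue', 'pink', 'black', 'brown']}
df = pd.DataFrame(data)

# results is a hypothetical container: one (name, age, colours) entry per group
results = []
for (name, age), group in df.groupby(['Name', 'Age']):
    # unique() keeps the colours in order of first appearance
    my_list = group['Colour'].unique().tolist()
    results.append((name, age, my_list))
    print(name, age, my_list)
```

Since groupby only yields groups that exist, there is no need to filter df with a boolean mask on every iteration.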
Another approach:
out = df.groupby(['Name','Age'])['Colour'].unique()
print(out)
Output:
Name    Age
John    12             [pink]
        13     [black, brown]
Mathew  12     [Yellow, Blue]
        13      [green, blue]
Name: Colour, dtype: object
To make this a bit clearer: out is a pd.Series whose index is a pair of name and age, and whose value for each index entry is an array of all unique colours for that pair.
If you really want separate lists as output, here you go:
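import numpy as np

# np.unique returns the unique values sorted, so the order can differ from the data
John12, John13, Mathew12, Mathew13 = df.groupby(['Name','Age'])['Colour'].apply(lambda x: list(np.unique(x)))
print(John12)
print(John13)
print(Mathew12)
print(Mathew13)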
Now you have four separate lists, one per combination, which you can use anywhere else. The prints give:
['pink']
['black', 'brown']
['Blue', 'Yellow']
['blue', 'green']
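Unpacking into four names only works when you know the number and order of the groups in advance. As a sketch of an alternative, assuming the same df as in the question, the grouped result can be turned into a dictionary keyed by (Name, Age). The name colour_lists is hypothetical:

```python
import pandas as pd

data = {'Name': ['Mathew', 'Mathew', 'Mathew', 'Mathew', 'Mathew', 'John', 'John', 'John'],
        'Age': [12, 12, 12, 13, 13, 12, 13, 13],
        'Colour': ['Yellow', 'Blue', 'Yellow', 'green', 'blue', 'pink', 'black', 'brown']}
df = pd.DataFrame(data)

# Series indexed by (Name, Age); each value is an array of unique colours
out = df.groupby(['Name', 'Age'])['Colour'].unique()

# colour_lists is a hypothetical name: map each (name, age) pair to a plain list
colour_lists = {key: list(values) for key, values in out.items()}

print(colour_lists[('Mathew', 12)])
```

This way a lookup such as colour_lists[('John', 13)] keeps working even if more names or ages are added to the data later.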