I have a pandas DataFrame and want to get, for each class, the list of all Feature values that belong to it.
import pandas as pd

df = pd.DataFrame([(3, 1),
                   (4, 3),
                   (6, 2),
                   (7, 2),
                   (2, 3),
                   (4, 2),
                   (4, 1),
                   (1, 3),
                   (6, 3),
                   (8, 1)],
                  columns=['Feature', 'Class'])
In the above example there are three classes: 1, 2, and 3. I would like to get one list of Feature values per class. The output could look like the following:
Class 1: [3, 4, 8]
Class 2: [6, 7, 4]
Class 3: [4, 2, 1, 6]
CodePudding user response:
You can do this with a single groupby:
classes = df.groupby('Class')['Feature'].apply(list)
Output:
>>> classes
Class
1       [3, 4, 8]
2       [6, 7, 4]
3    [4, 2, 1, 6]
Name: Feature, dtype: object
If you want only the unique values within each class, use unique() instead:
unique = df.groupby('Class')['Feature'].unique()
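As a rough sketch of how the two results compare on the question's data: apply(list) keeps duplicates, while unique() drops them and returns NumPy arrays rather than lists. The conversion back to lists and the to_dict() call below are additions for illustration, not part of the answer above; the name as_dict is made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    [(3, 1), (4, 3), (6, 2), (7, 2), (2, 3),
     (4, 2), (4, 1), (1, 3), (6, 3), (8, 1)],
    columns=['Feature', 'Class'])

# All values per class, duplicates kept, in order of appearance.
classes = df.groupby('Class')['Feature'].apply(list)

# Unique values per class; unique() yields arrays, so convert each to a list.
unique = df.groupby('Class')['Feature'].unique().apply(list)

# Plain dict keyed by class label, convenient for lookups such as as_dict[1].
as_dict = classes.to_dict()
print(as_dict)
```

On this particular data no class contains a repeated value, so both results hold the same numbers; they would differ only if a value occurred twice within one class.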
CodePudding user response:
As pointed out in this great answer, you can use df.groupby() together with apply(list) on the grouped column to achieve this:
import pandas as pd
df = pd.DataFrame([(3, 1),
(4, 3),
(6, 2),
(7, 2),
(2, 3),
(4, 2),
(4, 1),
(1, 3),
(6, 3),
(8, 1)],
columns=['Feature', 'Class'])
print(df.groupby('Class')['Feature'].apply(list))
Output:
Class
1       [3, 4, 8]
2       [6, 7, 4]
3    [4, 2, 1, 6]
Name: Feature, dtype: object
If you would rather handle the classes one at a time, a more intuitive way is to filter the rows with a boolean mask:
print(df.loc[df['Class'] == 1])
Output:
   Feature  Class
0        3      1
6        4      1
9        8      1
Or select only the "Feature" column in the same .loc call:
print(df.loc[df['Class'] == 1, 'Feature'])
Output:
0    3
6    4
9    8
Name: Feature, dtype: int64
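If the goal is to go through every class in turn rather than just class 1, one possible sketch is to iterate over the groupby object directly, which yields each class label together with its Feature values. The variable names label, values, and lines are chosen for this example only.

```python
import pandas as pd

df = pd.DataFrame(
    [(3, 1), (4, 3), (6, 2), (7, 2), (2, 3),
     (4, 2), (4, 1), (1, 3), (6, 3), (8, 1)],
    columns=['Feature', 'Class'])

lines = []
# Iterating a grouped column yields (class label, Series of values) pairs,
# ordered by class label.
for label, values in df.groupby('Class')['Feature']:
    lines.append(f"Class {label}: {values.tolist()}")

print("\n".join(lines))
# Class 1: [3, 4, 8]
# Class 2: [6, 7, 4]
# Class 3: [4, 2, 1, 6]
```

This avoids writing one boolean mask per class and prints the result in the format the question asks for.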