Pandas/ Python list values of one column based on string value of another column-CodePudding

I have a .csv like this, with these columns:

Receipt ID	Name	Quantity	Category Type
135135	Croissant	1.0	Food
135135	Cappucino	1.0	Drink
143143	Salad	1.0	Food
154134	Americano	1.0	Drink
178781	Cappucino	1.0	Drink
169071	Muffin	1.0	Food
169071	Latte	1.0	Drink
169071	Brownie	1.0	Food

I want to get the 'Receipt ID's where the 'Category Type' is 'Food'

I have tried a few methods but none of them works

df1 = df.query('Category Type == Food')['Receipt ID'].unique()

Does not work

I also tried setting Category Type as index

df1 = df.set_index('Category Type').eq('Food')

print (df1.index[df1['Receipt ID']].tolist())

Which gave me an empty list

The Receipt IDs are not necessarily unique, although I want the outputs to be unique, and the final goal is to find the Receipt ID that contains both food and drink. Could any expert please give me some help please? Thank you!

CodePudding user response：

df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().values.tolist()

if you want unique:

df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().unique().astype(int).tolist()

df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()

for all types:

df.groupby('Category Type').agg({'Receipt ID': 'unique'}).to_dict()

CodePudding user response：


import pandas as pd
from io import StringIO

data_str = """
Receipt ID  Name    Quantity    Category Type
135135  Croissant   1.0 Food
135135  Cappucino   1.0 Drink
143143  Salad   1.0 Food
154134  Americano   1.0 Drink
178781  Cappucino   1.0 Drink
169071  Muffin  1.0 Food
169071  Latte   1.0 Drink
169071  Brownie 1.0 Food
"""
# This is myself organizing the data, you can skip it here
io_str = StringIO(data_str)
df = pd.read_csv(io_str, header=0, sep='\t')

# start here
# method 1 
filter_df = df[df['Category Type'] == 'Food']
unique_list = filter_df['Receipt ID'].unique().tolist()
print(unique_list)

# method 2 use loc function
unique_list=df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()
print(unique_list)
"""
[135135, 143143, 169071]
"""