Trying to filter out unique values in a pandas data frame

Time:04-24

Hi, is there a way to filter out unique values in a pandas data frame? I am using the code below to get the unique values. However, the same combination written in a different order is counted as a separate value. For example, ['Creative, Modern Cuisine', 'Modern Cuisine, Creative']. Is there a way to filter these out?

[Part of the data]

cuisine = df.Cuisine.unique()
cuisine_count = df.Cuisine.nunique()
print(cuisine, cuisine_count)

CodePudding user response:

If I understand your intent, you are trying to get a list of all distinct cuisines which appear in your DataFrame. Try this:

df['Cuisine'].str.split(',').explode().str.strip().unique().tolist()

Explanation:

  • df['Cuisine'].str.split(','): split Cuisine strings at commas, producing a Series with a Python list in each row, where each list item holds an individual cuisine string
  • .explode(): for each list of cuisine strings, transform each string to a row
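  • .str.strip(): strip leading and trailing whitespace, so that ' Creative' (left over from splitting 'Modern Cuisine, Creative') matches 'Creative'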
  • .unique().tolist(): get list of unique cuisines
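
To see the steps above on concrete input, here is a minimal sketch. The sample rows are made up for illustration, since the real data from the question is not shown, and it assumes the Cuisine column has no missing values. The second part is an alternative, not part of the answer above: if the goal is instead to count distinct combinations regardless of order, sorting the parts of each combination before comparing is one way to do it.

```python
import pandas as pd

# Hypothetical sample data; the asker's actual DataFrame is not shown.
df = pd.DataFrame({
    "Cuisine": [
        "Creative, Modern Cuisine",
        "Modern Cuisine, Creative",
        "Japanese",
        "Creative",
    ]
})

# Split on commas, put each cuisine on its own row, trim spaces, deduplicate.
cuisines = df["Cuisine"].str.split(",").explode().str.strip().unique().tolist()
print(cuisines, len(cuisines))

# Alternative (assumption about intent): keep combinations, but make them
# order-insensitive by sorting the trimmed parts and joining them back.
combos = (
    df["Cuisine"]
    .str.split(",")
    .apply(lambda parts: ", ".join(sorted(p.strip() for p in parts)))
)
print(combos.unique().tolist(), combos.nunique())
```

With this sample, the first approach yields the individual cuisines, while the second collapses 'Creative, Modern Cuisine' and 'Modern Cuisine, Creative' into a single combination. If the column can contain NaN, drop or fill those rows first, because the lambda expects a list.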