I have a dataframe with a column of word lists, and I want to keep only the unique words in each list: a word mentioned multiple times should appear once, and words mentioned only once should all be kept.
My dataframe looks like this:
cars
[honda, toyota]
[honda, none, honda, toyota, toyota]
[lexus, mazda]
[honda, mazda, lexus, mazda, honda]
I want my output to be:
cars
honda, toyota
honda, none, toyota
lexus, mazda
honda, mazda, lexus
Thank you in advance!
CodePudding user response:
Convert the lists to sets; a set inherently holds only unique values. Note that a set does not preserve the original order of the words.
Optionally, you can convert them back to lists again afterwards.
df['cars'] = df['cars'].apply(set)  # optionally chain .apply(list)
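A minimal sketch of the set approach, assuming the column holds Python lists of strings. The sample frame is reconstructed from the question, and 'none' is assumed to be a plain string. Because sets are unordered, the sketch sorts each set when converting back to a list so the result is deterministic; the sort and the `cars_unique` column name are additions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to be lists of strings).
df = pd.DataFrame({
    "cars": [
        ["honda", "toyota"],
        ["honda", "none", "honda", "toyota", "toyota"],
        ["lexus", "mazda"],
        ["honda", "mazda", "lexus", "mazda", "honda"],
    ]
})

# set() drops duplicates but does not keep the original order,
# so sorted() is used here to get a stable, alphabetical list back.
df["cars_unique"] = df["cars"].apply(lambda words: sorted(set(words)))

print(df["cars_unique"].tolist())
```

Note that the last row comes back in alphabetical order rather than in order of first appearance, which is why this approach only fits when order does not matter.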
CodePudding user response:
If order is important, use dict.fromkeys, which acts like an ordered set (Python ≥ 3.6; insertion order is guaranteed by the language from 3.7):
df['cars'] = df['cars'].apply(lambda x: list(dict.fromkeys(x)))
Variant with a list comprehension that is potentially more efficient:
df['cars'] = [list(dict.fromkeys(x)) for x in df['cars']]
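For reference, here is a self-contained sketch of the order-preserving variant run on the question's data. It assumes the column holds lists of strings (with 'none' as a plain string). The final join step and the `cars_str` column name are hypothetical additions, included only in case the desired output is a comma-separated string rather than a list.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to be lists of strings).
df = pd.DataFrame({
    "cars": [
        ["honda", "toyota"],
        ["honda", "none", "honda", "toyota", "toyota"],
        ["lexus", "mazda"],
        ["honda", "mazda", "lexus", "mazda", "honda"],
    ]
})

# dict.fromkeys keeps the first occurrence of each word, in insertion order.
df["cars"] = [list(dict.fromkeys(words)) for words in df["cars"]]

# Optional: join each list into one comma-separated string.
# This only works if every element is a string.
df["cars_str"] = df["cars"].apply(", ".join)

print(df)
```

Each row keeps its words in order of first appearance, so the last row becomes honda, mazda, lexus as in the requested output.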