Let's say I have a dataframe with a column called animals. The entries look as follows:
'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'E', 'F', 'G', 'H', 'I'.
I want to change the entries 'E', 'F', 'G', 'H' and 'I' to another unified entry called 'D'. What is the best way to transform all these categorical entries into one category?
CodePudding user response:
You can create a list of the entries you want to change, then assign 'D' to them, using loc to select the matching rows and isin to evaluate whether each value is in that list:
li = ['E','F','G','H','I']
df.loc[df.animals.isin(li), 'animals'] = 'D'
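To try this end to end, here is a minimal sketch that rebuilds the column from the values listed in the question and applies the same loc/isin assignment. The way df is constructed is an assumption, since the question does not show how the dataframe was created.

```python
import pandas as pd

# Hypothetical frame rebuilt from the values given in the question
df = pd.DataFrame({'animals': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C',
                               'E', 'F', 'G', 'H', 'I']})

li = ['E', 'F', 'G', 'H', 'I']

# isin builds a boolean mask; loc assigns 'D' only where the mask is True
df.loc[df['animals'].isin(li), 'animals'] = 'D'

# Count how many rows fall into each category after the change
counts = df['animals'].value_counts().to_dict()
print(counts)
```

The five rows that held 'E' through 'I' should now all be counted under 'D', while 'A', 'B' and 'C' are untouched.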
An alternative to loc would be numpy's where:
import numpy as np
df['animals'] = np.where(df['animals'].isin(li), 'D', df['animals'])
Which reads: for every row in the animals column, check whether the value is in the list called li; if it is, return 'D', otherwise keep the existing value.
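The where variant can be checked the same way. This is a small sketch under the same assumption as before: the dataframe is rebuilt by hand from the values in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame rebuilt from the values given in the question
df = pd.DataFrame({'animals': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C',
                               'E', 'F', 'G', 'H', 'I']})

li = ['E', 'F', 'G', 'H', 'I']

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise:
# rows whose value is in li become 'D', all other rows keep their current value
df['animals'] = np.where(df['animals'].isin(li), 'D', df['animals'])

result = df['animals'].tolist()
print(result)
```

Note that where returns a new NumPy array that is assigned back to the column, whereas the loc version modifies only the selected rows in place. For a single replacement like this one, either approach should give the same result.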