I need to replace all values in the order column that are not equal to 'no', 'n/a' or 'N/A' with 1.0. I have tried converting it to a categorical variable and setting the pre-existing categories as its distinct categories, but I still get the same TypeError:
import pandas as pd
df = pd.DataFrame({'otherdr': ['no', 'N/A', 'N/A', 'Intergov', 'Conciliation', 'yes']})
cat = list(df['otherdr'].unique())
df['otherdr'] = pd.Categorical(df['otherdr'], categories = cat, ordered = False)
df[df['otherdr'] != ('no' or 'n/a' or 'N/A')] = 1.0
TypeError: Cannot setitem on a Categorical with a new category (1.0), set the categories first
CodePudding user response:
Don't use a categorical. Once its categories are defined, you cannot assign a value that is not among them (unless you explicitly add the new category first).
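If the column does have to stay categorical throughout, here is a minimal sketch of the "add the category first" route. It assumes the same sample data as the question; the `keep` and `result` names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'otherdr': ['no', 'N/A', 'N/A', 'Intergov', 'Conciliation', 'yes']})
df['otherdr'] = pd.Categorical(df['otherdr'])

# Register 1.0 as a valid category before assigning it.
df['otherdr'] = df['otherdr'].cat.add_categories([1.0])

# Build the mask of rows to keep, then overwrite the rest in this column only.
keep = df['otherdr'].isin(['no', 'n/a', 'N/A'])
df.loc[~keep, 'otherdr'] = 1.0

# Optionally drop the categories that no longer occur in the column.
df['otherdr'] = df['otherdr'].cat.remove_unused_categories()
result = df['otherdr'].tolist()
```

This works, but it takes more steps than replacing the values on a plain object column.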
Use isin and where:
df['otherdr'] = df['otherdr'].where(df['otherdr'].isin(['no', 'n/a', 'N/A']), 1)
If you really want/need a categorical, convert after replacing the values:
df['otherdr'] = pd.Categorical(df['otherdr'].where(df['otherdr'].isin(['no', 'n/a', 'N/A']), 1))
Output:
otherdr
0 no
1 N/A
2 N/A
3 1
4 1
5 1
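A side note on the comparison in the question: in Python, `'no' or 'n/a' or 'N/A'` evaluates to just `'no'`, so that mask would also select the 'N/A' rows. The sketch below uses the question's sample data to compare that mask with the `isin` one; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'otherdr': ['no', 'N/A', 'N/A', 'Intergov', 'Conciliation', 'yes']})

# `or` returns its first truthy operand, so this is simply the string 'no'.
target = ('no' or 'n/a' or 'N/A')

# Comparing against 'no' alone also flags the 'N/A' rows for replacement.
naive_mask = df['otherdr'] != target

# isin tests membership in the whole list, which is what was intended.
isin_mask = ~df['otherdr'].isin(['no', 'n/a', 'N/A'])

# where keeps values where the condition is True and replaces the others.
replaced = df['otherdr'].where(~isin_mask, 1)
```

Note also that `df[mask] = 1.0` assigns to every column of the selected rows; `df.loc[mask, 'otherdr'] = 1.0` limits the assignment to one column.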