I have the following DataFrame:
import numpy as np
import pandas as pd

np.random.seed(3)
s = pd.DataFrame(np.random.choice(['Feijão', 'feijão'], size=[3, 2]), dtype='category')
print(s[0].cat.categories)
print(s[1].cat.categories)
As you can see, the DataFrame holds two strings that differ only in the case of the first letter. What I am trying to do is replace the category 'feijão' with 'Feijão'.
When I run the following line of code, I get this error:
s.loc[s[0].isin(['feijão']),1] = s.loc[s[0].isin(['feijão']),1].replace({'feijão':'Feijão'})
TypeError: Cannot set a Categorical with another, without identical categories
What does this error mean? I am also curious whether filtering the invalid rows first and replacing only those values is the best way to do this. Should I just use replace without the filter?
CodePudding user response:
Use DataFrame.update:
s.update(s.loc[s[0].isin(['feijão']), 1].replace({'feijão': 'Feijão'}))
print(s)
        0       1
0  Feijão  Feijão
1  feijão  Feijão
2  Feijão  Feijão
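As for what the error means: my understanding is that pandas refuses to assign one categorical into another when their category sets differ, and calling replace on a categorical column can change its categories, so the two sides no longer match. If that is what is happening here, one way to avoid the check is to do the replacement on plain strings and convert back to category afterwards. The snippet below is only a minimal sketch of that idea; the sample frame is hypothetical data standing in for the random one in the question, and the names mask and col are illustrative.

```python
import pandas as pd

# Hypothetical sample data standing in for the random frame in the question.
s = pd.DataFrame({0: ['Feijão', 'feijão', 'Feijão'],
                  1: ['feijão', 'feijão', 'Feijão']}).astype('category')

# Rows where column 0 holds the lowercase variant (the question's filter).
mask = s[0].isin(['feijão'])

# Work on plain Python strings so no category comparison is involved.
col = s[1].astype(object)
col[mask & (col == 'feijão')] = 'Feijão'

# Rebuild the categorical column from the corrected values.
s[1] = col.astype('category')
print(s)
```

If the filter on column 0 is not actually needed, dropping mask from the condition applies the replacement to the whole column.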