Unable to rename/replace categories in a dataframe after removing unicode u

Time:03-31

I am trying to rename the categories in a dataframe column after removing the unicode 'u' prefix from its values. I removed the prefix with .replace('u', '', regex=True), but that pattern also removes every other 'u' in the text (for example, 'loud' becomes 'lod'). I then tried both replace and rename_categories with a mapping dictionary to change the categories into the desired format, but the values remain unchanged after the unicode 'u' is removed. Does anyone know a better way to approach this? I have attached a link to the CSV I am working with.

'''
import io
import pandas as pd
from google.colab import files  # assumed: files.upload() is the Google Colab helper

uploaded = files.upload()
yelpdf = pd.read_csv(io.BytesIO(uploaded['yelp_reviews.csv']))
print(yelpdf['NoiseLevel'].value_counts())
yelpdf['NoiseLevel'] = yelpdf['NoiseLevel'].astype(str)
# Keys are misspelled on purpose to match the values left after every 'u' is removed
update_NoiseLevel = {'average': 'Average', 'lod': 'Loud', 'qiet': 'Quiet', 'very_lod': 'Very Loud'}
# Problem line: this removes every 'u' in the string, not just the leading prefix
yelpdf['NoiseLevel'] = yelpdf['NoiseLevel'].replace('u','',regex=True)
yelpdf['NoiseLevel'] = yelpdf['NoiseLevel'].astype('category')
yelpdf['NoiseLevel'] = yelpdf['NoiseLevel'].cat.rename_categories(update_NoiseLevel)
yelpdf['NoiseLevel'] = yelpdf['NoiseLevel'].replace(update_NoiseLevel)

print(yelpdf['NoiseLevel'].value_counts())
'''

4. Finally

print(df['NoiseLevel_u_removed'].value_counts())

df['NoiseLevel_category'] = df['NoiseLevel_u_removed'].astype('category')

df['NoiseLevel_u_removed'] = df['NoiseLevel_category'].cat.rename_categories(update_NoiseLevel)
df['NoiseLevel_u_removed'][0:23]
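
For reference, here is a minimal, self-contained sketch of one way to strip only the leading u and the surrounding quotes before renaming the categories. It assumes the raw values look like u'average' or 'quiet'; the sample data below is made up for illustration and is not taken from the CSV. If the real values do keep their quotes, that would also explain why a mapping keyed on plain words such as 'average' never matches.

```python
import pandas as pd

# Hypothetical sample values; the real column in yelp_reviews.csv is assumed
# to hold strings such as u'average' (with the quotes as part of the text).
df = pd.DataFrame(
    {"NoiseLevel": ["u'average'", "u'loud'", "'quiet'", "u'very_loud'", None, "u'quiet'"]}
)

# Anchored pattern: an optional leading u, then the quoted text.
# Only the wrapper is removed, so the 'u' inside 'loud' and 'quiet' survives.
df["NoiseLevel_u_removed"] = df["NoiseLevel"].str.replace(
    r"^u?'(.*)'$", r"\1", regex=True
)

# The mapping keys can now use the correctly spelled category names.
update_NoiseLevel = {
    "average": "Average",
    "loud": "Loud",
    "quiet": "Quiet",
    "very_loud": "Very Loud",
}

df["NoiseLevel_category"] = (
    df["NoiseLevel_u_removed"].astype("category").cat.rename_categories(update_NoiseLevel)
)

print(df["NoiseLevel_category"].value_counts())
```

Because the pattern is anchored with ^ and $, values without the quoted wrapper are left as they are, and missing values stay missing.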

I hope this helps!
