I am new to Python and working through some exercises.
I have a column in my data called 'sequels' (for books) with numbers 1 through to 8.
I want to make a new column called 'sequelcategory' which relabels the numbers: 1 should be renamed to 'Original' and anything else to 'Sequel'. The exercise suggests that I use "pd.Series.cat.rename_categories" to do this.
The first hurdle I overcame was an error saying I needed categorical data (the column was initially int64). I fixed this with:
bookdata['sequels'] = bookdata['sequels'].astype('category')
That was all well and good. I then set about creating my new column:
bookdata["sequelcategory"] = bookdata["sequels"].cat.rename_categories({1: 'original', 2: 'sequel'})
The above works absolutely fine. The problem is that I also want numbers 3 to 8 relabelled 'sequel', and the below:
bookdata["sequelcategory"] = bookdata["sequels"].cat.rename_categories({1: 'original', 2: 'sequel', 3: 'sequel', 4: 'sequel', 5: 'sequel', 6: 'sequel', 7: 'sequel', 8: 'sequel'})
...returns the error: ValueError: Categorical categories must be unique.
Anyone have some advice on the above? I know there are probably 101 other ways to do this, but I am being told I need to do it with pandas.Series.cat.rename_categories and can't for the life of me work it out.
Any help would be greatly appreciated!
CodePudding user response:
We could map the numbers to labels first and then convert the result to category:
import pandas as pd
bookdata = pd.DataFrame({'book series': [1, 2, 3, 4, 5, 1, 1, 2, 6, 8]})
bookdata
###
   book series
0            1
1            2
2            3
3            4
4            5
5            1
6            1
7            2
8            6
9            8
map_dict = {1: 'original', 2: 'sequel', 3: 'sequel', 4: 'sequel', 5: 'sequel', 6: 'sequel', 7: 'sequel', 8: 'sequel'}
bookdata['sequelcategory'] = bookdata['book series'].map(map_dict).astype('category')
bookdata
###
   book series sequelcategory
0            1       original
1            2         sequel
2            3         sequel
3            4         sequel
4            5         sequel
5            1       original
6            1       original
7            2         sequel
8            6         sequel
9            8         sequel
bookdata.info()
###
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   book series     10 non-null     int64
 1   sequelcategory  10 non-null     category
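If the exercise really requires rename_categories, one possible workaround is to collapse the numbers down to two values before making the column categorical. The call in the question fails because it would leave eight categories sharing only two names, and category labels have to be unique. The sketch below reuses the sample frame from this answer; the name is_sequel is only illustrative, and this may not be the approach the exercise had in mind.

```python
import pandas as pd

# Same sample data as in the answer above.
bookdata = pd.DataFrame({'book series': [1, 2, 3, 4, 5, 1, 1, 2, 6, 8]})

# Collapse to two values first: False for the first book, True for any later one.
# (is_sequel is a hypothetical helper name, not part of the original code.)
is_sequel = (bookdata['book series'] > 1).astype('category')

# Only two categories remain (False and True), so the new labels are unique
# and rename_categories accepts the mapping.
bookdata['sequelcategory'] = is_sequel.cat.rename_categories(
    {False: 'original', True: 'sequel'}
)
```

Because the comparison is done on the integer column, there is no need to list every sequel number, and numbers above 8 would be labelled 'sequel' as well.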