How to remove u' when running pandas' df.columns function

I have a data frame as shown below.

df:

id         fcb         psg         rma
1          4.0         2.9         4.1
2          3.5         4.2         3.5
3          2.5         4.5         4.0
4          4.1         4.6         4.2

I wanted to see all of the ratings in a single column, so I ran the code below.

df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')

However, I obtained the following results, and I want to remove the u prefix from all of the keys in each dictionary.

id  ratings
1   {u'fcb': 4.0, u'psg': 2.9, u'rma': 4.1}
2   {u'fcb': 3.5, u'psg': 4.2, u'rma': 3.5}
3   {u'fcb': 2.5, u'psg': 4.5, u'rma': 4.0}
4   {u'fcb': 4.1, u'psg': 4.6, u'rma': 4.2}

Expected output:

id  ratings
1   {'fcb': 4.0, 'psg': 2.9, 'rma': 4.1}
2   {'fcb': 3.5, 'psg': 4.2, 'rma': 3.5}
3   {'fcb': 2.5, 'psg': 4.5, 'rma': 4.0}
4   {'fcb': 4.1, 'psg': 4.6, 'rma': 4.2}

I tried the code below to strip the leading u from each key.

df['rec_dict'] = df['rec_dict'].apply(lambda x: {str(k[1:]): v for k, v in x.items()})

CodePudding user response:

Try this:

df = df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')
df['ratings'] = df['ratings'].apply(lambda x: {str(k): v for k, v in x.items()})

The first line is the same as your original code. The second line uses apply() with a lambda to rebuild each dictionary in the 'ratings' column, passing every key through str(). The u is not a character in the key; it is only how Python 2 displays a unicode string, so slicing with k[1:] would cut off the first real letter of the key instead. In Python 2, str() turns an ASCII-only unicode key into a plain string, which prints without the prefix. The rebuilt dictionary is then assigned back to the corresponding row in the 'ratings' column.
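
To make the difference between converting and slicing concrete, here is a minimal sketch in plain Python. The helper name normalize_keys and the sample row are made up for illustration and are not part of pandas; the helper could be passed to df['ratings'].apply() in place of the lambda.

```python
def normalize_keys(record):
    """Return a copy of `record` with every key passed through str().

    Hypothetical helper for illustration only. Assumes ASCII-only keys,
    which is what makes str() safe on Python 2 unicode strings.
    """
    return {str(key): value for key, value in record.items()}


# One row of the 'ratings' column, written with the u prefix.
row = {u'fcb': 4.0, u'psg': 2.9, u'rma': 4.1}

# Converting keeps the full key text.
cleaned = normalize_keys(row)

# Slicing removes the first real character, because the u is not in the string.
sliced = {key[1:]: value for key, value in row.items()}

print(sorted(cleaned))  # ['fcb', 'psg', 'rma']
print(sorted(sliced))   # ['cb', 'ma', 'sg']
```

The slicing version silently produces the wrong keys rather than raising an error, which is why it is easy to miss.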

CodePudding user response:

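u'string' denotes a unicode string in Python 2. In Python 3 every string is Unicode by default, so the prefix no longer appears when a string is displayed. Either way, the u is notation and not part of the key itself, so you can safely ignore it.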
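
A quick way to check this yourself is the small sketch below, which assumes a Python 3 interpreter. The variable names are only illustrative.

```python
import sys

key = u'fcb'  # the u prefix is still accepted as a literal in Python 3

same_text = (key == 'fcb')  # the prefix does not change the string's content
length = len(key)           # three characters: f, c, b
shown = repr(key)           # how the key is displayed inside a dict

print(sys.version_info[0], same_text, length, shown)
# On Python 3, shown is 'fcb' with no u prefix.
```

On Python 2 the same comparison and length check hold, and only the displayed form differs.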