How to remove u' when running pandas' df.columns function

Time:01-27

I have a data frame as shown below.

df:

id         fcb         psg         rma
1          4.0         2.9         4.1
2          3.5         4.2         3.5
3          2.5         4.5         4.0
4          4.1         4.6         4.2 

I wanted to see all of the ratings in a single column, so I ran the code below.

df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')

However, I got the following result. I want to remove the u prefix from all of the keys in each dictionary.

id  ratings
1   {u'fcb': 4.0, u'psg': 2.9, u'rma': 4.1}
2   {u'fcb': 3.5, u'psg': 4.2, u'rma': 3.5}
3   {u'fcb': 2.5, u'psg': 4.5, u'rma': 4.0}
4   {u'fcb': 4.1, u'psg': 4.6, u'rma': 4.2}

Expected output:

id  ratings
1   {'fcb': 4.0, 'psg': 2.9, 'rma': 4.1}
2   {'fcb': 3.5, 'psg': 4.2, 'rma': 3.5}
3   {'fcb': 2.5, 'psg': 4.5, 'rma': 4.0}
4   {'fcb': 4.1, 'psg': 4.6, 'rma': 4.2}

I tried the code below to strip the leading u from each key.

df['rec_dict'] = df['rec_dict'].apply(lambda x: {str(k[1:]): v for k, v in x.items()})

CodePudding user response:

Try this

df = df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')
df['ratings'] = df['ratings'].apply(lambda x: {str(k): v for k, v in x.items()})

The first line is the same as your original code. The second line uses apply() with a lambda to rebuild each dictionary in the 'ratings' column, converting every key with str(). In Python 2 this turns unicode keys into byte strings, which are displayed without the u prefix (it raises UnicodeEncodeError if a key contains non-ASCII characters). Do not slice the key with k[1:]: the u is not a character of the key, so slicing would drop the first real letter and 'fcb' would become 'cb'. In Python 3 the keys are already str, so the second line changes nothing.
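As a rough check, here is a minimal sketch that rebuilds the sample data from the question and inspects the keys. It assumes Python 3 with pandas installed; the variable names `ratings_df` and `first_row` are illustrative and not from the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "fcb": [4.0, 3.5, 2.5, 4.1],
    "psg": [2.9, 4.2, 4.5, 4.6],
    "rma": [4.1, 3.5, 4.0, 4.2],
})

# Same transformation as in the question: one dict of ratings per id.
ratings_df = df.set_index(["id"]).apply(dict, axis=1).reset_index(name="ratings")

# Inspect the first dictionary: the keys are plain str objects,
# and none of them contains a literal leading "u" character.
first_row = ratings_df.loc[0, "ratings"]
key_types = {type(k) for k in first_row}
print(sorted(first_row), key_types)
```

If the keys print as 'fcb', 'psg' and 'rma' with type str, there is nothing to strip, and any slicing would only damage the column names.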

CodePudding user response:

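u'string' denotes a unicode string literal in Python 2. The u is part of how the string is displayed (its repr), not part of its contents. In Python 3 all strings are Unicode by default and the prefix is no longer shown. So you can safely ignore the notation.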
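A short sketch of that point, assuming Python 3 (where the u prefix is still accepted in source code but has no effect). The variable names are illustrative only.

```python
# Assumes Python 3: u'fcb' and 'fcb' are the same str object type and value.
key = u'fcb'

# The prefixed and unprefixed literals compare equal.
same = (key == 'fcb')

# repr() is the form shown when a dict is printed; Python 3 shows no u.
shown = repr(key)

# Slicing removes a real character, because the u was never in the string.
sliced = key[1:]

print(same, shown, len(key), sliced)  # True 'fcb' 3 cb
```

The length of 3 confirms that the string holds only the letters f, c and b, which is why stripping the first character is the wrong fix.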
