I have a data frame as shown below.
df:
id fcb psg rma
1 4.0 2.9 4.1
2 3.5 4.2 3.5
3 2.5 4.5 4.0
4 4.1 4.6 4.2
I wanted to see all of the ratings in a single column, so I ran the code below.
df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')
However, I got the following result, and I want to remove the letter 'u' in front of every key in each dictionary.
id ratings
1 {u'fcb': 4.0, u'psg': 2.9, u'rma': 4.1}
2 {u'fcb': 3.5, u'psg': 4.2, u'rma': 3.5}
3 {u'fcb': 2.5, u'psg': 4.5, u'rma': 4.0}
4 {u'fcb': 4.1, u'psg': 4.6, u'rma': 4.2}
Expected output:
id ratings
1 {'fcb': 4.0, 'psg': 2.9, 'rma': 4.1}
2 {'fcb': 3.5, 'psg': 4.2, 'rma': 3.5}
3 {'fcb': 2.5, 'psg': 4.5, 'rma': 4.0}
4 {'fcb': 4.1, 'psg': 4.6, 'rma': 4.2}
I tried the code below to remove the leading unicode marker from each key.
df['rec_dict'] = df['rec_dict'].apply(lambda x: {str(k[1:]): v for k, v in x.items()})
CodePudding user response:
Try this:
df = df.set_index(['id']).apply(dict, axis=1).reset_index(name='ratings')
df['ratings'] = df['ratings'].apply(lambda x: {str(k): v for k, v in x.items()})
The first line is the same as your original code. The second line uses apply() with a lambda to rebuild each dictionary in the 'ratings' column, passing every key through str(). The 'u' is not a character inside the key, so slicing with k[1:] would cut off the first real letter ('fcb' becomes 'cb'), and calling float() on the result would raise a ValueError. On Python 2, str(k) converts the unicode key to a plain string, so the printed dictionary no longer shows the u prefix; on Python 3 the keys are already str and the line changes nothing. The rebuilt dictionary is assigned back to the same row of the 'ratings' column.
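As a runnable sketch, the snippet below rebuilds the frame from the question and produces a 'ratings' column whose keys are plain str. Note that it uses to_dict(orient="records") rather than apply(dict, axis=1); that is a substitution on my part, not what the question used, and the names wide, records and result are only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "fcb": [4.0, 3.5, 2.5, 4.1],
    "psg": [2.9, 4.2, 4.5, 4.6],
    "rma": [4.1, 3.5, 4.0, 4.2],
})

# Move id out of the way so only the rating columns end up in the dicts.
wide = df.set_index("id")

# One dict per row, keyed by column name (assumption: a swapped-in
# alternative to apply(dict, axis=1)).
records = wide.to_dict(orient="records")

# Force every key to str. The values are left untouched.
ratings = [{str(k): v for k, v in rec.items()} for rec in records]

result = pd.DataFrame({"id": wide.index.tolist(), "ratings": ratings})
print(result)
```

On Python 3 the str(k) step is a no-op, so it is only worth keeping if the code still has to run on Python 2.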
CodePudding user response:
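u'string' is how Python 2 displays a unicode string. The u is part of the printed representation only, not of the key itself, so 'fcb' and u'fcb' look up the same entry. In Python 3 every str is Unicode by default and the prefix is no longer shown. So you can safely ignore the notation.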
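A quick check in plain Python, offered as a small sketch, shows that the prefix is notation rather than content. The variable names here are just for illustration.

```python
# The u prefix is still accepted as a literal in Python 3.3+.
key = u'fcb'

print(len(key))        # 3, the u is not counted
print(key[0])          # f, the first character is the real letter
print(key == 'fcb')    # True

# Slicing off the "prefix" therefore damages the key.
print(key[1:])         # cb

# Lookups work with or without the prefix.
ratings = {u'fcb': 4.0, u'psg': 2.9, u'rma': 4.1}
print(ratings['fcb'])  # 4.0
```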