Converting all pandas column: row to key:value pair json

Time:08-17

I am trying to add a new column at the end of my pandas dataframe that contains the values of the other cells in each row as key:value pairs (JSON). I have tried the following:

import json

df["json_formatted"] = df.apply(
    lambda row: json.dumps(row.to_dict(), ensure_ascii=False), axis=1
)

It creates the json_formatted column successfully with all the required data, but the problem is that it also adds json_formatted itself as an extra key. I don't want that. I want the JSON data to contain only the information from the original df columns. How can I do that?

Note: I set ensure_ascii=False because the column names contain Japanese characters.
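
For reference, a small sketch of what ensure_ascii changes. The sample dict is hypothetical and only stands in for one row of the dataframe:

```python
import json

# Hypothetical row with Japanese keys, for illustration only.
row = {"名前": "太郎", "年齢": 30}

escaped = json.dumps(row)                       # default: ensure_ascii=True
readable = json.dumps(row, ensure_ascii=False)  # keeps non-ASCII characters as-is

print(escaped)   # keys and values appear as \uXXXX escapes
print(readable)  # keys and values appear as Japanese text
```

Both strings decode back to the same dict with json.loads; the flag only affects how the text is written out.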

CodePudding user response:

Create a new variable holding the created column and add it afterwards:

json_formatted = df.apply(lambda row: json.dumps(row.to_dict(), ensure_ascii=False), axis=1)
df['json_formatted'] = json_formatted

CodePudding user response:

  1. This behaviour shouldn't happen on a first run, but it can be caused by running the code more than once: after the first run the json_formatted column exists, so running df.apply again on the same dataframe includes it in each row's dict.

  2. You can avoid this by making your columns explicit: df[['col1', 'col2']].apply()

  3. Apply is an expensive operation in pandas, and if performance matters it is better to avoid it. An alternative way to do this is:

    df["json_formatted"] = [json.dumps(s, ensure_ascii=False) for s in df.T.to_dict().values()]
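
As a rough sketch of points 2 and 3 together: the data, the column names and the helper add_json_column below are made up for illustration and are not from the question. to_dict(orient="records") is used here as another way to get one dict per row without apply.

```python
import json
import pandas as pd

# Hypothetical data; the column names are made up for illustration.
df = pd.DataFrame({"名前": ["太郎", "花子"], "都市": ["東京", "大阪"]})

# Fix the list of source columns once, before the new column exists.
source_cols = ["名前", "都市"]

def add_json_column(frame):
    # Only the selected columns are serialised, so re-running is harmless.
    frame["json_formatted"] = frame[source_cols].apply(
        lambda row: json.dumps(row.to_dict(), ensure_ascii=False), axis=1
    )
    return frame

add_json_column(df)
add_json_column(df)  # second run: json_formatted still has only the two keys

# Same strings without apply, building one dict per row.
records = df[source_cols].to_dict(orient="records")
no_apply = [json.dumps(r, ensure_ascii=False) for r in records]
```

Because source_cols never includes json_formatted, running the cell again does not nest the previous JSON string inside the new one.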
    