Insert list json objects into row based on other column values in dataframe-CodePudding

I have dataframe with the following columns:

ID A1 B1 C1 A2 B2 C2 A3 B3 C4
AA  1  3  6           4  0  6 
BB  5  5  4  6  7  9 
CC  5  5  5

I want to create a new column called Z that takes each row, groups them into a JSON list of records, and renames the column as their key. After the JSON column is constructed, I want to drop all the columns and keep Z and ID only.

Here is the output desired:

ID Z
AA [{"New Column Name A":1, "New Column Name B":3,"New Column Name C":6},{"New Column Name A":4, "New Column Name B":0,"New Column Name C":6}]
BB [{"New Column Name A":5, "New Column Name B":5,"New Column Name C":4},{"New Column Name A":6, "New Column Name B":7,"New Column Name C":9}]
CC [{"New Column Name A":5, "New Column Name B":5,"New Column Name C":5}]

I've tried to group the fields together, apply a lambda function, and use to_dic('records'), but I'm not getting the output I want. I'm not even sure if my approach is correct. Any help is greatly appreciated.

CodePudding user response：

Have you tried to go line by line?? I am not very good with pandas and python. But I have me this code. Hope it works for you.

toAdd = []
for row in dataset.values:
    toAddLine = {}
    i = 0
    for data in row:
        
        if data is not None:
            toAddLine["New Column Name " dataset.columns[i]] = data
        i = i  1 
            
    toAdd.append(toAddLine)
dataset['Z'] = toAdd
dataset['Z']

CodePudding user response：

# create a columns name map for chang related column
columns = dataset.columns
columns_map = {}
for i in columns:
    columns_map[i] = f"new {i}"

def change_row_to_json(row):
    new_dict = {}
    for index, value in enumerate(row):
        new_dict[columns_map[columns[index]]] = value
    return json.dumps(new_dict, indent = 4)

dataset.loc[:,'Z'] = dataset.apply(change_row_to_json, axis=1)
dataset= dataset[["ID", "Z"]]