How to convert every dictionary in a list to a nested dictionary in Python?

I am using the pandas and NumPy libraries to calculate the Pearson correlation of three simple lists. The code below outputs the correlation matrix:

import numpy as np
import pandas as pd

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])
z = np.array([5, 3, 2, 1, 0, -2, -8, -11, -15, -16])

x, y, z = pd.Series(x), pd.Series(y), pd.Series(z)

xyz = pd.DataFrame({'dist-values': x, 'uptime-values': y, 'speed-values': z})


matrix = xyz.corr(method="pearson")

By applying .unstack() and .to_dict() to the output, and based on the answer in this post, we can convert the matrix to a list of dictionaries in the format below:

result = (matrix.unstack().rename_axis(['f1', 'f2'])
                          .reset_index(name='value').to_dict('records')
         )
# the output format after printing
[{'f1': 'dist-values', 'f2': 'dist-values', 'value': 1.0}, 
 {'f1': 'dist-values', 'f2': 'uptime-values', 'value': 0.7586402890911869}, 
 {'f1': 'dist-values', 'f2': 'speed-values', 'value': -0.9680724198337364}, 

 {'f1': 'uptime-values', 'f2': 'dist-values', 'value': 0.7586402890911869}, 
 {'f1': 'uptime-values', 'f2': 'uptime-values', 'value': 1.0}, 
 {'f1': 'uptime-values', 'f2': 'speed-values', 'value': -0.8340792243486527}, 

 {'f1': 'speed-values', 'f2': 'dist-values', 'value': -0.9680724198337364}, 
 {'f1': 'speed-values', 'f2': 'uptime-values', 'value': -0.8340792243486527}, 
 {'f1': 'speed-values', 'f2': 'speed-values', 'value': 1.0}]

But I need a more complicated format, and the output should be like this:

[ 
 {'name': 'dist-values', 'data': [{'x': 'dist-values', 'y': 1.0}, {'x': 'uptime-values', 'y': 0.7586402890911869}, {'x': 'speed-values', 'y': -0.9680724198337364}]}, 
 {'name': 'uptime-values', 'data': [{'x': 'dist-values', 'y': 0.7586402890911869}, {'x': 'uptime-values', 'y': 1.0}, {'x': 'speed-values', 'y': -0.8340792243486527}]}, 
 {'name': 'speed-values', 'data': [{'x': 'dist-values', 'y': -0.9680724198337364}, {'x': 'uptime-values', 'y': -0.8340792243486527}, {'x': 'speed-values', 'y': 1.0}]}, 
]

There are only three features in this code, so the correlation matrix has only 9 elements. How can we implement this conversion for a bigger matrix? Is there an efficient way to do it? Thanks.

CodePudding user response:

You can use a list comprehension over matrix.iterrows() to obtain your output:

out = [
    {"name": i, "data": [{"x": c, "y": row[c]} for c in row.index]}
    for i, row in matrix.iterrows()
]
print(out)

Prints:

[
    {
        "name": "dist-values",
        "data": [
            {"x": "dist-values", "y": 1.0},
            {"x": "uptime-values", "y": 0.7586402890911869},
            {"x": "speed-values", "y": -0.9680724198337364},
        ],
    },
    {
        "name": "uptime-values",
        "data": [
            {"x": "dist-values", "y": 0.7586402890911869},
            {"x": "uptime-values", "y": 1.0},
            {"x": "speed-values", "y": -0.8340792243486527},
        ],
    },
    {
        "name": "speed-values",
        "data": [
            {"x": "dist-values", "y": -0.9680724198337364},
            {"x": "uptime-values", "y": -0.8340792243486527},
            {"x": "speed-values", "y": 1.0},
        ],
    },
]
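
Since the question asks about bigger matrices, here is a variant of the same idea that goes through matrix.to_dict("index") instead of iterrows(). This is only a sketch: it rebuilds the question's sample data so it runs on its own, and whether it is actually faster than iterrows() on a large matrix is an assumption that has not been benchmarked here.

```python
import pandas as pd

# Same sample data as in the question.
xyz = pd.DataFrame({
    "dist-values": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "uptime-values": [2, 1, 4, 5, 8, 12, 18, 25, 96, 48],
    "speed-values": [5, 3, 2, 1, 0, -2, -8, -11, -15, -16],
})
matrix = xyz.corr(method="pearson")

# to_dict("index") yields {row_label: {column_label: value}},
# so no Series object has to be built for each row.
nested = matrix.to_dict("index")

out = [
    {"name": name, "data": [{"x": col, "y": val} for col, val in row.items()]}
    for name, row in nested.items()
]
print(out)
```

The result has the same shape as the list printed above: one dictionary per row of the matrix, each with a "data" list holding one entry per column.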

CodePudding user response:

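The first answer is much better, but here is an alternative that starts from the list of dictionaries and groups it with a defaultdict: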

from collections import defaultdict

lst1 = [
    {'f1': 'dist-values', 'f2': 'dist-values', 'value': 1.0}, 
    {'f1': 'dist-values', 'f2': 'uptime-values', 'value': 0.7586402890911869}, 
    {'f1': 'dist-values', 'f2': 'speed-values', 'value': -0.9680724198337364}, 

    {'f1': 'uptime-values', 'f2': 'dist-values', 'value': 0.7586402890911869}, 
    {'f1': 'uptime-values', 'f2': 'uptime-values', 'value': 1.0}, 
    {'f1': 'uptime-values', 'f2': 'speed-values', 'value': -0.8340792243486527}, 

    {'f1': 'speed-values', 'f2': 'dist-values', 'value': -0.9680724198337364}, 
    {'f1': 'speed-values', 'f2': 'uptime-values', 'value': -0.8340792243486527}, 
    {'f1': 'speed-values', 'f2': 'speed-values', 'value': 1.0}
]

dct2 = defaultdict(list)

for row in lst1:
    dct2[row['f1']].append({'x':row['f2'], 'y':row['value']})

lst2 = [{'name':k, 'data':v} for k, v in dct2.items()]

print(lst2)

Output:

[
    {'name': 'dist-values', 'data': [
        {'x': 'dist-values', 'y': 1.0},
        {'x': 'uptime-values', 'y': 0.7586402890911869},
        {'x': 'speed-values', 'y': -0.9680724198337364}]
    },
    {'name': 'uptime-values', 'data': [
        {'x': 'dist-values', 'y': 0.7586402890911869}, 
        {'x': 'uptime-values', 'y': 1.0}, 
        {'x': 'speed-values', 'y': -0.8340792243486527}]
    },
    {'name': 'speed-values', 'data': [
        {'x': 'dist-values', 'y': -0.9680724198337364}, 
        {'x': 'uptime-values', 'y': -0.8340792243486527}, 
        {'x': 'speed-values', 'y': 1.0}]
    }
]
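
The same grouping can be wrapped in a small function so it accepts any list of records (for example, the result list from the question) rather than a hard-coded one. The function name group_records and its keyword parameters are hypothetical, chosen only for this illustration; the sample input is a shortened copy of the records above.

```python
from collections import defaultdict

def group_records(records, key="f1", x="f2", y="value"):
    """Group flat records by `key` into [{'name': ..., 'data': [...]}, ...]."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[key]].append({"x": rec[x], "y": rec[y]})
    # Dicts keep insertion order (Python 3.7+), so the groups come out
    # in the order in which each key first appears in `records`.
    return [{"name": name, "data": data} for name, data in grouped.items()]

# Shortened sample input taken from the records in the question.
records = [
    {"f1": "dist-values", "f2": "dist-values", "value": 1.0},
    {"f1": "dist-values", "f2": "uptime-values", "value": 0.7586402890911869},
    {"f1": "uptime-values", "f2": "dist-values", "value": 0.7586402890911869},
    {"f1": "uptime-values", "f2": "uptime-values", "value": 1.0},
]
nested = group_records(records)
print(nested)
```

The function makes a single pass over the input, so the work grows linearly with the number of records, and the records for one name do not need to be adjacent in the list.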