I am using the pandas and numpy libraries to calculate the Pearson correlation of two simple lists. The code below produces the correlation matrix:
import numpy as np
import pandas as pd
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])
x, y = pd.Series(x), pd.Series(y)
xy = pd.DataFrame({'dist-values': x, 'uptime-values': y})
matrix = xy.corr(method="pearson")
After calling the .unstack() and .to_dict() methods on the output, I get a dictionary in the following format:
result = matrix.unstack().to_dict()
# {('dist-values', 'dist-values'): 1.0,
# ('dist-values', 'uptime-values'): 0.7586402890911869,
# ('uptime-values', 'dist-values'): 0.7586402890911869,
# ('uptime-values', 'uptime-values'): 1.0}
But I need to convert it to a list of dictionaries, and the output should be like this:
#[ {'f1': 'dist-values', 'f2': 'dist-values', 'value': '1.0'},
# {'f1': 'dist-values', 'f2': 'uptime-values', 'value': '0.7586402890911869'},
# {'f1': 'uptime-values', 'f2': 'dist-values', 'value': '0.7586402890911869'},
# {'f1': 'uptime-values', 'f2': 'uptime-values', 'value': '1.0'}
# ]
What is the best and most efficient way to do it?
CodePudding user response:
What about:
result = (matrix.unstack().rename_axis(['f1', 'f2'])
          .reset_index(name='value').to_dict('records')
          )
Output:
[{'f1': 'dist-values', 'f2': 'dist-values', 'value': 1.0},
{'f1': 'dist-values', 'f2': 'uptime-values', 'value': 0.7586402890911869},
{'f1': 'uptime-values', 'f2': 'dist-values', 'value': 0.7586402890911869},
{'f1': 'uptime-values', 'f2': 'uptime-values', 'value': 1.0}]
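For reference, here is a self-contained sketch that runs the chain above end to end on the data from the question. The names records, records_plain and records_str are only illustrative and are not from the original posts. It also shows two variations that may or may not be what you want: a plain list comprehension over the unstacked dictionary, and a string conversion, on the assumption that the quoted values in the desired output are intentional.

```python
import numpy as np
import pandas as pd

# Data from the question
x = pd.Series(np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
y = pd.Series(np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48]))
matrix = pd.DataFrame({'dist-values': x, 'uptime-values': y}).corr(method='pearson')

# pandas route: name the two index levels, turn the Series into a
# three-column DataFrame, then export one dict per row
records = (
    matrix.unstack()
    .rename_axis(['f1', 'f2'])
    .reset_index(name='value')
    .to_dict(orient='records')
)

# Plain-Python route: iterate over the (f1, f2) -> value dictionary directly
records_plain = [
    {'f1': f1, 'f2': f2, 'value': value}
    for (f1, f2), value in matrix.unstack().to_dict().items()
]

# Assumption: the question shows 'value' as a string, so convert if that is required
records_str = [{**rec, 'value': str(rec['value'])} for rec in records]

print(records)
```

For a 2x2 matrix the two routes should be indistinguishable in speed; the pandas chain is mostly a matter of readability, and the comprehension avoids building an intermediate DataFrame.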