How can I change the output here?
{0: {'count': 2.0, 'mean': 140950000.0, 'std': 0.0, 'min': 140950000.0, '25%': 140950000.0, '50%': 140950000.0, '75%': 140950000.0, 'max': 140950000.0}}
to
{'count': 2.0, 'mean': 140950000.0, 'std': 0.0, 'min': 140950000.0, '25%': 140950000.0, '50%': 140950000.0, '75%': 140950000.0, 'max': 140950000.0}
Is that possible directly in this line of code? v is a list of integers/floats for which I want descriptive statistics (it could contain more than 500 elements):
np_arr = pd.DataFrame(np.around(np.array(v), 5)).describe().to_dict()
CodePudding user response:
Could you clarify your question?
import pandas as pd
import numpy as np
test = pd.DataFrame()
test["A"] = np.around(np.random.uniform(low=0.1, high=10, size=(500,)))
test.A.describe().to_dict()
already returns:
{'count': 500.0,
'mean': etc...}
EDIT: I suppose you are calling describe on a DataFrame rather than on a single column (a Series). In that case the result is keyed by column label, so with two columns you would obtain:
{'col1': {'count': 500.0,
'mean': 5.114,
'std': 2.9192547158808178,
'min': 0.0,
'25%': 3.0,
'50%': 5.0,
'75%': 8.0,
'max': 10.0},
'col2': {'count': 500.0,
'mean': 4.962,
'std': 2.917117609953462,
'min': 0.0,
'25%': 2.0,
'50%': 5.0,
'75%': 7.0,
'max': 10.0}}
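If all you care about is the statistical description, without reference to which column it describes, you could select the values only. Note that .values() on a dict returns a view, so wrap it in list() to get an actual list, i.e.
np_arr = list(pd.DataFrame(np.around(np.array(v), 5)).describe().to_dict().values())
which would return one dictionary per column (two columns in this example):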
[
{'count': 500.0, 'mean': 5.254, 'std': 2.9103994312979466, 'min': 0.0, '25%': 3.0, '50%': 5.0, '75%': 8.0, 'max': 10.0},
{'count': 500.0, 'mean': 4.882, 'std': 2.8678432750239664, 'min': 0.0, '25%': 2.0, '50%': 5.0, '75%': 7.0, 'max': 10.0}
]
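For the single-list case in the question, a minimal sketch of two ways to get the flat dictionary in one line might look like the following. The contents of `v` here are a made-up stand-in chosen to match the numbers in the question, and the names `rounded`, `flat`, and `flat_from_df` are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's list `v` (assumption, not from the original post)
v = [140950000, 140950000]

rounded = np.around(np.array(v), 5)

# Option 1: build a Series instead of a DataFrame.
# Series.describe() returns a Series, so to_dict() is flat.
flat = pd.Series(rounded).describe().to_dict()

# Option 2: keep the DataFrame and select its only column.
# A DataFrame built from a 1-D array gets the default column label 0.
flat_from_df = pd.DataFrame(rounded).describe()[0].to_dict()

print(flat)
```

Both options avoid the outer {0: ...} level. Option 1 is probably the simpler choice when v is always a flat list, since no column has to be selected afterwards.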