Remove index from .to_dict in a described pandas DataFrame

Time:10-31

How can I change the output here?

{0: {'count': 2.0, 'mean': 140950000.0, 'std': 0.0, 'min': 140950000.0, '25%': 140950000.0, '50%': 140950000.0, '75%': 140950000.0, 'max': 140950000.0}}

to

{'count': 2.0, 'mean': 140950000.0, 'std': 0.0, 'min': 140950000.0, '25%': 140950000.0, '50%': 140950000.0, '75%': 140950000.0, 'max': 140950000.0}

Is that directly possible in this line of code? v is a list of integers/floats to compute descriptive statistics for (it could contain more than 500 elements):

np_arr = pd.DataFrame(np.around(np.array(v), 5)).describe().to_dict()

CodePudding user response:

Could you clarify your question?

import pandas as pd
import numpy as np

test = pd.DataFrame()
test["A"] = np.around(np.random.uniform(low=0.1, high=10, size=(500,)))
test.A.describe().to_dict()

already returns:

{'count': 500.0,
 'mean': etc...}
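
To make the difference concrete, here is a minimal sketch contrasting the two call paths. The list v below is a hypothetical stand-in for the asker's data, chosen only to match the output shown in the question:

```python
import numpy as np
import pandas as pd

# Hypothetical input; the real v is not shown in the question.
v = [140950000, 140950000]
rounded = np.around(np.array(v), 5)

# Series.describe() returns a Series, so to_dict() is a flat mapping
# from statistic name to value.
flat = pd.Series(rounded).describe().to_dict()

# DataFrame.describe() returns a DataFrame, so to_dict() nests the
# statistics under each column label. An unnamed column is labelled 0.
nested = pd.DataFrame(rounded).describe().to_dict()

print(list(flat))    # statistic names
print(list(nested))  # column labels
```

So the outer key 0 in the question is the default column label of the DataFrame, not an extra index added by to_dict.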

EDIT: I suppose you are calling describe on a DataFrame, possibly with multiple columns. In that case you would obtain a dictionary keyed by column:

{'col1': {'count': 500.0,
  'mean': 5.114,
  'std': 2.9192547158808178,
  'min': 0.0,
  '25%': 3.0,
  '50%': 5.0,
  '75%': 8.0,
  'max': 10.0},
 'col2': {'count': 500.0,
  'mean': 4.962,
  'std': 2.917117609953462,
  'min': 0.0,
  '25%': 2.0,
  '50%': 5.0,
  '75%': 7.0,
  'max': 10.0}}

If all you care about is the statistical description, without reference to which column it describes, you could select the values only, i.e.

np_arr = pd.DataFrame(np.around(np.array(v), 5)).describe().to_dict().values()

which would return a dict_values view holding one dictionary per column. Wrapped in list(...), it looks like:

[
{'count': 500.0, 'mean': 5.254, 'std': 2.9103994312979466, 'min': 0.0, '25%': 3.0, '50%': 5.0, '75%': 8.0, 'max': 10.0}, 

{'count': 500.0, 'mean': 4.882, 'std': 2.8678432750239664, 'min': 0.0, '25%': 2.0, '50%': 5.0, '75%': 7.0, 'max': 10.0}
]
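
For the single-column case in the question, a sketch of two ways to get the flat dictionary directly. The sample v is hypothetical, and selecting the first column by position with iloc is one possible choice, not the only one:

```python
import numpy as np
import pandas as pd

# Hypothetical input standing in for the asker's list.
v = [1.0, 2.5, 4.0]

described = pd.DataFrame(np.around(np.array(v), 5)).describe()

# Option 1: pick the first (here, only) column by position before
# converting, which yields a flat {statistic: value} dictionary.
stats = described.iloc[:, 0].to_dict()

# Option 2: convert everything, then drop the column keys. dict.values()
# is a view, so list() is needed to get an actual list of dictionaries.
per_column = list(described.to_dict().values())

print(stats)
print(per_column[0] == stats)
```

Option 1 keeps everything on one line if substituted into the original statement, while Option 2 is the more natural fit when the DataFrame has several columns.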