I have a dataset comprising of categorial data as the one shown below:
How do I create a nested dictionary using this dataframe in such a way that the "key" will be the column, and the nested "key":"value" will be the "category":"number of times said categories occur"?
CodePudding user response:
You can use collections.Counter
to count the number of occurrences of each category. When fed an iterable (such as a DataFrame column), this will return a dict-like object of the type "category": count, like your inner dict.
To get this for each one of the columns, you could iterate over the columns, like so:
from collections import Counter
all_counts = {}
for column in df.columns:
all_counts[column] = Counter(df[column])
CodePudding user response:
Try as follows:
import pandas as pd
# sample
data = {'gender': ['Male','Female','Male'],
'heart_disease':[0,1,1]}
df = pd.DataFrame(data)
a_dict = {}
for x in df.columns:
a_dict[x] = df[x].value_counts().to_dict()
print(a_dict)
{'gender': {'Male': 2, 'Female': 1}, 'heart_disease': {1: 2, 0: 1}}