Python - How to count the frequency of each unique key from a column containing a dictionary of dicts

I have a very large dataframe containing a column called 'time_columns'. Each cell of the column contains a dictionary of dictionaries, for example:

time_columns
{'Yesterday': {'text': 'Yesterday', 'type': 'DATE', 'value': '2022-04-15'}}
{'Yesterday': {'text': 'Yesterday', 'type': 'DATE', 'value': '2022-04-16'}, 'Thursday': {'text': 'Thursday', 'type': 'DATE', 'value': '2022-04-14'}}

How can I efficiently get a table containing the frequency count of the unique keys of the main dictionary like below? (In a table because I want to save the result to a CSV.)

text	count
Yesterday	2
Thursday	1

CodePudding user response：

Try:

df = (
    df["time_columns"]
    .explode()  # exploding a dict yields its top-level keys
    .value_counts()
    .rename_axis("text")  # name the index on any pandas version
    .reset_index(name="count")
)
print(df)

Prints:

        text  count
0  Yesterday      2
1   Thursday      1
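
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the two sample rows from the question and that the goal is a CSV with the columns text and count. The name counts is only illustrative, and rename_axis is used here so that the key column is called text whichever pandas version names the index.

```python
import pandas as pd

# Sample rows copied from the question; each cell is a dict of dicts.
df = pd.DataFrame(
    {
        "time_columns": [
            {"Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-15"}},
            {
                "Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-16"},
                "Thursday": {"text": "Thursday", "type": "DATE", "value": "2022-04-14"},
            },
        ]
    }
)

# Exploding a dict yields its top-level keys, one per output row.
counts = (
    df["time_columns"]
    .explode()
    .value_counts()
    .rename_axis("text")        # name the index, whatever pandas called it
    .reset_index(name="count")  # turn the counts into a regular column
)

# With no path, to_csv returns the CSV text; pass a file path
# (for example a hypothetical "counts.csv") to write to disk instead.
csv_text = counts.to_csv(index=False)
print(csv_text)
```

Keeping the result in a separate variable also leaves the original df untouched, which is handy if you still need the raw column afterwards.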

CodePudding user response：

Given the input data, could you try this?

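import pandas as pd
# One row per inner dict; this counts the 'text' field, which matches the outer key in the sample data.
tmp = pd.concat([pd.DataFrame.from_dict(v, orient='index') for v in df['time_columns']])
tmp['text'].value_counts()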

CodePudding user response：

The easy way would be to just iterate through the column and save the results to a new dictionary, something like:

res = {}
for cell in df['time_columns']:
    for key in cell:
        if key not in res:
            res[key] = 1
        else:
            res[key] += 1

If you know the keys in advance, you can initialize the dictionary with the keys set to zero and replace the if statement inside the loop with just an increment.

keys = ['Yesterday', 'Thursday', 'etc.']
res = {key: 0 for key in keys}
for cell in df['time_columns']:
    for key in cell:
        res[key] += 1
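
The same tally can also be sketched with collections.Counter from the standard library, which removes the need to check or pre-fill keys. This is an alternative to the plain dictionary, not a change to the approach. The rows list below is a stand-in for df['time_columns'] built from the question's sample data, and the CSV is written to an in-memory buffer only so the example runs without touching the disk.

```python
import csv
import io
from collections import Counter

# Stand-in for df['time_columns']: any iterable of dicts works the same way.
rows = [
    {"Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-15"}},
    {
        "Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-16"},
        "Thursday": {"text": "Thursday", "type": "DATE", "value": "2022-04-14"},
    },
]

res = Counter()
for cell in rows:
    # Pass the keys view, not the dict itself: Counter.update treats a
    # mapping argument as key -> count pairs instead of items to tally.
    res.update(cell.keys())

# Write the tallies as text,count rows, most frequent first.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["text", "count"])
writer.writerows(res.most_common())
print(buffer.getvalue())
```

To write a real file, open it with open(path, "w", newline="") and hand that to csv.writer in place of the buffer.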