I have a CSV file with the following layout/content:
datetime,key_value_column
2022-02-02 00:00:01,"{'key1':2,'key2':7,'key3':100}"
2022-02-02 00:00:10,"{'key5':2,'key2':3,'key3':1,'key4':5}"
The datetime column is unique.
I would like a Python script that sums the values in the dictionary for each row, i.e.:
2022-02-02 00:00:01,109
2022-02-02 00:00:10,11
I can read the file:
```
import pandas as pd

df = pd.read_csv('test.csv')
print(df)
```
but I have no idea how to sum the dictionary values across the data frame.
I worked out how to sum the values of a single dictionary (example below), but I am
hoping to get some hints on how to process the dataframe:
```
adict = {'key1':2,'key2':7,'key3':100}
sumvalues = adict.values()
print(sum(sumvalues))
```
Thanks
CodePudding user response:
If the column already holds dictionaries and their values are numeric, a lambda function is enough:
df['key_value_column'] = df['key_value_column'].apply(lambda x: sum(x.values()))
Or, if the column holds string representations of dictionaries (which is what read_csv gives you), first convert them to dicts with ast.literal_eval:
import ast
df['key_value_column'] = df['key_value_column'].apply(lambda x: sum(ast.literal_eval(x).values()))
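For completeness, here is a minimal end-to-end sketch of the second approach using the two sample rows from the question. It reads the data from an in-memory string instead of test.csv so it can be run as-is, and the column name `total` is just a name I picked for illustration. For these two rows the sums come out as 109 and 11.

```python
import ast
import io

import pandas as pd

# Sample rows copied from the question; with a real file, pass 'test.csv'
# to read_csv instead of the StringIO object.
csv_text = """datetime,key_value_column
2022-02-02 00:00:01,"{'key1':2,'key2':7,'key3':100}"
2022-02-02 00:00:10,"{'key5':2,'key2':3,'key3':1,'key4':5}"
"""

df = pd.read_csv(io.StringIO(csv_text))

# read_csv loads each dictionary as a plain string, so parse it with
# ast.literal_eval before summing its values.
# 'total' is a hypothetical column name chosen for this example.
df['total'] = df['key_value_column'].apply(
    lambda s: sum(ast.literal_eval(s).values())
)

# Keep only the datetime and the per-row sum, printed as CSV lines.
result = df[['datetime', 'total']]
print(result.to_csv(index=False, header=False))
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for parsing data that comes from a file.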