Home > database >  How to extract max value and respective key from multiple dictionary key & values inside an object c
How to extract max value and respective key from multiple dictionary key & values inside an object c

Time:03-12

I am trying to extract max value and respective key from multiple dictionary key & values inside an object column in python dataframe, can anyone help me out

df:

                                          Obj_values
0                    {"Sony":25,"Max":91,"James":55}
1                               {"Jack":12,"Max":10}
2                                                 {}
3
4                              {"Halk":21,"Hall":15}
5                                            {"H":1}
6                       {"Aws":"k","Az":113,"Gc":"b"}
7  {"Max":60,"HBO":113,"Sony":55, "WS":1256,"James":55}

I have tried something like below, but still not able to convert the dataframe to dict:

obj_lst = df["Obj_values"].tolist()
df["Obj_values"] = ast.literal_eval(str(obj_lst))

df["Obj_values"].dtype
# dtype('O')

for k,v in df["Obj_values"].items():
    print(k,v)

Expect result: enter image description here

CodePudding user response:

Probably a bit slow because your data are messy:

  • empty row
  • empty dict
  • string values

Try:

def f_max(d):
    k = v = ''
    if d:
        d = {k: v for k, v in d.items() if isinstance(v, (int, float))}
        if d:
            k = max(d, key=d.get)
            v = d[k]
    return pd.Series({'Name': k, 'Value': v})

df = pd.concat([df, df['Obj_values'].apply(f_max)], axis=1)

Output:

>>> df
                                          Obj_values  Name Value
0               {'Sony': 25, 'Max': 91, 'James': 55}   Max    91
1                            {'Jack': 12, 'Max': 10}  Jack    12
2                                                 {}            
4                                                               
5                           {'Halk': 21, 'Hall': 15}  Halk    21
6                                           {'H': 1}     H     1
7                 {'Aws': 'k', 'Az': 113, 'Gc': 'b'}    Az   113
8  {'Max': 60, 'HBO': 113, 'Sony': 55, 'WS': 1256...    WS  1256

CodePudding user response:

You have the right idea parsing the dictionary using ast.literal_eval. There are various ways you could work with the dataframe, but once you have got your list of dict objects in obj_list it may be simpler to loop over the list to get the names and values for each row and append them as dataframe columns at the end. That's the approach I've gone for here.

dict.items gives you a list of (key, value) tuples, so you need need to find the maximum of these with the value as the key. This has been discussed in the following threads:

How to find the maximum value in a list of tuples?

Finding max value in the second column of a nested list?

There are various ways to do it but I have gone with using a lambda function for the key parameter (name, value = max(obj_as_dict.items(), key=lambda x: x[1])).

Full code below. I just took the first two rows of your dataframe to test it out.

import pandas as pd
import ast

df = pd.DataFrame(['{"Sony": 25,"Max":91,"James":55}', '{"Jack":12,"Max":10}'])
df.columns = ["Obj_values"]

obj_list = df['Obj_values'].tolist()

names = []
values = []
for obj in obj_list:
    obj_as_dict = ast.literal_eval(str(obj))
    name, value = max(obj_as_dict.items(), key=lambda x: x[1])
    names.append(name)
    values.append(value)

df['Name'] = names
df['Value'] = values

print(df)

Output:

                        Obj_values  Name  Value
0  {"Sony": 25,"Max":91,"James":55}   Max     91
1              {"Jack":12,"Max":10}  Jack     12
  • Related