I have a df like this, where df['value_list'] has dtype object:
value_list
0 ['200000.00', '100.00', '25.00']
1 ['860000.00']
2 ['148000.00', '25.00']
I want a new column max_value holding the maximum of the numeric strings in each list, as an integer. Expected result for the above:
value_list max_value
0 ['200000.00', '100.00', '25.00'] 200000
1 ['860000.00'] 860000
2 ['148000.00', '25.00'] 148000
What I did:
import ast
# convert strings to numerical values
df['value_list'] = df['value_list'].apply(lambda x: ast.literal_eval(x))
# get max values from list and if list is empty temporarily assign -1
df['max_value'] = df['value_list'].apply(lambda x: max(x) if len(x) >= 1 else -1)
df['max_value'] = df['max_value'].apply(lambda x: round(float(x)) if x != -1 else None)
This runs without errors, but I didn't actually get the max values. The results I got:
value_list max_value
0 ['200000.00', '100.00', '25.00'] 25
1 ['860000.00'] 860000
2 ['148000.00', '25.00'] 25
Is there a better and more correct way to do this? Please advise.
CodePudding user response:
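Your lists still contain strings after ast.literal_eval, so max() compares them lexicographically: '25.00' sorts after '200000.00' because '5' > '0' at the second character. Convert each element to float before taking the maximum. One approach: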
import pandas as pd
# input toy data
df = pd.DataFrame([[['200000.00', '100.00', '25.00']],
[['860000.00']],
[['148000.00', '25.00']]], columns=["value_list"])
df["max_value"] = df["value_list"].apply(lambda x: max(float(e) for e in x))
print(df)
Output
value_list max_value
0 [200000.00, 100.00, 25.00] 200000.0
1 [860000.00] 860000.0
2 [148000.00, 25.00] 148000.0
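To see the two comparisons side by side outside of pandas, here is a minimal sketch using the first list from the question. The variable names are only illustrative.

```python
values = ['200000.00', '100.00', '25.00']

# Strings are compared character by character, so '25.00' wins
# over '200000.00' because '5' > '0' at the second position.
string_max = max(values)

# key=float compares by numeric value; the result is still the original string.
numeric_max = max(values, key=float)

print(string_max)   # 25.00
print(numeric_max)  # 200000.00
```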
If you have empty lists and want to return -1, just do:
df["max_value"] = df["value_list"].apply(lambda x: max(float(e) for e in x) if x else -1)
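If you want integers rather than floats, and a missing value instead of the -1 placeholder for empty lists, one possible variation is sketched below. It assumes a pandas version with the nullable "Int64" dtype; max_as_int is a hypothetical helper name, not something from the question.

```python
import pandas as pd

# toy data including an empty list
df = pd.DataFrame({"value_list": [['200000.00', '100.00', '25.00'],
                                  [],
                                  ['148000.00', '25.00']]})

def max_as_int(values):
    # hypothetical helper: numeric max rounded to an int, pd.NA for an empty list
    if not values:
        return pd.NA
    return round(max(float(v) for v in values))

# "Int64" (capital I) is the nullable integer dtype, so missing values are allowed
df["max_value"] = df["value_list"].apply(max_as_int).astype("Int64")
print(df)
```

With a plain int64 column the missing entry would force the column to float, which is why the nullable dtype is used here.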