How to extract substring from column

I have a table where in one of the columns the data is a string that looks like this:

[{'label': 1, 'prob': 0.5001602182558444}, {'label': 0, 'prob': 0.49983978174415555}]

I want to extract just the prob value for the entry whose label is 1. So the result would look like this:

0.5001602182558444

I can't figure out how to do this with the standard string methods, since the value contains so many symbols that interfere with the syntax.

CodePudding user response：

You could use a generator expression with next (here lst is the list of dicts):

next((x['prob'] for x in lst if x['label'] == 1), None)

0.5001602182558444

Note the None default, which is returned when nothing matches. If you're sure there will always be a match, you can drop it (without a default, next raises StopIteration when nothing matches):

next((x['prob'] for x in lst if x['label'] == 1))
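
Since the question says the column holds a string rather than a list, lst has to be built from that string first. Here is a minimal sketch, assuming the string is a valid Python literal (it uses single quotes, so json.loads would reject it as-is). The names raw and prob_for_label are illustrative and not from the original post:

```python
import ast

# The cell value as a string, copied from the question.
raw = "[{'label': 1, 'prob': 0.5001602182558444}, {'label': 0, 'prob': 0.49983978174415555}]"

def prob_for_label(text, label=1):
    """Parse a stringified list of dicts and return the 'prob' for `label`, or None."""
    lst = ast.literal_eval(text)  # evaluates Python literals only, not arbitrary code
    return next((x['prob'] for x in lst if x['label'] == label), None)

result = prob_for_label(raw)
print(result)
```

ast.literal_eval is used here instead of eval because it only accepts literals, which is the safer choice for data coming out of a table.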

CodePudding user response：

Another way is to use pandas. Create a DataFrame and look up the value:

import pandas as pd

a = [{'label': 1, 'prob': 0.5001602182558444}, {'label': 0, 'prob': 0.49983978174415555}]
df = pd.DataFrame(a)
df.loc[df["label"]==1, "prob"].values[0]
# 0.5001602182558444
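
If the table is already a pandas DataFrame with one such string per row, the same lookup could be applied to the whole column. This is only a sketch: the column names scores and prob_1, the second sample row, and the helper prob_for_label_1 are assumptions made for illustration.

```python
import ast
import pandas as pd

# Hypothetical table; the column name "scores" is an assumption.
df = pd.DataFrame({
    "scores": [
        "[{'label': 1, 'prob': 0.5001602182558444}, {'label': 0, 'prob': 0.49983978174415555}]",
        "[{'label': 0, 'prob': 0.9}]",  # made-up row with no label 1
    ]
})

def prob_for_label_1(text):
    """Parse one cell and return the 'prob' of the entry with label 1, or None."""
    items = ast.literal_eval(text)  # string -> list of dicts
    return next((d["prob"] for d in items if d["label"] == 1), None)

# Apply the helper to every row and store the result in a new column.
df["prob_1"] = df["scores"].apply(prob_for_label_1)
print(df["prob_1"])
```

Rows without a label 1 entry end up as a missing value in the new column, so check them with pd.isna rather than comparing against None.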

CodePudding user response：

Try this:

lst = [{'label': 1, 'prob': 0.5001602182558444}, {'label': 0, 'prob': 0.49983978174415555}]

def srch_dct(lst, key_search, value_search, key_want):
    # Return key_want from the first dict whose key_search equals value_search.
    for d in lst:
        if d.get(key_search) == value_search:
            return d.get(key_want)
    # No dict matched.
    return None

Output:

>>> print(srch_dct(lst, 'label',1, 'prob'))
0.5001602182558444
>>> print(srch_dct(lst, 'label',0, 'prob'))
0.49983978174415555
>>> print(srch_dct(lst, 'label',2, 'prob'))
None
>>> print(srch_dct(lst, 'label2',2, 'prob2'))
None