Searching within lists in a pandas dataframe

Time:12-28

I am trying to figure out the most efficient way to search within lists in a pandas DataFrame in Python. For example, I have the following rows in a DataFrame:

PLACES_DETECTED              PLACE_VALUES
[A14, A09, A08, A15, A03]    [-0.0369, -0.0065, 0.01, -0.0295, -0.0402]
[B10, B19, A18, A03, A14]    [-0.0641, 0.0419, -0.0196, 0.0747, -0.0]

I want to search for, say, A14 and get the place value for it, but I don't know whether there is a better way to do it than a for loop with many if statements.

Thanks for your help!

I tried df.loc[df['PLACES_DETECTED'] == 'A14'], which didn't work. Even if it had, it wouldn't help me get the associated place value, since that is in another column of the DataFrame.
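To illustrate why the equality filter comes back empty: comparing the column to 'A14' compares each whole list to the string, which is never equal. A minimal sketch of a per-row membership test instead, assuming the column holds plain Python lists (the DataFrame construction below is reconstructed from the rows shown in the question):

```python
import pandas as pd

# Sample data rebuilt from the question; assumes each cell is a Python list.
df = pd.DataFrame({
    "PLACES_DETECTED": [["A14", "A09", "A08", "A15", "A03"],
                        ["B10", "B19", "A18", "A03", "A14"]],
    "PLACE_VALUES": [[-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
                     [-0.0641, 0.0419, -0.0196, 0.0747, -0.0]],
})

# Element-wise == compares each list object to the string, so no row matches.
eq_mask = df["PLACES_DETECTED"] == "A14"

# Test membership inside each list instead.
in_mask = df["PLACES_DETECTED"].apply(lambda places: "A14" in places)

# Rows whose list contains "A14".
rows_with_a14 = df[in_mask]
```

This only selects the matching rows; it still does not pull out the paired value from PLACE_VALUES.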

CodePudding user response:

If the values in each list in the PLACES_DETECTED column are unique (no duplicates), you can create a temporary Series of dicts and then use the .str accessor to look up a key:

tmp = df.apply(
    lambda x: dict(zip(x["PLACES_DETECTED"], x["PLACE_VALUES"])), axis=1
)

df["A14"] = tmp.str["A14"]
print(df)

Prints:

             PLACES_DETECTED                                PLACE_VALUES     A14
0  [A14, A09, A08, A15, A03]  [-0.0369, -0.0065, 0.01, -0.0295, -0.0402] -0.0369
1  [B10, B19, A18, A03, A14]    [-0.0641, 0.0419, -0.0196, 0.0747, -0.0] -0.0000

Or use .explode (passing a list of columns requires pandas 1.3 or later):

df = df.explode(["PLACES_DETECTED", "PLACE_VALUES"])
print(df[df.PLACES_DETECTED == "A14"])

Prints:

  PLACES_DETECTED PLACE_VALUES
0             A14      -0.0369
1             A14         -0.0
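As a variation on the explode approach, here is a sketch that keeps the original df intact by storing the long-format result under a separate name (long_df is a name chosen for illustration). It assumes pandas 1.3 or later and that the two lists in each row have the same length:

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "PLACES_DETECTED": [["A14", "A09", "A08", "A15", "A03"],
                        ["B10", "B19", "A18", "A03", "A14"]],
    "PLACE_VALUES": [[-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
                     [-0.0641, 0.0419, -0.0196, 0.0747, -0.0]],
})

# One row per (place, value) pair; the index repeats the original row label.
long_df = df.explode(["PLACES_DETECTED", "PLACE_VALUES"])

# Exploded columns keep object dtype, so cast the values to float.
long_df["PLACE_VALUES"] = long_df["PLACE_VALUES"].astype(float)

# Values for "A14", indexed by the row they came from.
a14 = long_df.loc[long_df["PLACES_DETECTED"] == "A14", "PLACE_VALUES"]
```

Because the index still carries the original row labels, a14 can be assigned back with df["A14"] = a14 as long as each place appears at most once per row.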

CodePudding user response:

You could also do the following:

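def get_place_value(row, place):
    # Both lists are aligned position by position within a row.
    places = row['PLACES_DETECTED']
    values = row['PLACE_VALUES']
    if place in places:
        # list.index returns the position of the first occurrence.
        index = places.index(place)
        return values[index]
    # The place was not detected in this row.
    return None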

place_values = df.apply(lambda row: get_place_value(row, 'A14'), axis=1)

That would also get you the following output:

0   -0.0369
1   -0.0000
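Since the question asks about efficiency, one more option is to skip the row-wise apply and loop over the two columns directly. The sketch below is untimed and only illustrative; lookup_place is a hypothetical helper name, and it returns NaN where the place is absent. Note that if a place repeats within a row, dict(zip(...)) keeps the last value, whereas list.index would pick the first.

```python
import pandas as pd

def lookup_place(df, place):
    """Return a Series with the value for `place` in each row (NaN if absent)."""
    results = [
        # Pair each place with its value, then look the place up.
        dict(zip(places, values)).get(place, float("nan"))
        for places, values in zip(df["PLACES_DETECTED"], df["PLACE_VALUES"])
    ]
    return pd.Series(results, index=df.index, name=place)

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "PLACES_DETECTED": [["A14", "A09", "A08", "A15", "A03"],
                        ["B10", "B19", "A18", "A03", "A14"]],
    "PLACE_VALUES": [[-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
                     [-0.0641, 0.0419, -0.0196, 0.0747, -0.0]],
})

a14 = lookup_place(df, "A14")  # present in both rows
b10 = lookup_place(df, "B10")  # present only in the second row
```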