I am trying to figure out the most efficient way to search within lists in a DataFrame in Python. For example, I have the following rows in a pandas DataFrame:
PLACES_DETECTED PLACE_VALUES
[A14, A09, A08, A15, A03] [-0.0369, -0.0065, 0.01, -0.0295, -0.0402]
[B10, B19, A18, A03, A14] [-0.0641, 0.0419, -0.0196, 0.0747, -0.0]
I want to search for, say, A14 and get the place value for it, but I don't know if there is a better way to do it than a for loop with many if statements.
Thanks for your help!
I tried using df.loc[df['PLACES_DETECTED'] == 'A14'], which didn't work. Even if it had, it wouldn't help me get the associated place value, since that is in another column of the DataFrame.
CodePudding user response:
If the values in each list in the PLACES_DETECTED column are unique (no duplicates), you can create a temporary Series of dicts and then use the .str accessor:
tmp = df.apply(
    lambda x: dict(zip(x["PLACES_DETECTED"], x["PLACE_VALUES"])), axis=1
)
df["A14"] = tmp.str["A14"]
print(df)
Prints:
PLACES_DETECTED PLACE_VALUES A14
0 [A14, A09, A08, A15, A03] [-0.0369, -0.0065, 0.01, -0.0295, -0.0402] -0.0369
1 [B10, B19, A18, A03, A14] [-0.0641, 0.0419, -0.0196, 0.0747, -0.0] -0.0000
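If you need several places at once, the same dict idea can be stretched into a wide table. This is only a sketch: the DataFrame below is rebuilt from the two rows in the question, and the names row_dicts and wide are made up for illustration. It assumes, as above, that places are unique within each row.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PLACES_DETECTED": [
        ["A14", "A09", "A08", "A15", "A03"],
        ["B10", "B19", "A18", "A03", "A14"],
    ],
    "PLACE_VALUES": [
        [-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
        [-0.0641, 0.0419, -0.0196, 0.0747, -0.0],
    ],
})

# One dict per row: place -> value (assumes no duplicate places in a row).
row_dicts = [
    dict(zip(places, values))
    for places, values in zip(df["PLACES_DETECTED"], df["PLACE_VALUES"])
]

# One column per place; a place missing from a row becomes NaN.
wide = pd.DataFrame(row_dicts, index=df.index)
print(wide[["A14", "A03", "B10"]])
```

After that, any lookup is plain column access, e.g. wide["A14"], and rows where a place was not detected show up as NaN.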
Or use .explode:
df = df.explode(["PLACES_DETECTED", "PLACE_VALUES"])
print(df[df.PLACES_DETECTED == "A14"])
Prints:
PLACES_DETECTED PLACE_VALUES
0 A14 -0.0369
1 A14 -0.0
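A self-contained version of the explode approach, as a sketch: the data is rebuilt from the question, and long and hits are illustrative names. Note that passing a list of columns to explode needs pandas 1.3 or later, and that this version keeps the original df instead of overwriting it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PLACES_DETECTED": [
        ["A14", "A09", "A08", "A15", "A03"],
        ["B10", "B19", "A18", "A03", "A14"],
    ],
    "PLACE_VALUES": [
        [-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
        [-0.0641, 0.0419, -0.0196, 0.0747, -0.0],
    ],
})

# One row per (place, value) pair; the original row index is repeated.
# Multi-column explode requires pandas >= 1.3.
long = df.explode(["PLACES_DETECTED", "PLACE_VALUES"])

# Exploded columns are object dtype, so cast the values back to float.
long["PLACE_VALUES"] = long["PLACE_VALUES"].astype(float)

# Values for A14, indexed by the row they came from.
hits = long.loc[long["PLACES_DETECTED"] == "A14", "PLACE_VALUES"]
print(hits)
```

Because the index still points at the original rows, hits can be assigned straight back, e.g. df["A14"] = hits, as long as the place appears at most once per row.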
CodePudding user response:
You could also do the following:
def get_place_value(row, place):
    places = row['PLACES_DETECTED']
    values = row['PLACE_VALUES']
    if place in places:
        index = places.index(place)
        return values[index]
    return None
place_values = df.apply(lambda row: get_place_value(row, 'A14'), axis=1)
That would also get you the following output:
0 -0.0369
1 -0.0000
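To see what happens when a place is not in every row, here is a runnable sketch with the question's data and B10, which only appears in the second row. The names result and found are just for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PLACES_DETECTED": [
        ["A14", "A09", "A08", "A15", "A03"],
        ["B10", "B19", "A18", "A03", "A14"],
    ],
    "PLACE_VALUES": [
        [-0.0369, -0.0065, 0.01, -0.0295, -0.0402],
        [-0.0641, 0.0419, -0.0196, 0.0747, -0.0],
    ],
})


def get_place_value(row, place):
    places = row["PLACES_DETECTED"]
    values = row["PLACE_VALUES"]
    if place in places:
        # list.index returns the first match if a place is repeated.
        return values[places.index(place)]
    return None


# Rows without the place come back as a missing value.
result = df.apply(lambda row: get_place_value(row, "B10"), axis=1)

# Keep only the rows where the place was actually detected.
found = result.dropna()
print(found)
```

So the rows without a match are easy to filter out afterwards with dropna() or notna().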