I would like to locate a specific row (given all of its column values) within a pandas DataFrame. My attempts so far:
import pandas as pd

df = pd.DataFrame(
    columns=["A", "B", "C"],
    data=[
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12],
    ])
# row to find (last one)
row = {"A" : 10, "B" : 11, "C" : 12}
# chain
idx = df[(df["A"] == 10) & (df["B"] == 11) & (df["C"] == 12)].index[0]
print(idx)
# iterative
mask = pd.Series(True, index=df.index)  # share df's index so the mask aligns
for k, v in row.items():
    mask &= (df[k] == v)
idx = df[mask].index[0]
print(idx)
# pandas series
for idx in df.index:
    # idx is an index label, so use .loc rather than .iloc
    print(idx, (df.loc[idx] == pd.Series(row)).all())
Is there a simpler way to do that? Something like idx = df.find(row)?
This functionality is often needed, for example to locate one specific sample in a time series. I cannot believe that there is no straightforward way to do that.
CodePudding user response:
Do you simply want this?
df[df.eq(row).all(axis=1)]  # append .index if the index labels are needed
output:
    A   B   C
3  10  11  12
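If you want the index label rather than the matching rows, the same boolean mask can be applied to df.index. This is only a minimal sketch that reuses the question's df and row. Wrapping row in pd.Series is my own choice to make the alignment on column names explicit, and returning None for a missing row is an assumption about how you would want to handle that case:

```python
import pandas as pd

df = pd.DataFrame(
    columns=["A", "B", "C"],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)
row = {"A": 10, "B": 11, "C": 12}

# True for every row whose values equal `row` in all columns
mask = df.eq(pd.Series(row)).all(axis=1)

# Index labels of all matching rows (empty if the row is absent)
matches = df.index[mask]

# Assumption: fall back to None instead of raising when nothing matches
idx = matches[0] if len(matches) else None
print(idx)  # 3
```

Unlike .index[0] on the filtered frame, this does not raise an IndexError when the row is not present.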
Or, if you have more columns and want to ignore them for the comparison:
df[df[list(row)].eq(row).all(axis=1)]
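If you need this often, the expression can be wrapped in a small function that gets close to the df.find(row) you asked for. Note that find_row is a hypothetical name, not a pandas method, and the extra column "D" is only there to illustrate that columns missing from row are ignored:

```python
import pandas as pd


def find_row(df, row):
    """Return the index labels of the rows matching every key/value in `row`.

    Hypothetical helper, not part of pandas. Columns of `df` that are
    not keys of `row` are ignored in the comparison.
    """
    wanted = pd.Series(row)
    # Compare only the columns named in `row`, then require all to match
    mask = df[wanted.index].eq(wanted).all(axis=1)
    return df.index[mask]


df = pd.DataFrame(
    {"A": [1, 4, 7, 10], "B": [2, 5, 8, 11], "C": [3, 6, 9, 12], "D": list("wxyz")}
)

print(find_row(df, {"A": 10, "B": 11, "C": 12}).tolist())  # [3]
print(find_row(df, {"A": 10, "B": 0}).tolist())            # []
```

It returns all matching labels, so duplicates and missing rows are both handled by the caller. One caveat: NaN never compares equal with eq, so a row containing NaN will not be found this way.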