Selecting a row in pandas based on all its column values

Time:06-15

I would like to locate a specific row (given all of its column values) within a pandas DataFrame. My attempts so far:


import pandas as pd

df = pd.DataFrame(
    columns = ["A", "B", "C"],
    data = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12],
        ])

# row to find (last one)
row = {"A" : 10, "B" : 11, "C" : 12}

# chain
idx = df[(df["A"] == 10) & (df["B"] == 11) & (df["C"] == 12)].index[0]
print(idx)

# iterative
mask = pd.Series(True, index=df.index)  # aligned with df even if its index is not 0..n-1

for k, v in row.items():
    mask &= (df[k] == v)

idx = df[mask].index[0]
print(idx)

# pandas series
for idx in df.index:
    print(idx, (df.loc[idx] == pd.Series(row)).all())  # idx is a label, so use .loc

Is there a simpler way to do that? Something like idx = df.find(row)?

This functionality is often needed, for example, to locate one specific sample in a time series. I cannot believe that there is no straightforward way to do that.

Answer:

Do you simply want this?

df[df.eq(row).all(axis=1)]  # append .index if only the index labels are needed

output:

    A   B   C
3  10  11  12
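
If only the index label is needed, as in the question, the same boolean mask can be applied to df.index instead of df. The following is a minimal sketch rather than a definitive recipe: the names mask, matches and idx are illustrative, and returning None when nothing matches is an assumption about the desired behaviour.

```python
import pandas as pd

df = pd.DataFrame(
    columns=["A", "B", "C"],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)
row = {"A": 10, "B": 11, "C": 12}

# True for each row where every column equals the corresponding value in `row`
mask = df.eq(pd.Series(row)).all(axis=1)

# Index labels of all matching rows
matches = df.index[mask.to_numpy()]

# First matching label, or None if the row is absent (assumed convention)
idx = matches[0] if len(matches) else None
print(idx)
```

Checking len(matches) first avoids the IndexError that .index[0] raises when no row matches.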

Or, if you have more columns and want to ignore them for the comparison:

df[df[list(row)].eq(row).all(axis=1)]
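
As far as I know pandas has no built-in df.find(row), but the expression above is easy to wrap in a small function. A sketch, assuming you want the index labels of every matching row back as a list; find_row is a hypothetical name, not a pandas API, and the extra column D is made up for the demonstration.

```python
import pandas as pd


def find_row(df, row):
    """Hypothetical helper (not part of pandas): return the index labels of
    all rows whose values equal those in `row` for the columns `row` names."""
    target = pd.Series(row)
    cols = list(target.index)
    # Compare only the columns named in `row`; other columns are ignored
    mask = df[cols].eq(target).all(axis=1)
    return df.index[mask.to_numpy()].tolist()


df = pd.DataFrame(
    {
        "A": [1, 10, 10],
        "B": [2, 11, 11],
        "C": [3, 12, 12],
        "D": ["x", "y", "z"],  # extra column, not part of the lookup
    }
)

print(find_row(df, {"A": 10, "B": 11, "C": 12}))  # [1, 2]
print(find_row(df, {"A": 0}))                     # []
```

Returning a list rather than a single label keeps duplicate rows visible and gives an empty list instead of an exception when nothing matches.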