Dataframe get exact value using an array

Time:10-29

Suppose I have the following dataframe:

   A  B  C  D  Count
0  0  0  0  0   12.0
1  0  0  0  1    2.0
2  0  0  1  0    4.0
3  0  0  1  1    0.0
4  0  1  0  0    3.0
5  0  1  1  0    0.0
6  1  0  0  0    7.0
7  1  0  0  1    9.0
8  1  0  1  0    0.0
... (truncated for readability)

And an array: [1, 0, 0, 1]

I would like to look up the Count value of the row whose A, B, C and D values match the array. In this case, that is row 7 with Count = 9.0.

I can use iloc or at by unpacking each value in the array, but that seems inefficient. Is there a way to map the values in the array to the columns and get the matching row's Count?

CodePudding user response:

You can select the key columns from the DataFrame by name and compare the resulting sub-frame to the array. The comparison is broadcast, so the array is checked against every row at once. Then collapse the resulting Boolean DataFrame to a Boolean row mask with all(axis=1) and use that mask to index the Count column.

If df is the DataFrame and a is the array (or a list):

df.Count.loc[(df[list('ABCD')] == a).all(axis=1)]
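To make the steps above concrete, here is a minimal sketch that rebuilds only the nine rows visible in the question (the real frame is truncated, so this data is an assumption) and names each intermediate result. The variable names matches and row_mask are just for illustration.

```python
import pandas as pd

# Only the nine rows visible in the question; the real frame is longer.
df = pd.DataFrame(
    {
        "A": [0, 0, 0, 0, 0, 0, 1, 1, 1],
        "B": [0, 0, 0, 0, 1, 1, 0, 0, 0],
        "C": [0, 0, 1, 1, 0, 1, 0, 0, 1],
        "D": [0, 1, 0, 1, 0, 0, 0, 1, 0],
        "Count": [12.0, 2.0, 4.0, 0.0, 3.0, 0.0, 7.0, 9.0, 0.0],
    }
)
a = [1, 0, 0, 1]

# Element-wise comparison: a is matched against columns A..D in every row.
matches = df[list("ABCD")] == a

# A row qualifies only if all four key columns match.
row_mask = matches.all(axis=1)

# Select Count for the qualifying rows (a Series, possibly empty).
result = df.loc[row_mask, "Count"]
print(result)
```

Note that the list must have exactly as many entries as there are key columns, in the same order as the column list.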

CodePudding user response:

I just used the .loc indexer and combined one condition per column, like this:

f = [1,0,0,1]
result = df['Count'].loc[(df['A']==f[0]) & 
                         (df['B']==f[1]) & 
                         (df['C']==f[2]) & 
                         (df['D']==f[3])].values
print(result)

OUTPUT:

[9.]

However, I like Arne's answer better :)
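Note that the result above is a one-element array, not a plain number. If a single value is wanted, something like the following sketch might do. It uses three of the rows from the question with their original labels, and it assumes each A/B/C/D combination occurs at most once; the name count and the None fallback are my own choices, not part of the question.

```python
import pandas as pd

# Three of the rows shown in the question, keeping their original labels.
df = pd.DataFrame(
    {
        "A": [1, 1, 1],
        "B": [0, 0, 0],
        "C": [0, 0, 1],
        "D": [0, 1, 0],
        "Count": [7.0, 9.0, 0.0],
    },
    index=[6, 7, 8],
)
f = [1, 0, 0, 1]

# Same per-column conditions as above, combined into one Boolean mask.
mask = (
    (df["A"] == f[0])
    & (df["B"] == f[1])
    & (df["C"] == f[2])
    & (df["D"] == f[3])
)
matches = df.loc[mask, "Count"]

# Assumption: a key combination is unique. Fall back to None otherwise.
count = matches.iloc[0] if len(matches) == 1 else None
print(count)
```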

CodePudding user response:

You can also turn each row of the key columns into a tuple and compare it to the target tuple:

out = df.loc[df[list('ABCD')].apply(tuple, axis=1) == (1, 0, 0, 1), 'Count']
print(out)

OUTPUT:

7    9.0
Name: Count, dtype: float64
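For anyone wondering what happens in between, here is a small sketch of the tuple approach split into steps, again on three rows from the question. The names keys and mask are illustrative. As far as I can tell, pandas treats the tuple on the right-hand side as a single value to compare each element against, which matches the output shown in this answer.

```python
import pandas as pd

# Three of the rows shown in the question, keeping their original labels.
df = pd.DataFrame(
    {
        "A": [1, 1, 1],
        "B": [0, 0, 0],
        "C": [0, 0, 1],
        "D": [0, 1, 0],
        "Count": [7.0, 9.0, 0.0],
    },
    index=[6, 7, 8],
)

# Each row of the key columns becomes one tuple, e.g. (1, 0, 0, 1).
keys = df[list("ABCD")].apply(tuple, axis=1)

# Compare every row tuple against the target tuple.
mask = keys == (1, 0, 0, 1)

out = df.loc[mask, "Count"]
print(out)
```

Because apply with axis=1 calls the function once per row in Python, this is likely to be slower on large frames than the broadcast comparison in the first answer.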