How to use the pandas isin function with a 2D NumPy array?

Time:01-05

I have created a 2d numpy array with 2 rows and 5 columns.

import numpy as np
import pandas as pd

arr = np.zeros((2, 5))

arr[0] = [12, 94, 4, 4, 2]
arr[1] = [1, 3, 4, 12, 46]

I have also created a DataFrame with two columns, col1 and col2:

list1 = [1,2,3,4,5]
list2 = [2,3,4,5,6]
df = pd.DataFrame({'col1': list1, 'col2': list2})

I used the pandas isin function on col1 and col2 to create a boolean Series, like this:

df['col1'].isin(df['col2'])

output

0    False
1     True
2     True
3     True
4     True

Now I want to use these boolean values to slice the 2D array that I created earlier. I can do that for a single row, but not for the whole 2D array at once:

print(arr[0][df['col1'].isin(df['col2'])])
print(arr[1][df['col1'].isin(df['col2'])])

output:

[94.  4.  4.  2.]
[ 3.  4. 12. 46.]

But when I try to index the whole array in one step:

print(arr[df['col1'].isin(df['col2'])])

I get this error:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 5

Is there a way to achieve this?

CodePudding user response:

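The boolean mask has 5 entries, which matches the number of columns (the second dimension), not the 2 rows. A mask passed on its own is applied to the first dimension, hence the error. You should slice on the second dimension of the array: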

arr[:, df['col1'].isin(df['col2'])]

output:

array([[94.,  4.,  4.,  2.],
       [ 3.,  4., 12., 46.]])
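
For completeness, here is a minimal self-contained sketch of the same idea, using the data from the question. The names mask, result and same are only illustrative; converting the Series with to_numpy() and the use of np.compress are optional alternatives, not something the answer above requires.

```python
import numpy as np
import pandas as pd

# Same data as in the question
arr = np.array([[12, 94, 4, 4, 2],
                [1, 3, 4, 12, 46]], dtype=float)
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6]})

# Boolean mask with one entry per column of arr
mask = df['col1'].isin(df['col2']).to_numpy()

# The mask length must match the axis being indexed (axis 1 here)
assert mask.shape[0] == arr.shape[1]

# Keep every row, select only the columns where mask is True
result = arr[:, mask]

# np.compress performs the same selection along an explicit axis
same = np.compress(mask, arr, axis=1)

print(result)
```

Indexing with arr[mask] fails because a single boolean index is matched against axis 0, which has length 2, while the mask has length 5.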