Python find elements in array A but not in array B

Time:04-26

I'm trying to find the difference between two NumPy arrays:

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

I'm trying to get the elements of arrayB that are not in arrayA:

difference = ['A4', 'A5', 'A6']

How can I do this? Thank you.

CodePudding user response:

Use NumPy's setdiff1d. Pass arrayB first, because the function returns the values of its first argument that are not in its second:

np.setdiff1d(arrayB, arrayA)

Also, is there any special reason this needs to be a NumPy array? You could simply use sets and the minus operator: set(arrayB) - set(arrayA)

CodePudding user response:

[i for i in arrayB if i not in arrayA]
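The comprehension above rescans arrayA for every element of arrayB. A minimal sketch of the same idea with a one-time set lookup follows, assuming the elements are hashable; the names `lookup` and `difference` are introduced here for illustration only.

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

# Build the set once so each membership test is a hash lookup
# instead of a scan over arrayA.
lookup = set(arrayA.tolist())

# Iterating over arrayB keeps its original order (and any duplicates).
difference = [item for item in arrayB.tolist() if item not in lookup]
print(difference)
```

Calling tolist() first turns the elements into plain Python strings, so the result is an ordinary list of str.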

CodePudding user response:

You can use Python's set operations for this:

import numpy as np
a = np.array(['A1', 'A2', 'A3'])
b = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])
print(set(b)-set(a))

Output (sets are unordered, so the order may vary):

{'A6', 'A5', 'A4'}

Or just use a list comprehension:

import numpy as np
a = np.array(['A1', 'A2', 'A3'])
b = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])
print([i for i in b if i not in a])

Output:

['A4', 'A5', 'A6']

CodePudding user response:

You can use sets:

difference = list(set(arrayB) - set(arrayA))

Output (the order is not guaranteed, because sets are unordered):

['A4', 'A6', 'A5']
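Because the order of a set is arbitrary, the list above can come out differently between runs. If a reproducible order is wanted, one option is to sort the result. This is a sketch that assumes sorted order is acceptable, not necessarily the original order of arrayB.

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

# tolist() converts the NumPy strings to plain Python str objects.
# sorted() turns the unordered set into a list with a stable order.
difference = sorted(set(arrayB.tolist()) - set(arrayA.tolist()))
print(difference)
```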

CodePudding user response:

As pointed out in another answer, you can use the np.setdiff1d() function:

import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

print(np.setdiff1d(arrayB, arrayA))

Output:

['A4' 'A5' 'A6']

However, the original order of the elements is not kept: the result is always sorted in ascending order and contains only unique values. Observe:

import numpy as np


arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5']) # Swapped 5 and 6

print(np.setdiff1d(arrayB, arrayA))

Output:

['A4' 'A5' 'A6']

If you want to keep the order, you can build a boolean mask with the np.in1d() function (deprecated since NumPy 2.0 in favor of np.isin()) and invert it:

import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5']) # Swapped 5 and 6

print(arrayB[~np.in1d(arrayB, arrayA)])

Output:

['A4' 'A6' 'A5']
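The same mask can be written with np.isin() and its invert flag. The sketch below adds a duplicate 'A4' to arrayB (an assumption made only for illustration) to show that, unlike np.setdiff1d(), the mask approach keeps both the order and the repeated values.

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
# 'A4' appears twice here to illustrate how duplicates are handled.
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5', 'A4'])

# invert=True marks the elements of arrayB that are NOT in arrayA.
mask = np.isin(arrayB, arrayA, invert=True)

# Boolean indexing keeps the original order and the duplicates.
result = arrayB[mask]
print(result)
```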