I have a 2D array a of shape (2, 1403) and a list b that contains 2 lists.
a.shape = (2, 1403) # a is a 2D array; each row has unique elements
len(b) = 2 # b is a list
len(b[0]), len(b[1]) = 415, 452 # each list inside b also has unique elements
All the elements of b[0] and b[1] are present in a[0] and a[1] respectively.
Now I want to rearrange the elements of a based on the elements of b. Every element of b[0] should be moved to the end of a[0], so that the new a satisfies a[0][-len(b[0]):] == b[0] and, similarly, a[1][-len(b[1]):] == b[1].
Toy example
a has elements like [[1,2,3,4,5,6,7,8,9,10,11,12],[1,2,3,4,5,6,7,8,9,10,11,12]]
b has elements like [[5, 9, 10], [2, 6, 8, 9, 11]]
new_a becomes [[1,2,3,4,6,7,8,11,12,5,9,10], [1,3,4,5,7,10,12,2,6,8,9,11]]
I have written code that loops over every element, which becomes very slow. It is shown below:
import torch

a_temp = []
for i, array in enumerate(a):
    a_temp_inner = []
    remove_temp_inner = []
    for element in array:
        if element not in b[i]:
            a_temp_inner.append(element)  # elements not present in b[i] stay at the front
        else:
            remove_temp_inner.append(element)  # elements present in b[i] are moved to the end
    # the kept and moved parts differ in length from row to row,
    # so join them per row before building the tensor
    a_temp.append(a_temp_inner + remove_temp_inner)
a = torch.tensor(a_temp)
Can anyone please help me with a faster implementation than this?
CodePudding user response:
Here is my approach:
# True where the element of a is NOT in the matching list of b
index_ = np.array([[i not in d for i in c] for c, d in zip(a, b)])
# first list: elements to keep in front; second list: elements to move to the end
arr_filtered = [[np.extract(ind, c) for c, ind in zip(a, index_)],
                [np.extract(np.logical_not(ind), c) for c, ind in zip(a, index_)]]
arr_final = np.array([np.concatenate((i, j)) for i, j in zip(*arr_filtered)])
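The comprehension above tests membership against a plain list, which scans that list once per element. Below is a minimal sketch of the same idea with a set per row, assuming the elements are hashable. The name move_to_end is hypothetical and not part of the answer above. Unlike the mask version, it appends the moved elements in the order given by b, which is what the question asks for.

```python
def move_to_end(rows, tails):
    """For each row, move the elements listed in the matching tail to the end."""
    result = []
    for row, tail in zip(rows, tails):
        tail_set = set(tail)  # set lookup is O(1) on average, list lookup scans the list
        kept = [x for x in row if x not in tail_set]
        # append the tail in the order given by b
        result.append(kept + list(tail))
    return result


a = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
     [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
b = [[5, 9, 10], [2, 6, 8, 9, 11]]
new_a = move_to_end(a, b)
print(new_a)
```

The result is a list of lists; wrap it in np.array or torch.tensor if an array is needed.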
CodePudding user response:
Assuming a is an np.array and b is a list, you can use:
np.array([np.concatenate((i[~np.in1d(i, j)], j)) for i, j in zip(a,b)])
Output
array([[ 1, 2, 3, 4, 6, 7, 8, 11, 12, 5, 9, 10],
[ 1, 3, 4, 5, 7, 10, 12, 2, 6, 8, 9, 11]])
This can be micro-optimized if b may contain empty lists:
np.array([np.concatenate((i[~np.in1d(i, j)], j)) if j else i for i, j in zip(a,b)])
In my benchmarks, for np.arrays with fewer than ~100 elements, converting with .tolist() is faster than np.concatenate:
np.array([i[~np.in1d(i, j)].tolist() + j for i, j in zip(a,b)])
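One way to check this on your own data is timeit. The following is only a sketch: the function names with_concatenate and with_tolist are hypothetical, np.isin is used as the newer spelling of the same membership test (for 1-D rows it gives the same mask as np.in1d), and the timings will vary with machine, NumPy version and array size.

```python
import timeit

import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]])
b = [[5, 9, 10], [2, 6, 8, 9, 11]]


def with_concatenate(a, b):
    # keep the elements not in b[k], then append b[k] with np.concatenate
    return np.array([np.concatenate((i[~np.isin(i, j)], j)) for i, j in zip(a, b)])


def with_tolist(a, b):
    # same result, but joins plain Python lists instead of arrays
    return np.array([i[~np.isin(i, j)].tolist() + j for i, j in zip(a, b)])


# both variants should build the same array
assert np.array_equal(with_concatenate(a, b), with_tolist(a, b))

# time each variant on the toy data
for fn in (with_concatenate, with_tolist):
    seconds = timeit.timeit(lambda: fn(a, b), number=2000)
    print(f"{fn.__name__}: {seconds:.4f} s for 2000 calls")
```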
Data example and imports for this solution:
import numpy as np
a = np.array([
    [1,2,3,4,5,6,7,8,9,10,11,12],
    [1,2,3,4,5,6,7,8,9,10,11,12]
])
b = [[5, 9, 10],
     [2, 6, 8, 9, 11]]
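To check the result against the requirement in the question (each new row ends with the matching list of b), the one-liner can be wrapped in a small function. This is a sketch under the question's assumptions (unique elements, every element of b[k] present in a[k]). The name rearrange is hypothetical, and np.isin stands in for np.in1d, which recent NumPy releases deprecate.

```python
import numpy as np


def rearrange(a, b):
    """Return a new array where row k ends with the elements of b[k]."""
    rows = []
    for row, tail in zip(a, b):
        tail = np.asarray(tail, dtype=row.dtype)
        # assumption from the question: every element of b[k] occurs in a[k]
        if not np.isin(tail, row).all():
            raise ValueError("b contains elements that are missing from a")
        # elements of the row that are not in the tail, followed by the tail itself
        rows.append(np.concatenate((row[~np.isin(row, tail)], tail)))
    return np.array(rows)


a = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]])
b = [[5, 9, 10], [2, 6, 8, 9, 11]]
new_a = rearrange(a, b)

for row, tail in zip(new_a, b):
    # the last len(tail) entries of each row are exactly b[k]
    assert row[-len(tail):].tolist() == tail
print(new_a)
```

If a is a torch tensor, converting with a.numpy() before the call and torch.from_numpy(new_a) afterwards is one option.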