For context, I want to find the fastest way to drop unknown elements and reorder the remaining ones in NumPy arrays (the elements can be in any order),
i.e. I want to reorder columns_2 to match columns_1 and then apply the resulting indices to a NumPy array.
columns_1 = ["one", "two", "three", "four"]
columns_2 = ["one", "three", "four", "two", "extra"]
One solution I came up with is:
column_idx = {col: idx for idx, col in enumerate(columns_2)}
array[:, [column_idx[col] for col in columns_1]]
Is there any better/faster alternative?
Note: The solution must fail if any element of columns_1 is missing from columns_2.
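To make the failure requirement concrete, here is a minimal sketch of the dictionary approach wrapped in a helper. The name column_indices and the error message are illustrative assumptions, not part of any library; the plain dictionary lookup above already raises KeyError on a missing name, and this version only reports all missing names at once.

```python
def column_indices(source, target):
    """Return the position in `source` of each name in `target`, in `target` order.

    Hypothetical helper for illustration. Raises KeyError if any name in
    `target` is absent from `source`.
    """
    # Map each column name to its position in the source ordering.
    position = {name: i for i, name in enumerate(source)}
    # Collect every missing name so the error is informative.
    missing = [name for name in target if name not in position]
    if missing:
        raise KeyError(f"columns missing from source: {missing}")
    return [position[name] for name in target]


columns_1 = ["one", "two", "three", "four"]
columns_2 = ["one", "three", "four", "two", "extra"]

idx = column_indices(columns_2, columns_1)
print(idx)  # [0, 3, 1, 2]
```

The resulting list can then be used as array[:, idx]; the "extra" column is dropped because it never appears in columns_1.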
CodePudding user response:
list.index(item) returns the first index that matches item in a list. You can do this:
[columns_2.index(col) for col in columns_1]
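Then use the resulting list to index your array, as in the second line of your solution. This avoids building a dictionary over every element of columns_2, including the ones you don't need, so it can be faster for short lists.
Each call to index is a linear scan, though, so for long lists the dictionary lookup is likely to scale better; the size of the benefit depends on the list lengths and their overlap. If a name from columns_1 is missing from columns_2, index raises ValueError, so the requirement to fail is met.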
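As a rough check, the sketch below applies both lookups to a small toy array and confirms they select the same columns. The array contents are made up for illustration, and it assumes the array's columns follow the order of columns_2.

```python
import numpy as np

columns_1 = ["one", "two", "three", "four"]
columns_2 = ["one", "three", "four", "two", "extra"]

# Toy data: 2 rows, one column per entry of columns_2.
array = np.arange(10).reshape(2, 5)

# Dictionary lookup (from the question): KeyError on a missing name.
column_idx = {col: idx for idx, col in enumerate(columns_2)}
idx_dict = [column_idx[col] for col in columns_1]

# list.index lookup (from the answer): ValueError on a missing name.
idx_index = [columns_2.index(col) for col in columns_1]

# Both give the same positions when names in columns_2 are unique.
assert idx_dict == idx_index

reordered = array[:, idx_index]
print(reordered)
# [[0 3 1 2]
#  [5 8 6 7]]
```

If columns_2 contained duplicate names, the two would differ: the dictionary keeps the last position of a repeated name, while index returns the first. To compare speed on realistic sizes, timing both list comprehensions with timeit on your own data is more reliable than reasoning from list length alone.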