Is there a way to split a list into smaller lists depending on a certain value? To clarify what I want, I have a list of NumPy arrays like this:
list_full = [[[some data_1], [1 0 0 0 0]],
[[some data_2], [0 1 0 0 0]],
[[some data_3], [1 0 0 0 0]],
[[some data_4], [0 1 0 0 0]]]
I want to get lists like this:
list_1 = [[[some data_1], [1 0 0 0 0]], [[some data_3], [1 0 0 0 0]]]
list_2 = [[[some data_2], [0 1 0 0 0]], [[some data_4], [0 1 0 0 0]]]
As you can see, the items are regrouped according to list_full[i][1].
I have this code, which I found in the topic "How to split a list into smaller lists python":
from itertools import groupby
lst = [['ID1', 'A'],['ID1','B'],['ID2','AAA'], ['ID2','DDD']]
grp_lists = []
for i, grp in groupby(lst, key=lambda x: x[0]):
    grp_lists.append(list(grp))
But it didn't work for me because I have a NumPy array as x[0]. When I run it, I get this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Thank you.
CodePudding user response:
To avoid the error, you can convert the np.array to a list when it is used as the key. You also need to use x[1] as the key and sort the list before grouping it:
from itertools import groupby
import numpy as np

list_full = [[['some data_1'], np.array([1, 0, 0, 0, 0])],
[['some data_2'], np.array([0, 1, 0, 0, 0])],
[['some data_3'], np.array([1, 0, 0, 0, 0])],
[['some data_4'], np.array([0, 1, 0, 0, 0])]]
list_full.sort(key=lambda x: list(x[1]), reverse=True)
new_list = [list(g) for _, g in groupby(list_full, key=lambda x: list(x[1]))]
print(new_list[0]) # [[['some data_1'], array([1, 0, 0, 0, 0])], [['some data_3'], array([1, 0, 0, 0, 0])]]
print(new_list[1]) # [[['some data_2'], array([0, 1, 0, 0, 0])], [['some data_4'], array([0, 1, 0, 0, 0])]]
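As a rough illustration of why the conversion helps: groupby compares consecutive keys with ==, and comparing two NumPy arrays yields an element-wise array rather than a single True/False. The sketch below reproduces that with two standalone arrays (the names a, b, elementwise, error_message and same_as_list are only for illustration).

```python
import numpy as np

a = np.array([1, 0, 0, 0, 0])
b = np.array([0, 1, 0, 0, 0])

# Comparing two arrays gives one boolean per element, not a single bool.
elementwise = (a == b)

# Asking for the truth value of that result raises the error from the question.
try:
    bool(elementwise)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Plain lists compare as a whole, so they are safe to use as groupby keys.
same_as_list = (list(a) == list(b))

print(elementwise)
print(error_message)
print(same_as_list)
```

A tuple key (tuple(x[1])) should work the same way as the list key and is also hashable, which can be handy if you later want to use the key in a dict.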
CodePudding user response:
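I solved this by using the class index instead of the one-hot encoding, so I added np.argmax as the key. Note that groupby only groups consecutive items, so the list has to be sorted by the same key first:
key = lambda x: np.argmax(x[1], axis=-1)
grp_lists = []
for i, grp in groupby(sorted(list_full, key=key), key=key):
    grp_lists.append(list(grp))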
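If sorting is not wanted, one possible variant of the same idea is to collect the items in a dict keyed by the class index. This is only a sketch using the sample data from the question; the name groups is an assumption made for the example, and it relies on every x[1] being a one-hot vector.

```python
from collections import defaultdict

import numpy as np

list_full = [[['some data_1'], np.array([1, 0, 0, 0, 0])],
             [['some data_2'], np.array([0, 1, 0, 0, 0])],
             [['some data_3'], np.array([1, 0, 0, 0, 0])],
             [['some data_4'], np.array([0, 1, 0, 0, 0])]]

# Map each class index to the items whose one-hot vector points at it.
groups = defaultdict(list)
for item in list_full:
    class_index = int(np.argmax(item[1]))  # position of the 1 in the one-hot vector
    groups[class_index].append(item)

for class_index in sorted(groups):
    print(class_index, [item[0] for item in groups[class_index]])
```

Here groups[0] would play the role of list_1 and groups[1] the role of list_2, and the items keep their original relative order.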
CodePudding user response:
If you want to use itertools.groupby you can convert the numpy array to a list and perform the splitting. But make sure to sort the list first.
from itertools import groupby
import numpy as np
# dtype=object is needed because the inner lists have different lengths
np_list = np.array(
    [[['a', 'a'], [1, 0, 0, 0, 0]],
     [['b', 'b'], [0, 1, 0, 0, 0]],
     [['x', 'x'], [1, 0, 0, 0, 0]],
     [['d', 'd'], [0, 1, 0, 0, 0]]], dtype=object)
lst = np_list.tolist()
lst.sort(key=lambda x: x[1])
lst_sorted = []
for i, grp in groupby(lst, key=lambda x: x[1]):
    lst_sorted.append(list(grp))
In the above example your initial list is split in the following way:
[[['b', 'b'], [0, 1, 0, 0, 0]], [['d', 'd'], [0, 1, 0, 0, 0]]]
[[['a', 'a'], [1, 0, 0, 0, 0]], [['x', 'x'], [1, 0, 0, 0, 0]]]
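To see why the sort matters, here is a small sketch that runs groupby on the same data with and without sorting. The names unsorted_groups and sorted_groups are just for this example; the data is written as plain lists so NumPy is not needed.

```python
from itertools import groupby

lst = [[['a', 'a'], [1, 0, 0, 0, 0]],
       [['b', 'b'], [0, 1, 0, 0, 0]],
       [['x', 'x'], [1, 0, 0, 0, 0]],
       [['d', 'd'], [0, 1, 0, 0, 0]]]


def key(item):
    """Group on the one-hot vector stored in the second position."""
    return item[1]


# Without sorting, equal keys are not adjacent, so every item is its own group.
unsorted_groups = [list(g) for _, g in groupby(lst, key=key)]

# After sorting by the same key, equal keys are adjacent and get merged.
sorted_groups = [list(g) for _, g in groupby(sorted(lst, key=key), key=key)]

print(len(unsorted_groups))
print(len(sorted_groups))
```

groupby starts a new group every time the key changes, which is why the alternating input has to be sorted before grouping.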