Split a list into smaller lists depending on a certain value

Time:03-18

Is there a way to split a list into smaller lists depending on a certain value? To clarify what I want, I have a list of NumPy arrays like this:

list_full = [[[some data_1], [1 0 0 0 0]], 
             [[some data_2], [0 1 0 0 0]],
             [[some data_3], [1 0 0 0 0]], 
             [[some data_4], [0 1 0 0 0]]]

I want to get lists like this:

list_1 = [[[some data_1], [1 0 0 0 0]], [[some data_3], [1 0 0 0 0]]] 
list_2 = [[[some data_2], [0 1 0 0 0]], [[some data_4], [0 1 0 0 0]]] 

As you can see, the items are regrouped according to list_full[i][1].

I have this code, which I found in the topic "How to split a list into smaller lists python":

from itertools import groupby

lst = [['ID1', 'A'],['ID1','B'],['ID2','AAA'], ['ID2','DDD']]
grp_lists = []
for i, grp in groupby(lst, key= lambda x: x[0]):
    grp_lists.append(list(grp))

But it didn't work for me because I have a numpy array as x[0]. When I run it, I get this error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Thank you.

CodePudding user response:

To avoid the error you can convert the np.array to a list when it's the key. You also need to use x[1] as the key and sort the list before grouping it, because groupby only groups consecutive items that share the same key.

from itertools import groupby
import numpy as np

list_full = [[['some data_1'], np.array([1, 0, 0, 0, 0])],
             [['some data_2'], np.array([0, 1, 0, 0, 0])],
             [['some data_3'], np.array([1, 0, 0, 0, 0])],
             [['some data_4'], np.array([0, 1, 0, 0, 0])]]

list_full.sort(key=lambda x: list(x[1]), reverse=True)
new_list = [list(g) for _, g in groupby(list_full, key=lambda x: list(x[1]))]

print(new_list[0]) # [[['some data_1'], array([1, 0, 0, 0, 0])], [['some data_3'], array([1, 0, 0, 0, 0])]]
print(new_list[1]) # [[['some data_2'], array([0, 1, 0, 0, 0])], [['some data_4'], array([0, 1, 0, 0, 0])]]
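
As a minimal sketch of why the conversion matters: groupby compares consecutive keys with ==, and for NumPy arrays that comparison returns an array rather than a single True/False, which triggers the ValueError. The sample data and the names data, raised, and key below are only illustrative assumptions.

```python
from itertools import groupby
import numpy as np

# Hypothetical sample data mirroring the layout in the question.
data = [[['some data_1'], np.array([1, 0, 0, 0, 0])],
        [['some data_2'], np.array([0, 1, 0, 0, 0])],
        [['some data_3'], np.array([1, 0, 0, 0, 0])]]

# With array keys, groupby's equality check produces an array,
# and taking its truth value raises ValueError.
try:
    list(groupby(data, key=lambda x: x[1]))
    raised = False
except ValueError:
    raised = True

# Tuple keys compare to a single bool, so sorting and grouping both work.
key = lambda x: tuple(x[1].tolist())
groups = [list(g) for _, g in groupby(sorted(data, key=key), key=key)]
```

Here the keys are sorted in ascending order, so the [0, 1, 0, 0, 0] group comes first unless reverse=True is passed to the sort.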

CodePudding user response:

I solved that by using the actual class index instead of the one-hot encoding. So I added np.argmax as follows:

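grp_lists = []
list_full.sort(key=lambda x: np.argmax(x[1]))  # groupby only merges adjacent items with equal keys
for i, grp in groupby(list_full, key=lambda x: np.argmax(x[1], axis=-1)):
    grp_lists.append(list(grp))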
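
The same argmax idea can also be sketched without groupby, by computing one class index per item and collecting the items for each distinct index. This is only an illustration under the assumption that every x[1] is a one-hot vector; the labels variable is a name introduced here, not part of the original code.

```python
import numpy as np

# Sample data in the same layout as the question.
list_full = [[['some data_1'], np.array([1, 0, 0, 0, 0])],
             [['some data_2'], np.array([0, 1, 0, 0, 0])],
             [['some data_3'], np.array([1, 0, 0, 0, 0])],
             [['some data_4'], np.array([0, 1, 0, 0, 0])]]

# Class index of each one-hot vector, e.g. [1, 0, 0, 0, 0] -> 0.
labels = np.array([np.argmax(item[1]) for item in list_full])

# One sub-list per distinct class index, in ascending index order.
grp_lists = [[item for item, lab in zip(list_full, labels) if lab == cls]
             for cls in np.unique(labels)]
```

No sorting is needed here, and items keep their original relative order inside each group.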

CodePudding user response:

If you want to use itertools.groupby you can convert the numpy array to a list and perform the splitting. But make sure to sort the list first.

from itertools import groupby
import numpy as np

np_list = np.array(
            [[['a', 'a'], [1, 0, 0, 0, 0]],
             [['b', 'b'], [0, 1, 0, 0, 0]],
             [['x', 'x'], [1, 0, 0, 0, 0]],
             [['d', 'd'], [0, 1, 0, 0, 0]]],
            dtype=object)  # inner lists differ in length, so recent NumPy needs an object array
             
lst = np_list.tolist()

lst.sort(key=lambda x: x[1])

lst_sorted = []
for i, grp in groupby(lst, key= lambda x: x[1]):
    lst_sorted.append(list(grp))

In the above example your initial list is split in the following way:

[[['b', 'b'], [0, 1, 0, 0, 0]], [['d', 'd'], [0, 1, 0, 0, 0]]]
[[['a', 'a'], [1, 0, 0, 0, 0]], [['x', 'x'], [1, 0, 0, 0, 0]]]
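
If sorting the data is undesirable, one possible alternative is to collect the items in a dictionary keyed by the one-hot vector. The sketch below assumes plain Python lists as in the example above; groups and result are illustrative names. Lists are not hashable, so the key is converted to a tuple.

```python
from collections import defaultdict

lst = [[['a', 'a'], [1, 0, 0, 0, 0]],
       [['b', 'b'], [0, 1, 0, 0, 0]],
       [['x', 'x'], [1, 0, 0, 0, 0]],
       [['d', 'd'], [0, 1, 0, 0, 0]]]

groups = defaultdict(list)
for item in lst:
    # Use a tuple as the dictionary key because lists cannot be hashed.
    groups[tuple(item[1])].append(item)

# Groups appear in the order their key was first seen in lst.
result = list(groups.values())
```

Unlike the sorted groupby version, this keeps the groups in first-seen order, so the [1, 0, 0, 0, 0] group comes first for this input.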