Python group by without Pandas' proprietary GroupBy

Time:09-02

I have an output like this:

{((0, 0), 'up'): 'v_1_0',
 ((0, 0), 'down'): 'v_0_0',
 ((0, 0), 'left'): 'v_0_0',
 ((0, 0), 'right'): 'v_0_1',
 ((0, 1), 'up'): 'v_1_1',
 ((0, 1), 'down'): 'v_0_1',
 ((0, 1), 'left'): 'v_0_0',
 ((0, 1), 'right'): 'v_0_2',
 ((0, 2), 'up'): 'v_1_2',
 ((0, 2), 'down'): 'v_0_2',
 ((0, 2), 'left'): 'v_0_1',
 ((0, 2), 'right'): 'v_0_3',
 ((0, 3), 'up'): 'v_1_3',
 ((0, 3), 'down'): 'v_0_3',
 ((0, 3), 'left'): 'v_0_2',
 ((0, 3), 'right'): 'v_0_3',
.
.
.
.

I'd like to arrange it like this:

{(0, 0): ['v_1_0', 'v_0_0', 'v_0_0', 'v_0_1'],
(0, 1): ['v_1_1', 'v_0_1', 'v_0_0', 'v_0_2'],
.
.

So, basically, I want to discard the 'up', 'down', 'left', and 'right' parts of each key and collect the values into a list under the remaining (row, column) tuple.

I know I can use pandas' GroupBy to do this, but I'd like to avoid an extra module import if I can.

I've tried approaching it in different ways but nothing has worked. Any help is immensely appreciated.

CodePudding user response:

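Try dict.setdefault, which creates an empty list the first time a key is seen and returns the existing list afterwards: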

dct = {
    ((0, 0), "up"): "v_1_0",
    ((0, 0), "down"): "v_0_0",
    ((0, 0), "left"): "v_0_0",
    ((0, 0), "right"): "v_0_1",
    ((0, 1), "up"): "v_1_1",
    ((0, 1), "down"): "v_0_1",
    ((0, 1), "left"): "v_0_0",
    ((0, 1), "right"): "v_0_2",
    ((0, 2), "up"): "v_1_2",
    ((0, 2), "down"): "v_0_2",
    ((0, 2), "left"): "v_0_1",
    ((0, 2), "right"): "v_0_3",
    ((0, 3), "up"): "v_1_3",
    ((0, 3), "down"): "v_0_3",
    ((0, 3), "left"): "v_0_2",
    ((0, 3), "right"): "v_0_3",
}

out = {}
for (t, _), v in dct.items():
    out.setdefault(t, []).append(v)

print(out)

Prints:

{
    (0, 0): ["v_1_0", "v_0_0", "v_0_0", "v_0_1"],
    (0, 1): ["v_1_1", "v_0_1", "v_0_0", "v_0_2"],
    (0, 2): ["v_1_2", "v_0_2", "v_0_1", "v_0_3"],
    (0, 3): ["v_1_3", "v_0_3", "v_0_2", "v_0_3"],
}
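A closely related variant, not part of the answer above, uses collections.defaultdict from the standard library instead of setdefault. This is only a sketch on a shortened copy of the data; the names grouped, cell and _direction are illustrative choices.

```python
from collections import defaultdict

# Shortened copy of the question's dictionary, for illustration only.
dct = {
    ((0, 0), "up"): "v_1_0",
    ((0, 0), "down"): "v_0_0",
    ((0, 1), "up"): "v_1_1",
    ((0, 1), "down"): "v_0_1",
}

# defaultdict(list) creates an empty list the first time a key is accessed.
grouped = defaultdict(list)
for (cell, _direction), value in dct.items():
    grouped[cell].append(value)

# Convert back to a plain dict so missing keys raise KeyError again.
out = dict(grouped)
print(out)
```

Both approaches keep the values in the order in which they appear in the original dictionary.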

CodePudding user response:

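You can also use groupby from the standard library's itertools module. Note that it only groups consecutive items with the same key; it works here because a dict preserves insertion order and all entries for each (row, column) tuple are already adjacent.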

itertools.groupby(iterable, key=None)

Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

>>> from itertools import groupby
>>> from pprint import pprint
>>> dct = {((0, 0), 'up'): 'v_1_0',
...  ((0, 0), 'down'): 'v_0_0',
...  ((0, 0), 'left'): 'v_0_0',
...  ((0, 0), 'right'): 'v_0_1',
...  ((0, 1), 'up'): 'v_1_1',
...  ((0, 1), 'down'): 'v_0_1',
...  ((0, 1), 'left'): 'v_0_0',
...  ((0, 1), 'right'): 'v_0_2',
...  ((0, 2), 'up'): 'v_1_2',
...  ((0, 2), 'down'): 'v_0_2',
...  ((0, 2), 'left'): 'v_0_1',
...  ((0, 2), 'right'): 'v_0_3',
...  ((0, 3), 'up'): 'v_1_3',
...  ((0, 3), 'down'): 'v_0_3',
...  ((0, 3), 'left'): 'v_0_2',
...  ((0, 3), 'right'): 'v_0_3',
...
... }
>>> pprint({l:[x[1] for x in group] for l,group in groupby(dct.items(),key=lambda x:x[0][0])})
{(0, 0): ['v_1_0', 'v_0_0', 'v_0_0', 'v_0_1'],
 (0, 1): ['v_1_1', 'v_0_1', 'v_0_0', 'v_0_2'],
 (0, 2): ['v_1_2', 'v_0_2', 'v_0_1', 'v_0_3'],
 (0, 3): ['v_1_3', 'v_0_3', 'v_0_2', 'v_0_3']}
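If the entries for one tuple are not adjacent in the dictionary, groupby starts a new group each time the key changes, and the dict comprehension keeps only the last run for each key. The following sketch uses a deliberately interleaved, shortened dictionary (made up for illustration) and sorts the items on the same key function first; the helper name cell is an assumption, not something from the answer.

```python
from itertools import groupby

# Made-up, interleaved data: entries for (0, 0) and (0, 1) alternate.
dct = {
    ((0, 0), "up"): "v_1_0",
    ((0, 1), "up"): "v_1_1",
    ((0, 0), "down"): "v_0_0",
    ((0, 1), "down"): "v_0_1",
}

def cell(item):
    """Return the (row, column) part of a ((row, column), direction) key."""
    return item[0][0]

# Without sorting, each key forms two separate runs and the later run
# overwrites the earlier one in the resulting dict.
unsorted_result = {
    k: [v for _, v in group] for k, group in groupby(dct.items(), key=cell)
}

# sorted() is stable, so values keep their original relative order per key.
sorted_result = {
    k: [v for _, v in group]
    for k, group in groupby(sorted(dct.items(), key=cell), key=cell)
}

print(unsorted_result)
print(sorted_result)
```

Sorting costs O(n log n), so for unsorted input the plain loop with setdefault is usually the simpler choice.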

CodePudding user response:

This should get what you're looking for by just looping through your dictionary (called output here):

groups = {}
for k, v in output.items():
    # k is ((row, col), direction); group on the (row, col) part
    if k[0] not in groups:
        groups[k[0]] = []
    groups[k[0]].append(v)