Group data in a list with missing values as delimiter

Time:10-11

I have an extended question following this question: I want to group data in a list with missing values as delimiter.

For example I have a list like this:

from math import nan

data = [1, 2, 3, nan, nan, nan, 4, 5, 6, nan, nan, 7, nan, 8, 9, nan, 0, 0, 'hello']

from itertools import groupby

g = groupby(data, key=type)  # group by type of the data, int != float

groups = [list(a[1]) for a in g if a[0] != float]  

print(groups)  

I got this result:

[[1, 2, 3], [4, 5, 6], [7], [8, 9], [0, 0], ['hello']]

Expected results:

[[1, 2, 3], [4, 5, 6], [7], [8, 9], [0, 0, 'hello']]

I tried adjusting the groups variable like this, but it did not work:

groups = [list(a[1]) for a in g if a[0] != float or a[0] == str]

How can I adjust this so that integers and strings can end up in the same group?

CodePudding user response:

You can use isinstance() in the groupby key function, so that every non-float value gets the same key:

from itertools import groupby

g = groupby(data, key=lambda v: isinstance(v, float))
groups = [list(a[1]) for a in g if not a[0]]
print(groups)

Prints:

[[1, 2, 3], [4, 5, 6], [7], [8, 9], [0, 0, 'hello']]
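
To see why this works where key=type did not, it can help to print the key of each run that groupby produces. The sketch below assumes the data list from the question; the names runs_by_type and runs_by_float are only illustrative. With key=type the trailing 0, 0 and 'hello' land in separate runs, and filtering on a[0] can only keep or drop whole runs, not merge them. With the isinstance key there are only two possible keys, so adjacent ints and strings stay together.

```python
from itertools import groupby
from math import nan

data = [1, 2, 3, nan, nan, nan, 4, 5, 6, nan, nan, 7, nan, 8, 9, nan, 0, 0, 'hello']

# key=type: one run per consecutive type, so int and str are split apart
runs_by_type = [(key.__name__, list(run)) for key, run in groupby(data, key=type)]
print(runs_by_type[-2:])  # [('int', [0, 0]), ('str', ['hello'])]

# key=isinstance(v, float): the key is either True (float) or False (anything else)
runs_by_float = [
    (key, list(run))
    for key, run in groupby(data, key=lambda v: isinstance(v, float))
]
print(runs_by_float[-1])  # (False, [0, 0, 'hello'])
```

Note that this key treats every float as a delimiter, so a real value such as 2.5 would also split the list.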

CodePudding user response:

from math import nan
data = [1, 2, 3, nan, nan, nan, 4, 5, 6, nan, nan, 7, nan, 8, 9, nan, 0, 0, 'hello']

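# Start with one empty group; values are appended to the last group.
result = [[]]

for value in data:
    if value is not nan:
        # Regular value: add it to the current group.
        result[-1].append(value)
    elif result[-1]:
        # First nan after a non-empty group: open a new group.
        # Consecutive nans are skipped because the last group is still empty.
        result.append([])

# If the data ended with nan, drop the trailing empty group.
if not result[-1]:
    del result[-1]
print(result)  # [[1, 2, 3], [4, 5, 6], [7], [8, 9], [0, 0, 'hello']]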
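
One caveat with the identity test: value is not nan only recognizes the single math.nan object. A NaN created elsewhere, for example with float('nan'), is a different object, and NaN never compares equal to anything, so == does not help either. Below is a minimal sketch of a more tolerant check. The helpers is_missing and split_on_missing are hypothetical names, not part of the answer above.

```python
from math import isnan, nan


def is_missing(value):
    """Hypothetical helper: True only for float NaN values."""
    return isinstance(value, float) and isnan(value)


def split_on_missing(values):
    """Same loop as above, but using is_missing instead of `is nan`."""
    result = [[]]
    for value in values:
        if not is_missing(value):
            result[-1].append(value)
        elif result[-1]:
            result.append([])
    if not result[-1]:
        del result[-1]
    return result


other_nan = float("nan")  # a distinct NaN object
print(other_nan is nan)   # False
print(split_on_missing([1, 2.5, other_nan, nan, 0, 'hello']))
# [[1, 2.5], [0, 'hello']]
```

With this check, ordinary floats such as 2.5 stay in their group and only NaN values act as delimiters.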

CodePudding user response:

You can do it with groupby and filter:

In [1]: list(filter(lambda x: x[0] is not nan, [list(l) for _, l in groupby(data, key=lambda x: x is nan)]))
Out[1]: [[1, 2, 3], [4, 5, 6], [7], [8, 9], [0, 0, 'hello']]