Complicated list comprehension with many loops in Python

Time:11-25

I am currently working with list comprehensions and ran into a problem as the number of loops grows. My code so far is as follows:

selected_sheet_names = []
selected_sheet_names.append([x for x in sheet_names if x.endswith("b1")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b2")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b3")])

The sheet_names list contains different strings, all of which end with b1, b2, or b3. If you want to check them in your code:

sheet_names = ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1', 
'0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', 
'1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2', '1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', 
'1C_6_b3', '1C_7_b3', '1C_8_b3']

And if I print(selected_sheet_names), the result is as follows:

[
    ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

This is exactly what I expected. However, if I want more x.endswith(some_string) checks like those in the first code block, the code becomes too long. I therefore think I should replace the repeated selected_sheet_names.append([x for x in sheet_names if x.endswith(some_string)]) calls with a single, more complicated list comprehension that iterates over some_list and does the same thing.

some_list = ["b1", "b2", "b3" ... ]

Could someone please suggest something?

EDIT 1: I know that I can implement it with a for loop, but in this example I am specifically interested in a list comprehension implementation, if possible. The for loop could be as follows:

selected_sheet_names = []
for ending in some_list:
    selected_sheet_names.append([x for x in sheet_names if x.endswith(ending)])

EDIT 2 (Thanks to Pedro Maia):

If the data is contiguous (which is not my case), you can go with:

from itertools import groupby

selected_sheet_names = [list(l[1]) for l in groupby(sheet_names, lambda x: x[-2:])]

My mistake: the list I showed you happened to be contiguous. If your data is not contiguous, the output may look something like this:

[
    ['0.11C_1_b2'], 
    ['0.5C_1_b1'], 
    ['0.57C_1_b2'], 
    ['0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

However, if your data IS contiguous, this method seems better.

Thank you all for the replies!

CodePudding user response:

A simple nested list comprehension matching your suggested form would loop over an anonymous tuple of the strings to check for:

selected_sheet_names = [[x for x in sheet_names if x.endswith(some_string)]
                        for some_string in ("b1", "b2", "b3")]

If you get some_list from somewhere else, or the tuple gets too long to comfortably define inline, you can replace the anonymous tuple with some_list.
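
As a quick check of the some_list variant, here is a minimal sketch. It assumes a shortened, deliberately shuffled sample of the sheet names from the question and a three-item some_list; the variable names name, ending, and expected are only illustrative. It compares the nested comprehension against the explicit for loop from EDIT 1.

```python
# Shortened, shuffled sample of the sheet names from the question (assumption).
sheet_names = ['0.5C_1_b1', '0.11C_1_b2', '1C_1_b3',
               '1C_1_b1', '1.14C_1_b2', '1C_2_b3']
some_list = ["b1", "b2", "b3"]

# The outer comprehension walks the endings; the inner one filters
# sheet_names once per ending, so the input order does not matter.
selected_sheet_names = [
    [name for name in sheet_names if name.endswith(ending)]
    for ending in some_list
]

# The explicit loop from the question, kept for comparison.
expected = []
for ending in some_list:
    expected.append([x for x in sheet_names if x.endswith(ending)])

print(selected_sheet_names == expected)
```

Because each ending triggers its own pass over sheet_names, the result always has one sublist per entry in some_list, in the order of some_list, even when no name matches (that sublist is then empty).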

CodePudding user response:

Alternatively, you can use groupby from the standard-library itertools module:

from itertools import groupby

selected_sheet_names = [list(l[1]) for l in groupby(sheet_names, lambda x: x[-2:])]

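This gives cleaner and faster code, since sheet_names is traversed only once instead of once per ending. Note that groupby only merges consecutive items with the same key, so this works as-is only when names with the same ending are adjacent in the list.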
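
If the names are not already grouped by ending, one possible workaround is to sort by the same key before calling groupby. The sketch below is an illustration rather than part of the original answer: the suffix helper is hypothetical, the sample list is a shortened and shuffled version of the question's data, and it assumes every ending is exactly two characters long (as with b1, b2, b3).

```python
from itertools import groupby


def suffix(name):
    """Hypothetical helper: return the last two characters, e.g. 'b1'."""
    return name[-2:]


# Shortened, shuffled sample of the sheet names from the question (assumption).
sheet_names = ['0.5C_1_b1', '0.11C_1_b2', '1C_1_b3',
               '1C_1_b1', '1.14C_1_b2', '1C_2_b3']

# Without sorting, groupby starts a new group every time the key changes,
# so shuffled input is split into many small groups.
unsorted_groups = [list(group) for _, group in groupby(sheet_names, key=suffix)]

# sorted() is stable, so names keep their original relative order
# within each ending after sorting by suffix.
grouped = [
    list(group)
    for _, group in groupby(sorted(sheet_names, key=suffix), key=suffix)
]

print(len(unsorted_groups))
print(grouped)
```

Sorting adds an O(n log n) step, so this trades the single pass for correctness on unordered input. Unlike the nested comprehension, the groups come out in sorted key order, and only endings that actually occur in the data produce a group.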
