Complicated list comprehension with many loops in Python

I am working with list comprehensions and ran into a problem when increasing the number of loops. My code so far is as follows:

selected_sheet_names = []
selected_sheet_names.append([x for x in sheet_names if x.endswith("b1")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b2")])
selected_sheet_names.append([x for x in sheet_names if x.endswith("b3")])

The sheet_names list contains strings that all end with b1, b2, or b3. If you want to try it in your own code:

sheet_names = ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1', 
'0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', 
'1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2', '1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', 
'1C_6_b3', '1C_7_b3', '1C_8_b3']

If I print(selected_sheet_names), the result is as follows:

[
    ['0.5C_1_b1', '0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['0.11C_1_b2', '0.57C_1_b2', '1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

This is exactly what I expected. However, if I want more x.endswith(some_string) checks like those in the first code block, the code becomes too bulky. I think I should replace the repeated selected_sheet_names.append([x for x in sheet_names if x.endswith(some_string)]) lines with a single, more complicated list comprehension that iterates over some_list and does the same thing.

some_list = ["b1", "b2", "b3" ... ]

Could someone please suggest an approach?

EDIT 1: I know that I can implement it with a for loop, but in this example I am specifically interested in a list comprehension implementation, if possible. The for loop could look like this:

selected_sheet_names = []
for ending in some_list:
    selected_sheet_names.append([x for x in sheet_names if x.endswith(ending)])

EDIT 2 (Thanks to Pedro Maia):

If the data is contiguous (which is not my case), you can use:

from itertools import groupby

selected_sheet_names = [list(l[1]) for l in groupby(sheet_names, lambda x: x[-2:])]

My mistake: the example list I posted happens to be contiguous. If your data is not contiguous, the output may look something like this:

[
    ['0.11C_1_b2'], 
    ['0.5C_1_b1'], 
    ['0.57C_1_b2'], 
    ['0.5C_2_b1', '1C_1_b1', '1C_2_b1', '1C_3_b1', '1C_4_b1', '1C_5_b1'], 
    ['1.14C_1_b2', '1.14C_2_b2', '1.14C_3_b2', '1.14C_4_b2', '1.14C_5_b2', '1.14C_6_b2', '1.14C_7_b2', '1.14C_8_b2'], 
    ['1C_1_b3', '1C_2_b3', '1C_3_b3', '1C_4_b3', '1C_5_b3', '1C_6_b3', '1C_7_b3', '1C_8_b3']
]

However, if your data IS contiguous, this method seems better.

Thank you all for the replies!

CodePudding user response：

A simple nested list comprehension matching your suggested form loops over an anonymous tuple of the suffixes to check for:

selected_sheet_names = [[x for x in sheet_names if x.endswith(some_string)]
                        for some_string in ("b1", "b2", "b3")]

If some_list comes from somewhere else, or the suffixes become too many to define comfortably inline, you can replace the anonymous tuple with some_list.
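
As a minimal sketch of that variant, assuming some_list is an ordinary list of suffixes defined earlier (the shortened sheet_names below and the extra "b4" suffix are illustrative assumptions, not part of the question's data):

```python
# Illustrative subset of the sheet names from the question.
sheet_names = ['0.5C_1_b1', '0.11C_1_b2', '1C_1_b1', '1C_1_b3', '1.14C_1_b2']

# Suffixes to group by; "b4" is a hypothetical entry that matches nothing.
some_list = ["b1", "b2", "b3", "b4"]

# The outer loop walks the suffixes; the inner loop filters the names for each one.
selected_sheet_names = [[x for x in sheet_names if x.endswith(ending)]
                        for ending in some_list]

print(selected_sheet_names)
```

The result follows the order of some_list, and a suffix with no matching names produces an empty list. The names are scanned once per suffix, which is usually fine for a short list of suffixes.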

CodePudding user response：

Alternatively, you can use groupby from the standard-library itertools module:

from itertools import groupby

# groupby yields (key, group) pairs; only the group is needed here
selected_sheet_names = [list(group) for _, group in groupby(sheet_names, key=lambda x: x[-2:])]

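This gives cleaner and faster code, since sheet_names is traversed only once instead of once per suffix. Note that groupby only merges consecutive items with the same key, so it gives the expected result only when the list is already grouped by suffix.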
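
For input that is not already grouped by suffix, one possible single-pass alternative is to collect the names in a dictionary. This is only a sketch: it swaps in collections.defaultdict rather than groupby, and the helper name suffix, the variable names, and the shuffled sample list are assumptions for illustration.

```python
from collections import defaultdict
from itertools import groupby

# Illustrative, deliberately non-contiguous sample (not the question's full list).
sheet_names = ['0.11C_1_b2', '0.5C_1_b1', '0.57C_1_b2', '1C_1_b1', '1C_1_b3']

def suffix(name):
    # Assumes the suffix is always the last two characters, as in the question.
    return name[-2:]

# groupby alone starts a new group every time the key changes.
runs = [list(group) for _, group in groupby(sheet_names, key=suffix)]

# Single pass, independent of input order: collect the names per suffix.
buckets = defaultdict(list)
for name in sheet_names:
    buckets[suffix(name)].append(name)

# Emit the groups in sorted suffix order.
selected_sheet_names = [buckets[key] for key in sorted(buckets)]

print(runs)
print(selected_sheet_names)
```

Here runs contains five single-item lists because the suffix changes at every step, while selected_sheet_names holds one list per suffix. Sorting sheet_names by suffix before calling groupby would also work, at the cost of the sort.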