Outer loop of two lists comprehension

I know it's possible to run a list comprehension over all combinations of two lists. For example:

letters = ['A', 'B', 'C']
numbers = [1,2,3]

def concat(letter, number):
    return letter + str(number)

These can be combined using:

combinations = [concat(letter, number) for letter in letters for number in numbers]

Which has the same output as:

combinations = []
for letter in letters:
    for number in numbers:
        combinations.append(concat(letter, number))

Producing:

['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']

I'm trying to clean a defined set of characters from a list of strings. For instance:

unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]

clean_lines = []

for line in raw_lines:
    for char in unwanted:
        line = line.replace(char, '')
    clean_lines.append(line)

outputs:

['phrase1', 'phrase2']

I want to refactor it using a list comprehension, but my attempt fails because it produces every combination of line and removed character:

clean_lines = [line.replace(char, '') for char in unwanted for line in raw_lines]

outputs:

['phrase1', 'phr@ase2', 'phra$se1', 'phrase2']

I understand why this happens; it's obvious after thinking about the letters and numbers combinations. The list comprehension behaves like nested loops written as:

clean_lines = []
for line in raw_lines:
    for char in unwanted:
        clean_lines.append(line.replace(char, ''))

Which outputs the same elements, in a different order:

['phrase1', 'phra$se1', 'phr@ase2', 'phrase2']

Is there a workaround for accessing the "outer loop" when using list comprehension?

CodePudding user response:

You may find the re module helpful for this. For example:

import re
unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]
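# Build a character class such as [\$@]. Inside [...] a "|" is a literal
# character, so joining with "|" would also strip "|" from the lines.
expr = f'[{"".join(re.escape(c) for c in unwanted)}]'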
clean_lines = [re.sub(expr, '', line) for line in raw_lines]
print(clean_lines)

Output:

['phrase1', 'phrase2']
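
As a rough sketch of why the escaping matters, the same idea can be tried with characters that are special inside a character class. The wider `unwanted` list and the `sample` string here are made up for illustration; only `re.escape`, `re.compile` and `sub` come from the standard library.

```python
import re

# Hypothetical wider set of unwanted characters, including ones that are
# special inside a character class ('^' and ']').
unwanted = ['$', '@', '^', ']']

# One character class for all of them; re.escape keeps '^' and ']' literal.
pattern = re.compile('[' + ''.join(re.escape(c) for c in unwanted) + ']')

sample = 'ph^ra$se]1@'  # made-up input
cleaned = pattern.sub('', sample)
print(cleaned)
```

Compiling the pattern once is optional; it mainly helps when the same set of characters is applied to many lines.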

CodePudding user response:

I would do it like this:

unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]
clean_lines = ["".join([ch for ch in line if ch not in unwanted]) 
               for line in raw_lines]
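
If the goal is to keep the repeated `replace` calls from the question, one possible sketch is to fold the inner loop with `functools.reduce`, so the comprehension only iterates over the lines. This is an alternative to the approach above, not a replacement for it. The `str.translate` variant assumes every entry in `unwanted` is a single character.

```python
from functools import reduce

unwanted = ['$', '@']
raw_lines = ['phra$se1', 'phr@ase2']

# reduce applies replace once per unwanted character and carries the
# partially cleaned string forward, like `line = line.replace(char, '')`
# in the original nested loop.
clean_lines = [
    reduce(lambda s, ch: s.replace(ch, ''), unwanted, line)
    for line in raw_lines
]

# Variant: a translation table mapping each unwanted character to None.
# Assumes every entry in `unwanted` is exactly one character long.
table = str.maketrans('', '', ''.join(unwanted))
clean_lines_translate = [line.translate(table) for line in raw_lines]

print(clean_lines)
print(clean_lines_translate)
```

Both variants keep one result per line, because the loop over `unwanted` happens inside the expression rather than as a second `for` clause of the comprehension.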