I know it's possible to run a list comprehension over all combinations of two lists. For example:
letters = ['A', 'B', 'C']
numbers = [1,2,3]
def concat(letter, number):
    return letter + str(number)
These can be combined using:
combinations = [concat(letter, number) for letter in letters for number in numbers]
Which has the same output as
combinations = []
for letter in letters:
    for number in numbers:
        combinations.append(concat(letter, number))
Producing:
['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']
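For reference, the double-`for` comprehension above is a Cartesian product of the two lists. A minimal sketch of the same thing with `itertools.product` from the standard library (an alternative spelling, not something the original code uses):

```python
from itertools import product

letters = ['A', 'B', 'C']
numbers = [1, 2, 3]

def concat(letter, number):
    return letter + str(number)

# product() yields every (letter, number) pair, with the first iterable
# varying slowest, just like the outer loop of the nested version.
combinations = [concat(letter, number) for letter, number in product(letters, numbers)]
print(combinations)
```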
I'm trying to clean a defined set of characters from a list of strings. For instance:
unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]
clean_lines = []
for line in raw_lines:
    for char in unwanted:
        line = line.replace(char, '')
    clean_lines.append(line)
outputs:
['phrase1', 'phrase2']
I want to refactor it using a list comprehension, but my attempt fails because it produces one result for every combination of line and removed character:
clean_lines = [line.replace(char, '') for char in unwanted for line in raw_lines]
outputs
['phrase1', 'phr@ase2', 'phra$se1', 'phrase2']
I understand why this happens; it's obvious after thinking about the letters and numbers example. The list comprehension behaves like a nested loop that appends once per (line, character) pair:
clean_lines = []
for line in raw_lines:
    for char in unwanted:
        clean_lines.append(line.replace(char, ''))
which also outputs every combination (in a different order, since the loops are swapped here):
['phrase1', 'phra$se1', 'phr@ase2', 'phrase2']
Is there a workaround for accessing the "outer loop" variable when using a list comprehension?
CodePudding user response:
You may find the re module helpful for this. For example:
import re
unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]
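# Build a character class such as [\$@]. Inside [...] a "|" is matched
# literally, so the escaped characters are joined with an empty string.
expr = f'[{"".join(re.escape(c) for c in unwanted)}]'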
clean_lines = [re.sub(expr, '', line) for line in raw_lines]
print(clean_lines)
Output:
['phrase1', 'phrase2']
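If every unwanted item is a single character, `str.translate` may be a simpler alternative to a regular expression. This is only a sketch of a different technique, not part of the answer above; it reuses the `unwanted` and `raw_lines` names from the question:

```python
unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]

# The third argument of str.maketrans lists characters to delete.
table = str.maketrans('', '', ''.join(unwanted))

clean_lines = [line.translate(table) for line in raw_lines]
print(clean_lines)
```

Note that this only works for single characters; for multi-character substrings a regex alternation or repeated `replace` calls would be needed.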
CodePudding user response:
I would do it like this:
unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]
clean_lines = ["".join([ch for ch in line if ch not in unwanted])
               for line in raw_lines]
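To keep the `str.replace` calls from the question, one possible approach is to move the inner loop into a helper so the comprehension only iterates over the lines. The sketch below uses `functools.reduce` to apply every replacement to the same string; the helper name `strip_chars` is made up for illustration:

```python
from functools import reduce

unwanted = ['$', '@']
raw_lines = [
    'phra$se1',
    'phr@ase2'
]

def strip_chars(line, chars):
    # Apply each replacement in turn, feeding the result of one
    # replace() into the next, like the reassignment in the original loop.
    return reduce(lambda s, ch: s.replace(ch, ''), chars, line)

clean_lines = [strip_chars(line, unwanted) for line in raw_lines]
print(clean_lines)
```

Unlike the character-filtering version, this also works when the unwanted items are longer substrings, because it relies on `str.replace` rather than per-character membership tests.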