Hi all, I've got an input list of strings:
list_of_strings = ["apple", "orange ca", "pear  sa", "banana   sth"]
I want to remove everything after a run of multiple whitespace characters (more than one), so the end result is:
final_list_of_strings = ["apple", "orange ca", "pear", "banana"]
I've tried regex:
import re
regex_expression = r"(.*\s?)(\s{2,}.*)"
for name in list_of_strings:
    regex_matching_groups = re.findall(regex_expression, name)
    if regex_matching_groups:
        name = regex_matching_groups[0][0]
but it fails on multiple spaces ... Thank you for your help!
CodePudding user response:
You can use re.sub in a list comprehension:
import re
list_of_strings = ["apple", "orange ca", "pear  sa", "banana   sth"]
list_of_strings = [re.sub(r'\s{2}.*', '', x, flags=re.S) for x in list_of_strings]
print(list_of_strings)
# -> ['apple', 'orange ca', 'pear', 'banana']
See the Python demo.
The \s{2}.* regex matches two whitespace chars and then the rest of the string (including any line break chars, because of the re.S flag).
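If it helps, here is a minimal sketch of the same idea wrapped in a reusable function, plus an alternative based on re.split. The helper name strip_after_gap and the sample strings (including the one with an embedded newline) are assumptions made up for illustration; they are not part of the original question.

```python
import re

# Same pattern as in the answer: two whitespace chars, then everything after.
# re.S lets "." also match line breaks, so the tail is removed across lines.
_GAP = re.compile(r"\s{2}.*", flags=re.S)


def strip_after_gap(text):
    """Hypothetical helper: drop everything from the first run of 2+ whitespace chars."""
    return _GAP.sub("", text)


# Illustrative inputs; the last one mixes a space and a newline.
samples = ["apple", "orange ca", "pear  sa", "banana   sth", "kiwi \n x"]
cleaned = [strip_after_gap(s) for s in samples]
print(cleaned)
# -> ['apple', 'orange ca', 'pear', 'banana', 'kiwi']

# Alternative without substitution: split once on the first gap, keep the head.
split_based = [re.split(r"\s{2,}", s, maxsplit=1)[0] for s in samples]
assert split_based == cleaned
```

Both variants leave a single space untouched, which is why "orange ca" survives. Precompiling the pattern is only a convenience if the function is called many times.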