I'm writing a function that splits each line by a delimiter and removes numeric-only values, whitespace, and empty entries. However, I can't get it to skip or remove the empty entries that splitting on the delimiter produces.
Say my sample is ABC//DEF/GH//I. I want to split it by "/" and then remove the empty strings that this produces. My output looks like this so far:
["ABC", "", "DEF", "GH", "", "I"]
What can I add to remove the "" entries?
import re

def split_lines(lines, delimiter, remove='[0-9]+$'):
    for line in lines:
        tokens = line.split(delimiter)
        tokens = [re.sub(remove, "", token) for token in tokens]
        print(tokens)
CodePudding user response:
Try filtering out the empty strings:
lst = ["ABC", "", "DEF", "GH", "", "I"]
new_lst = list(filter(lambda e: e, lst))
print(new_lst)
OUTPUT
['ABC', 'DEF', 'GH', 'I']
If you also want to remove entries that contain only whitespace (such as a single space ' '), use this instead:
lst = ["ABC", " ", "DEF", "GH", " ", "I"]
new_lst = list(filter(lambda e: e.strip(), lst))
print(new_lst)
OUTPUT
['ABC', 'DEF', 'GH', 'I']
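The same filtering can also be written as a list comprehension, which some people find easier to read than filter with a lambda. This is only a sketch; the helper name drop_blank is made up for illustration:

```python
def drop_blank(items):
    """Return items with empty and whitespace-only strings removed."""
    # str.strip() returns '' for whitespace-only strings, and '' is falsy,
    # so both "" and " " are dropped by the same condition.
    return [item for item in items if item.strip()]


print(drop_blank(["ABC", "", "DEF", " ", "GH", "", "I"]))
# prints ['ABC', 'DEF', 'GH', 'I']
```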
CodePudding user response:
Here's an example of how this could be done. Added some more input data to make matters clearer:
import re

def split_lines(lines, delimiter, remove='[0-9]+$'):
    for line in lines:
        # Drop the empty strings produced by consecutive delimiters
        tokens = [t for t in line.split(delimiter) if t]
        # Strip trailing digits from each remaining token
        yield [re.sub(remove, '', t) for t in tokens]

for line in split_lines(['ABC//DEF/GH//I', 'ABC1//DEF2/GH3//I4'], '/'):
    print(line)
Output:
['ABC', 'DEF', 'GH', 'I']
['ABC', 'DEF', 'GH', 'I']
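One thing to watch: the function above filters before applying the regex, so a token made up only of digits (for example '123') turns into an empty string after the substitution and stays in the result. If numeric-only values should disappear too, as the question describes, the filter can be applied after the substitution instead. A minimal sketch, assuming the pattern is meant to be '[0-9]+$' (trailing digits) and using a made-up name, split_lines_clean:

```python
import re


def split_lines_clean(lines, delimiter, remove=r'[0-9]+$'):
    """Yield each line's tokens with trailing digits removed and blanks dropped."""
    pattern = re.compile(remove)
    for line in lines:
        # Remove the unwanted suffix first, then trim surrounding whitespace.
        cleaned = (pattern.sub('', token).strip() for token in line.split(delimiter))
        # Filtering last also drops tokens that became empty after the substitution.
        yield [token for token in cleaned if token]


for tokens in split_lines_clean(['ABC//DEF/GH//I', 'ABC1//123/ /GH3'], '/'):
    print(tokens)
# prints ['ABC', 'DEF', 'GH', 'I']
# prints ['ABC', 'GH']
```

Here '123' and ' ' are both removed from the second line because the emptiness check runs on the cleaned tokens rather than on the raw split result.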