Function that splits by delimiter, removes numerical only values and whitespaces or empty indices

Time:05-03

I'm writing a function that splits each line by a delimiter, then removes numeric-only values, whitespace, and empty entries. However, I can't get it to drop the empty entries produced when the delimiter appears twice in a row.

Say my sample is ABC//DEF/GH//I. I want to split it by "/" and then remove the empty strings that the split produces. My output looks like this so far:

["ABC", "", "DEF", "GH", "", "I"]

What can I add to remove the "" entries?

import re

def split_lines(lines, delimiter, remove=r'[0-9]+$'):
  for line in lines:
    tokens = line.split(delimiter)
    # Strip trailing digits from each token; empty tokens are still kept here
    tokens = [re.sub(remove, "", token) for token in tokens]
    print(tokens)

CodePudding user response:

Try this one

lst = ["ABC", "", "DEF", "GH", "", "I"]

# Keep only truthy elements; an empty string is falsy
new_lst = list(filter(lambda e: e, lst))
print(new_lst)

OUTPUT

['ABC', 'DEF', 'GH', 'I']

If you also want to remove whitespace-only entries such as ' ' (a single space) from the list, use this instead:

lst = ["ABC", " ", "DEF", "GH", " ", "I"]

# strip() returns '' for whitespace-only strings, so those are dropped too
new_lst = list(filter(lambda e: e.strip(), lst))
print(new_lst)

OUTPUT

['ABC', 'DEF', 'GH', 'I']
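
Note that filter keeps the surviving elements unchanged, so a token such as " DEF " would keep its padding. Here is a minimal sketch, assuming you also want the surrounding whitespace trimmed from the tokens that remain; the helper name clean_tokens is hypothetical and not part of the question:

```python
def clean_tokens(tokens):
    """Strip each token and drop those that are empty afterwards."""
    stripped = (t.strip() for t in tokens)
    return [t for t in stripped if t]


print(clean_tokens(["ABC", " ", " DEF ", "GH", "", "I"]))
# ['ABC', 'DEF', 'GH', 'I']
```

Stripping first and filtering second means a single pass handles both the empty and the whitespace-only entries.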

CodePudding user response:

Here's an example of how this could be done. I added some more input data to make the behaviour clearer:

import re
def split_lines(lines, delimiter, remove=r'[0-9]+$'):
    for line in lines:
        tokens = [t for t in line.split(delimiter) if t]
        yield [re.sub(remove, '', t) for t in tokens]

for line in split_lines(['ABC//DEF/GH//I', 'ABC1//DEF2/GH3//I4'], '/'):
    print(line)

Output:

['ABC', 'DEF', 'GH', 'I']
['ABC', 'DEF', 'GH', 'I']
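
One edge case to watch: in the generator above, the empty-token filter runs before re.sub, so a token made up only of digits would become '' after the substitution and stay in the result. The sketch below filters after the substitution instead. It assumes the intended pattern is '[0-9]+$' (one or more trailing digits) and that whitespace-only tokens should also be dropped; the name split_clean is hypothetical:

```python
import re


def split_clean(lines, delimiter, remove=r'[0-9]+$'):
    """Yield one list of cleaned, non-empty tokens per input line."""
    pattern = re.compile(remove)
    for line in lines:
        # Trim whitespace first, then strip the trailing digits
        cleaned = (pattern.sub('', t.strip()) for t in line.split(delimiter))
        # Drop anything that ended up empty (blank, whitespace-only, or all digits)
        yield [t for t in cleaned if t]


for tokens in split_clean(['ABC//DEF/123/ /GH//I4'], '/'):
    print(tokens)
# ['ABC', 'DEF', 'GH', 'I']
```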