Using filter(lambda, list) in Python to clean data

Time:08-09

I'm scraping a website as a project and am currently cleaning the data. I have a list of strings (pieces of information and sentences), but some of them are empty and I want to remove them.

My thought was to create a lambda function that returns False for null or empty values and True for everything else. I would then pass this function and my list to filter(), so that filter() applies the function to each element and drops the empty strings.
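The approach described above could be sketched roughly as follows. The names `scraped` and `is_non_empty` and the sample data are made up for illustration; the original code was only posted as an image. Note that filter() does not modify the list in place; it returns a lazy iterator, so the result has to be collected with list().

```python
# Hypothetical sample data standing in for the scraped sentences.
scraped = ["First sentence.", "", "Second sentence.", None, ""]

# Predicate: True for values to keep, False for None or empty strings.
is_non_empty = lambda s: s is not None and s != ""

# filter() is lazy in Python 3, so wrap it in list() to get a new list.
cleaned = list(filter(is_non_empty, scraped))

print(cleaned)       # ['First sentence.', 'Second sentence.']
print(len(scraped))  # 5 (the original list is unchanged)
```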


CodePudding user response:

Check for the empty string (x == "") as well as None:

f = lambda x: x is not None and x != ""

CodePudding user response:

You don't need a lambda here. Use this:

lst = ['', 'abc', '', 'def', '', 1, 2, '']

list(filter(None, lst))

Output:

['abc', 'def', 1, 2]
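One caveat worth keeping in mind: filter(None, ...) drops every falsy value, not only empty strings. A small sketch, assuming the list might also contain 0, False or empty containers (the extra elements are added here for illustration):

```python
lst = ['', 'abc', '', 'def', '', 1, 2, '', 0, [], False]

# filter(None, ...) keeps only truthy elements, so 0, [] and False go too.
truthy_only = list(filter(None, lst))
print(truthy_only)  # ['abc', 'def', 1, 2]

# If only None and empty strings should be removed, test for them explicitly.
no_empty_strings = [x for x in lst if x is not None and x != '']
print(no_empty_strings)  # ['abc', 'def', 1, 2, 0, [], False]
```

For a list that holds only strings and None, the two variants give the same result.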

CodePudding user response:

You can use the fact that:

  • bool(None) is False
  • bool("") (empty string) is False
  • bool("something") (non-empty string) is True

>>> info = ['', 'abc', '', 'def', '', None]
>>> f = lambda x: bool(x)
>>> list(filter(f, info))
['abc', 'def']
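Since the lambda above only forwards to bool, bool itself can be passed to filter(). Also note that a whitespace-only string is truthy, which may matter for scraped text. The sketch below assumes such strings should count as empty too; `has_text` is a hypothetical helper, not something from the question.

```python
info = ['', 'abc', '   ', 'def', '\n', None]

# bool can be passed directly; wrapping it in a lambda adds nothing.
truthy = list(filter(bool, info))
print(truthy)  # ['abc', '   ', 'def', '\n']

# Hypothetical helper: treat None and whitespace-only strings as empty.
def has_text(value):
    return isinstance(value, str) and value.strip() != ""

with_text = list(filter(has_text, info))
print(with_text)  # ['abc', 'def']
```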

CodePudding user response:

You can use a list comprehension instead of filter(). In the benchmark below it was more than twice as fast as the lambda-based versions.

res = [elem for elem in Mylist if elem not in [None, '']]

Benchmark:

from timeit import timeit
import random

Mylist = [random.choice(['',None,'a']) for _ in range(100)]

def check_bool():
    f = lambda x: bool(x)
    return list(filter(f, Mylist))

def lambda_if_else():
    f = lambda x: x is not None and x != ""
    return list(filter(f, Mylist))

def list_comprehension():
    return [elem for elem in Mylist if elem not in [None, '']]


for func in [check_bool, lambda_if_else, list_comprehension]:
    print(func.__name__, timeit(f"{func.__name__}()", globals=globals()))
    
print(list_comprehension() == lambda_if_else() == check_bool())

Output (total seconds for timeit's default of 1,000,000 runs):

check_bool 21.95354559900079
lambda_if_else 19.536270918999435
list_comprehension 8.683593133999238
True
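The benchmark above does not include filter(None, ...) from the earlier answer. A reduced sketch for trying that comparison yourself is shown here. The function name `filter_none`, the fixed seed and the smaller `number` are choices made for this example, and no timings are claimed since they depend on the machine and Python version.

```python
from timeit import timeit
import random

random.seed(0)  # fixed seed so repeated runs use the same data
Mylist = [random.choice(['', None, 'a']) for _ in range(100)]

def filter_none():
    # No Python-level predicate is called per element here.
    return list(filter(None, Mylist))

def list_comprehension():
    return [elem for elem in Mylist if elem not in [None, '']]

for func in (filter_none, list_comprehension):
    # timeit accepts a callable; number is reduced to keep the run short.
    print(func.__name__, timeit(func, number=10_000))

# Both approaches should produce the same cleaned list for this data.
print(filter_none() == list_comprehension())
```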