Checking if a column contains a string from a list, and outputting that string

Time:04-17

I'm trying to check whether each value in a column contains any string from a list, and output the matching string if it does. I'm halfway there so far.

import pandas as pd

k = ['a', 'e', 'o']
data = pd.DataFrame({"id": [1, 2, 3, 4, 5], "word": ["cat", "stick", "door", "dog", "lung"]})

   id   word
0   1    cat
1   2  stick
2   3   door
3   4    dog
4   5   lung

I tried this:

data["letter"] = data['word'].apply(lambda x: any([a in x for a in k]))

I'm trying to get this:

   id   word  letter
0   1    cat    a
1   2  stick   
2   3   door    o
3   4    dog    o
4   5   lung   

but I get this instead:

   id   word  letter
0   1    cat    True
1   2  stick   False
2   3   door    True
3   4    dog    True
4   5   lung   False

CodePudding user response:

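any only reports whether at least one element matched, so it returns True or False rather than the matching string. You can use the built-in next function with a generator expression instead. The second argument to next is the default, which is returned if the iterator is exhausted, that is, when no string from the list is found in the word.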

data["letter"] = data["word"].apply(
    lambda x: next(
        (a for a in k if a in x), ""
    )
)
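
To see what next is doing on a single value, here is a minimal sketch outside pandas. The helper name first_match is hypothetical and introduced only for illustration; k is the list from the question.

```python
k = ["a", "e", "o"]


def first_match(word, candidates):
    """Return the first candidate (in list order) found in word, else ""."""
    # The generator yields matching candidates lazily, so next() stops
    # at the first hit. The second argument is the default that next()
    # returns when the generator yields nothing.
    return next((c for c in candidates if c in word), "")


print(first_match("cat", k))    # a
print(first_match("stick", k))  # empty string, nothing from k is in "stick"
print(first_match("door", k))   # o
```

Note that the order of k decides which string wins when several of them occur in the same word.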

Complete code:

>>> import pandas as pd
>>>
>>> k = ["a", "e", "o"]
>>> data = pd.DataFrame(
...     {
...         "id": [1, 2, 3, 4, 5],
...         "word": [
...             "cat",
...             "stick",
...             "door",
...             "dog",
...             "lung",
...         ],
...     }
... )
>>>
>>> data["letter"] = data["word"].apply(
...     lambda x: next(
...         (a for a in k if a in x), ""
...     )
... )
>>> print(data)
   id   word letter
0   1    cat      a
1   2  stick
2   3   door      o
3   4    dog      o
4   5   lung
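
As a possible alternative to apply, the same column can probably be built with pandas string methods. This is only a sketch and not part of the answer above: it swaps in a regular expression via Series.str.extract, and the variable name pattern is an assumption made for illustration.

```python
import re

import pandas as pd

k = ["a", "e", "o"]
data = pd.DataFrame(
    {"id": [1, 2, 3, 4, 5], "word": ["cat", "stick", "door", "dog", "lung"]}
)

# Build an alternation such as "(a|e|o)". re.escape guards against
# list entries that contain regex metacharacters.
pattern = "(" + "|".join(re.escape(s) for s in k) + ")"

# With a single capture group, expand=False returns a Series.
# Rows without a match come back as NaN, so fill them with "".
data["letter"] = data["word"].str.extract(pattern, expand=False).fillna("")

print(data)
```

For this data both approaches give the same column, but they are not equivalent in general. The regex returns the match that occurs earliest in the word, while the next version returns the first entry of k that occurs anywhere in the word. For a word like "boat", the regex picks "o" and the next version picks "a".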