I'm trying to check whether each value in a column contains any string from a list, and output the matching string if it does. I'm halfway there so far.
import pandas as pd
k = ['a', 'e', 'o']
data = pd.DataFrame({"id": [1, 2, 3, 4, 5], "word": ["cat", "stick", "door", "dog", "lung"]})
   id   word
0   1    cat
1   2  stick
2   3   door
3   4    dog
4   5   lung
I tried this:
data["letter"] = data['word'].apply(lambda x: any([a in x for a in k]))
I'm trying to get this:
   id   word letter
0   1    cat      a
1   2  stick
2   3   door      o
3   4    dog      o
4   5   lung
But I get this instead:
   id   word  letter
0   1    cat    True
1   2  stick   False
2   3   door    True
3   4    dog    True
4   5   lung   False
CodePudding user response:
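You can use the built-in next function with a generator expression. The generator yields each element of k that occurs in the word, so next returns the first match in the order of k. The second argument to next is the default, which is returned if the iterator is exhausted, that is, when no element of k occurs in the word.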
data["letter"] = data["word"].apply(
    lambda x: next(
        (a for a in k if a in x), ""
    )
)
Complete code:
>>> import pandas as pd
>>>
>>> k = ["a", "e", "o"]
>>> data = pd.DataFrame(
...     {
...         "id": [1, 2, 3, 4, 5],
...         "word": [
...             "cat",
...             "stick",
...             "door",
...             "dog",
...             "lung",
...         ],
...     }
... )
>>>
>>> data["letter"] = data["word"].apply(
...     lambda x: next(
...         (a for a in k if a in x), ""
...     )
... )
>>> print(data)
   id   word letter
0   1    cat      a
1   2  stick
2   3   door      o
3   4    dog      o
4   5   lung
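To see how next behaves with its default outside of pandas, here is a minimal sketch in plain Python. The helper name first_match and the extra word "coat" are hypothetical, added only for illustration. They are not part of the original question or answer.

```python
k = ["a", "e", "o"]


def first_match(word, candidates=k):
    # Hypothetical helper for illustration only.
    # Return the first candidate (in list order) that occurs in word,
    # or the default "" when the generator yields nothing.
    return next((c for c in candidates if c in word), "")


words = ["cat", "stick", "door", "dog", "lung"]
letters = [first_match(w) for w in words]
print(letters)  # ['a', '', 'o', 'o', '']

# List order decides which match wins: "coat" contains both "a" and "o",
# and "a" comes first in k.
print(first_match("coat"))  # a
```

If a word contains several of the listed strings, only the one that appears earliest in k is returned, so the order of k matters.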