I have a data frame and I want to return the values in one column when a keyword is found in another. For example, given the data below, searching for apple should return [a,b]:
names words
a apple
b apple
c pear
I would want a list that is:
[a,b]
I have found ways to return the boolean value using str.contains, but I am not sure how to take the value from another column in the same row, which would give me the name. There may be an existing post that I can't find; if so, please point me to it.
CodePudding user response:
You could do
list(df[df['words'].str.contains('apple', na=False)]['names'])
resulting in
['a', 'b']
Step by step:
- df['words'].str.contains('apple', na=False) builds a boolean pandas Series for the condition; na=False treats any missing values in the column as non-matches.
- The Series from the previous step is used to filter the original dataframe df.
- In the filtered dataframe, the 'names' column is selected.
- The selected column is cast to a list.
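To make the steps above easier to follow, here is a minimal sketch that splits the one-liner into its intermediate values. The variable names (mask, matching_rows, names_column) are only illustrative and are not part of the original answer; the sample dataframe is built directly from the question's data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"names": ["a", "b", "c"],
                   "words": ["apple", "apple", "pear"]})

# Step 1: boolean Series, True where 'words' contains the keyword
mask = df["words"].str.contains("apple", na=False)

# Step 2: keep only the rows where the mask is True
matching_rows = df[mask]

# Step 3: select the 'names' column from the filtered rows
names_column = matching_rows["names"]

# Step 4: convert the column to a plain Python list
result = list(names_column)
print(result)
```

Chaining the four steps back together gives the one-liner shown at the top of this answer.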
Full code:
import io
import pandas as pd
data = """
names words
a apple
b apple
c pear
"""
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
lst = list(df[df['words'].str.contains('apple', na=False)]['names'])
print(lst)
['a', 'b']
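If you need this lookup more than once, one possible variant is to wrap it in a small helper that uses df.loc to filter the rows and pick the column in a single step. This is only a sketch: the function name names_matching and its parameters are hypothetical, and regex=False is an assumption that the keyword should be matched as literal text rather than as a regular expression.

```python
import pandas as pd

def names_matching(df, keyword, search_col="words", return_col="names"):
    """Return the values of return_col for rows whose search_col contains keyword."""
    # na=False treats missing values as non-matches;
    # regex=False matches the keyword literally (assumption, see above)
    mask = df[search_col].str.contains(keyword, na=False, regex=False)
    # .loc filters rows by the mask and selects the column in one step
    return df.loc[mask, return_col].tolist()

# Example data with one missing value to show the effect of na=False
df = pd.DataFrame({"names": ["a", "b", "c", "d"],
                   "words": ["apple", "apple pie", None, "pear"]})

print(names_matching(df, "apple"))
```

Note that str.contains matches substrings, so "apple pie" counts as a match here; use an equality test (df[search_col] == keyword) instead if only exact matches are wanted.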