Filter a dataframe column for a keyword, return separate column value (name) from the row where each

Time:11-20

I have a data frame and I want to return the values in one column when I find a keyword in another. So below, if I search for apple, I want the output to be [a,b].

like this:

names words
a     apple
b     apple
c     pear

I would want a list that is: [a,b]

I have found ways to return the boolean values using str.contains, but I am not sure how to take the value from another column in the same row, which would give me the name. There must be an existing post that I can't find, if anyone can direct me there.

CodePudding user response:

You could do

list(df[df['words'].str.contains('apple', na=False)]['names'])

resulting in

['a', 'b']
  1. df['words'].str.contains('apple', na=False) builds a boolean pandas Series for the condition; na=False makes any missing values in the column count as non-matches.
  2. The Series from step 1 is used to filter the original dataframe df.
  3. From the filtered dataframe, the 'names' column is selected.
  4. The selected column (a Series) is cast to a list.
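
The four steps can also be written out one at a time, which makes each intermediate object easier to inspect. This is only a sketch: the variable names are made up for illustration, and the extra row with a missing value is not in the question's data; it is an assumption added to show what na=False does.

```python
import pandas as pd

# Hypothetical sample frame; row 'd' (missing value) is added only
# to illustrate the effect of na=False.
df = pd.DataFrame({
    "names": ["a", "b", "c", "d"],
    "words": ["apple", "apple", "pear", None],
})

# Step 1: boolean Series, True where 'words' contains the keyword.
# na=False turns the missing value into False instead of NaN.
mask = df["words"].str.contains("apple", na=False)

# Step 2: keep only the rows where the mask is True.
matching_rows = df[mask]

# Step 3: select the 'names' column from those rows (a Series).
matching_names = matching_rows["names"]

# Step 4: convert the Series to a plain Python list.
result = list(matching_names)
print(result)
```

Without na=False, the mask would contain a missing value for row 'd', and using it to index the dataframe would typically raise an error.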

Full code:

import io
import pandas as pd
data = """
names words
a     apple
b     apple
c     pear
"""
# Split columns on runs of whitespace (regex separator).
df = pd.read_csv(io.StringIO(data), sep=r'\s+')

lst = list(df[df['words'].str.contains('apple')]['names'])


>>> print(lst)

['a', 'b']
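
One thing to keep in mind is that str.contains matches substrings, so a word such as "pineapple" would also be selected. If only whole-value matches are wanted, an equality comparison may be a better fit. The sketch below shows both using .loc, which selects the rows and the column in one step; the "pineapple" row is a hypothetical addition, not part of the question's data.

```python
import pandas as pd

# Hypothetical data: row 'd' is added to show substring vs. exact matching.
df = pd.DataFrame({
    "names": ["a", "b", "c", "d"],
    "words": ["apple", "apple", "pear", "pineapple"],
})

# Substring match: 'pineapple' contains 'apple', so 'd' is included.
substring_matches = df.loc[
    df["words"].str.contains("apple", na=False), "names"
].tolist()

# Exact match: only rows whose value is exactly 'apple'.
exact_matches = df.loc[df["words"] == "apple", "names"].tolist()

print(substring_matches)
print(exact_matches)
```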