I'm trying to search for a keyword in a dataframe and print the keyword if it is found, using the following code:
if df[df['description'].str.contains(keyword,case=False)]:
    print(keyword)
else:
    print("NOT FOUND")
I'm getting the following error message:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any suggestions on how to fix this?
CodePudding user response:
Try this instead:
if df['description'].str.contains(keyword,case=False).any():
    print(keyword)
else:
    print("NOT FOUND")
df['description'].str.contains(keyword,case=False) returns a boolean Series the same length as df['description']. Each value corresponds to the row at the same index in df['description']: True if that row's value contains the keyword, otherwise False.
Calling .any() on a Series returns True if at least one value in the Series is True. If all of them are False, .any() returns False.
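To make the reduction from a per-row mask to a single bool concrete, here is a minimal sketch. The sample data and the helper name keyword_found are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical sample data; the real df comes from the question.
df = pd.DataFrame({"description": ["Red apple", "Green pear", "Yellow banana"]})

def keyword_found(frame, keyword):
    # str.contains yields one boolean per row (case-insensitive here);
    # .any() collapses that Series into a single truth value.
    mask = frame["description"].str.contains(keyword, case=False)
    return bool(mask.any())

for keyword in ("APPLE", "kiwi"):
    print(keyword if keyword_found(df, keyword) else "NOT FOUND")
```

Note that str.contains treats the keyword as a regular expression by default, so a keyword containing characters such as "(" or "+" may need regex=False.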
CodePudding user response:
Your expression
df[df['description'].str.contains(keyword,case=False)]
doesn't return a bool. It returns a subset of the original dataframe containing only the rows that match the predicate (the description contains the keyword). So, as the error message implies, you need to test whether this dataframe is empty, for example through its .empty attribute.
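If you also want the matching rows themselves, one way to apply this is sketched below. The sample data and the helper name matching_rows are assumptions for illustration. The na=False argument is an addition not in the original code: it makes missing descriptions count as non-matches so the mask stays purely boolean.

```python
import pandas as pd

# Hypothetical sample data, including a missing description.
df = pd.DataFrame({"description": ["Red apple", None, "Yellow banana"]})

def matching_rows(frame, keyword):
    # Build the boolean mask, then use it to select the matching rows.
    mask = frame["description"].str.contains(keyword, case=False, na=False)
    return frame[mask]

keyword = "banana"
matches = matching_rows(df, keyword)

# A DataFrame has no single truth value, so test .empty explicitly.
if not matches.empty:
    print(keyword)
else:
    print("NOT FOUND")
```

This keeps the filtered rows available for further use, whereas the .any() approach only answers whether a match exists.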