I need to filter a pandas DataFrame by applying a function to a single column of strings.
Here is an example of the DataFrame:
ID Titles Values
0 1 title1 value1
1 2 title2 value2
2 3 title3 value3
...
I have a complex function:
def checkTitle(title: str) -> bool:
    ...
I want to keep only the rows of the DataFrame for which this function returns True when applied to the Titles column.
I tried something like this, but it doesn't return anything usable:
df = df.apply(checkTitle(df["Titles"]),axis=1)
Can you help, please?
CodePudding user response:
You can apply the function to just one column of the dataframe and then use the resulting Boolean series to select the rows:
select = df.Titles.apply(checkTitle)
df = df.loc[select, :]
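For context, the attempt in the question calls checkTitle on the whole column immediately and hands the result to apply, whereas apply expects the function itself. Here is a minimal, self-contained sketch of the approach above. It assumes a stand-in function named check_title, which is hypothetical since the real function is not shown in the question:

```python
import pandas as pd

# Hypothetical stand-in for the asker's function, which is not shown.
def check_title(title: str) -> bool:
    return title.endswith("2")

df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Titles": ["title1", "title2", "title3"],
    "Values": ["value1", "value2", "value3"],
})

# Pass the function itself (no parentheses); apply calls it once per title.
select = df["Titles"].apply(check_title)

# select is a boolean Series aligned with df's index, so it can pick rows.
filtered = df.loc[select, :]
```

Assigning the result to a new name such as filtered keeps the original DataFrame available if you need it later.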
CodePudding user response:
I think this might be a solution for you.
import pandas as pd

def checkTitle(title: str) -> bool:
    # The comparison already evaluates to True or False.
    return title == 'title2'

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Titles': ['title1', 'title2', 'title2', 'title3'],
                   'Values': ['value1', 'value2', 'value2', 'value3']})
# Boolean Series: True for each row whose title passes the check.
mask = df.Titles.apply(checkTitle)
df[mask]
I don't know your function in detail, but it needs to return a bool (True or False) for every row, so that the result can be used as a boolean mask to select rows from the DataFrame.
I hope this solution helps.
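If the real function sometimes returns something other than a plain bool, one possible way to still get a clean mask is to wrap the call in bool(). The sketch below assumes a hypothetical function loose_check that returns a regex match object or None; it only illustrates the idea and is not the asker's actual function:

```python
import re
import pandas as pd

# Hypothetical check that returns a match object or None, not a bool.
def loose_check(title: str):
    return re.search(r"2$", title)

df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Titles": ["title1", "title2", "title2", "title3"],
    "Values": ["value1", "value2", "value2", "value3"],
})

# bool() turns a match object into True and None into False,
# so every row yields strictly True or False.
mask = df["Titles"].apply(lambda t: bool(loose_check(t)))

result = df[mask]
```

Here result holds only the two rows whose title ends in "2".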
Regards,