Searching a pandas dataframe for multiple strings

Time:11-25

I have a dataframe (df) with a column 'Names', and a separate list of strings of the form:

info = ['AAA.123 456.789', 'BBB.987 654.321', 'CCC.321-654.987']

and so on. I want to search the 'Names' column in df for the strings in the list and store the matching rows in a separate dataframe (df2). I used:

df2 = df.loc[df['Names'].str.contains('|'.join(info))]

However, the output for df2 (in the Spyder variable explorer) was either an empty dataframe or only one of the expected rows. I'm not sure how to fix this, so any advice would be appreciated. Thanks!
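One thing worth checking with the str.contains approach is that it treats the joined string as a regular expression, so the '.' in each entry matches any character, and a missing value in 'Names' yields NaN in the mask. Below is a minimal sketch of an escaped version. The dataframe contents are made up for illustration, and this does not by itself explain an empty result: if 'Names' holds only short values such as 'AAA', none of the longer strings in info can be contained in it.

```python
import re
import pandas as pd

info = ['AAA.123 456.789', 'BBB.987 654.321', 'CCC.321-654.987']

# Hypothetical data: some names embed a search string, one only looks similar.
df = pd.DataFrame({
    "Names": ["x AAA.123 456.789 y", "BBB.987 654.321", "AAAx123 456y789", None],
    "col1": [1, 2, 3, 4],
})

# Escape regex metacharacters so '.' matches only a literal dot.
pattern = "|".join(re.escape(s) for s in info)

# na=False treats missing names as non-matches instead of producing NaN.
mask = df["Names"].str.contains(pattern, na=False)
df2 = df.loc[mask]
```

With the unescaped pattern, the third row would also match, because '.' stands for any character.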

Edit

Index  Names  Quantity 1  Quantity 2  Quantity 3  Quantity 4
0      AAA    12.3        4.56        7.89        10.1112
1      BBB    3.21        65.4        98.7        1.21110
2      CCC    456.23      1.23        10101       101.112
3      DDD    6.4         3.21        0.2029      1211.10

is a sample of df's contents (it has 6 columns and a very large number of rows).

Edit 2

Have relabelled 'list' as 'info' on the suggestion of Serge in comments

CodePudding user response:

You can use isin for exact matches. If your df is

import pandas as pd

df = pd.DataFrame({"Names": ['AAA.123 456.789', 'BBB.987 654.321', 'W1234', 'A_aa_1 .', 'Z54'], "col1": [1, 2, 3, 4, 5]})

info = ['AAA.123 456.789', 'BBB.987 654.321', 'CCC.321-654.987']

then

df2 = df[df['Names'].isin(info)]

gives:

             Names  col1
0  AAA.123 456.789     1
1  BBB.987 654.321     2
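
Note that isin only matches when a name equals an entry of info exactly. If 'Names' instead holds short keys such as 'AAA', as in the sample from the question's edit, one option is to reduce each entry of info to its key first. The following is only a sketch: it assumes that the part before the first '.' is the key, which the question does not confirm, and the variable name keys is introduced here for illustration.

```python
import pandas as pd

info = ['AAA.123 456.789', 'BBB.987 654.321', 'CCC.321-654.987']

# Subset of the sample from the question's edit.
df = pd.DataFrame({
    "Names": ["AAA", "BBB", "CCC", "DDD"],
    "Quantity 1": [12.3, 3.21, 456.23, 6.4],
})

# Assumption: the text before the first '.' is the value stored in 'Names'.
keys = {s.split(".", 1)[0] for s in info}

# Keep the rows whose name is one of the extracted keys.
df2 = df[df["Names"].isin(keys)]
```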