How to search specific word in csv file with pandas

Time:11-09

df:

   first     last                   email
0  Corey  Schafer  [email protected]
1   Jane      Doe       [email protected]
2   John      Doe       [email protected]

From a big CSV file, how can I find a specific word like John without knowing which column or row it is in? If several entries contain John, can I get all the info in the rows or columns where they appear?

CodePudding user response:

This is the way to do it, I believe.

import pandas as pd

df = pd.read_csv('data.csv')
df[df['first'].str.contains('John')] # returns all rows where John in the column 'first'
df[df['first'].str.contains('John')].index.tolist() # get the index of the rows

The contains method is case-sensitive. To make it case-insensitive, pass case=False:

df["first"].str.contains("John", case=False)
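One practical caveat with a real CSV: if the column has empty cells, str.contains returns NaN for them, and the result can no longer be used directly as a boolean filter. A minimal sketch, assuming a made-up frame with one missing value (the data below is hypothetical, not the asker's file):

```python
import pandas as pd

# Hypothetical sample data; None stands in for an empty CSV cell.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John", None],
    "last": ["Schafer", "Doe", "Doe", "Smith"],
})

# na=False treats missing values as non-matches, so the mask stays boolean.
# regex=False matches the text literally instead of as a regular expression.
mask = df["first"].str.contains("john", case=False, na=False, regex=False)

matches = df[mask]  # only the rows whose 'first' value contains "john"
print(matches)
```

Passing regex=False is optional here, but it avoids surprises if the search word ever contains characters such as "." or "+".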

To find the position of a column by its header name (the header is the first row of the CSV):

df.columns.get_loc("first")  # Output : 0 (the column index)

To check whether a specific column contains the word:

df["first"].str.contains("John").any()  # Output : True

To check whether a specific row contains the word (row 2 is John's row in the sample df):

df.loc[2].str.contains("John").any()  # Output : True

If you only want the index of the first matching row (this raises an IndexError if nothing matches):

df[df["first"] == "John"].index[0]
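Since .index[0] fails when there is no match, it can help to check for an empty result first. A small sketch, where first_match_index is a hypothetical helper name and the frame mirrors the sample df from the question without the email column:

```python
import pandas as pd

# Sample data modelled on the question's df (email column omitted).
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
})

def first_match_index(frame, column, value):
    """Return the index label of the first row where column == value, or None."""
    hits = frame[frame[column] == value].index
    return hits[0] if len(hits) > 0 else None

print(first_match_index(df, "first", "John"))   # index label of John's row
print(first_match_index(df, "first", "Alice"))  # no such row, so None
```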

CodePudding user response:

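To search an entire DataFrame for a given value without knowing which column that value will be in, we can use .applymap() in conjunction with .any(axis=1). Note that this is an exact match on the whole cell value, and that since pandas 2.1 applymap is deprecated in favor of DataFrame.map: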

search_str = 'John'
search = df[df.applymap(lambda x: x == search_str).any(axis=1)]

search will be a new DataFrame (a filtered copy, not a view of the original df) containing only the rows where any column value is exactly 'John'.
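If the word may appear as part of a longer value, or in a non-text column, a substring search across all columns is one option. The following is only a sketch under assumptions: the sample frame is invented for illustration, and every column is converted to text first (so a missing value becomes the string "nan").

```python
import pandas as pd

# Hypothetical sample data with a numeric column mixed in.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Johnson", "Doe"],
    "age": [35, 28, 41],
})

search_str = "john"

# Convert each column to text, then test every cell for the substring.
mask = df.apply(
    lambda col: col.astype(str).str.contains(search_str, case=False, regex=False)
)

rows = df[mask.any(axis=1)]                     # rows with at least one matching cell
cols = mask.columns[mask.any(axis=0)].tolist()  # columns with at least one match

print(rows)
print(cols)
```

Here mask is a boolean DataFrame of the same shape as df, so it also tells you exactly which cells matched, not just which rows.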
