import re
companies = set(salarydf['company'])
companies = str(companies)
print(companies)
re.findall("Tata Consultancy Services|TCS|Wipro|Infosys", companies)
salarydf.loc[salarydf['company'].str.contains('Tata Consultancy Services|TCS|Wipro|Infosys')]
On the last line I get the error "ValueError: Cannot mask with non-boolean array containing NA / NaN values". I'm a beginner, please help me out. Thank you.
CodePudding user response:
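This happens because the column company contains NaN. For those rows str.contains() returns NaN instead of True or False, so the mask is not purely boolean. You could use fillna() prior to str to ensure that all data are non-null, as follows: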
salarydf.loc[salarydf['company'].fillna('').str.contains('Tata Consultancy Services|TCS|Wipro|Infosys')]
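As a minimal sketch of the fix above: the sample rows below are invented for illustration, since the real salarydf is not shown in the question. It also shows the `na=False` argument of `str.contains()`, which is an alternative to `fillna('')` and should give the same mask here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real salarydf is not shown in the question.
salarydf = pd.DataFrame({
    "company": ["TCS", "Wipro Ltd", np.nan, "Acme Corp", "Infosys BPM"],
    "salary": [10, 12, 9, 11, 13],
})

pattern = "Tata Consultancy Services|TCS|Wipro|Infosys"

# Option 1: replace missing names with an empty string before matching.
mask_fillna = salarydf["company"].fillna("").str.contains(pattern)

# Option 2: tell str.contains() to treat missing values as non-matches.
mask_na_false = salarydf["company"].str.contains(pattern, na=False)

# Either mask is fully boolean, so it can be used with .loc safely.
matches = salarydf.loc[mask_na_false]
print(matches)
```

Both options keep the rows with missing company names in salarydf itself; they are only excluded from the filtered result.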
CodePudding user response:
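Use the pandas isin() function. Make a list of the values you want to look for.
df[df.company.isin(your_list)]
This returns all the rows in the dataframe whose company value is exactly equal to one of the values in your_list. Note that isin() compares whole values, not substrings, and treats NaN as a non-match, so it does not raise the error above.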
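The difference between exact matching with isin() and substring matching with str.contains() can be sketched as follows. The data and the contents of your_list are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"company": ["TCS", "Wipro Ltd", None, "Infosys"]})
your_list = ["Tata Consultancy Services", "TCS", "Wipro", "Infosys"]

# isin() compares whole values, so "Wipro Ltd" does not match "Wipro".
exact = df[df.company.isin(your_list)]

# str.contains() matches substrings, so "Wipro Ltd" is included.
# na=False makes missing values count as non-matches.
partial = df[df.company.str.contains("|".join(your_list), na=False)]

print(exact)
print(partial)
```

If the company names in your data carry suffixes or extra words, str.contains() is probably the better fit; isin() suits columns with clean, standardized names.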
CodePudding user response:
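You can use .isna() to check for NaN in your dataset and .dropna() to drop rows containing NaN (missing) values. Note that dropna() returns a new dataframe, so assign the result back.
salarydf.isna().sum()
salarydf = salarydf.dropna(subset=['company'])
salarydf.loc[salarydf['company'].str.contains('Tata Consultancy Services|TCS|Wipro|Infosys')]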
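A small sketch of this approach, using invented sample rows and a shortened pattern. It assumes you only want to drop rows where company is missing, which is why `subset` is passed; calling dropna() with no arguments would also drop rows with NaN in any other column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
salarydf = pd.DataFrame({
    "company": ["TCS", np.nan, "Acme Corp"],
    "salary": [10.0, 9.0, np.nan],
})

# Count missing values per column.
missing = salarydf.isna().sum()
print(missing)

# dropna() returns a new DataFrame; subset limits the check to one column.
cleaned = salarydf.dropna(subset=["company"])

# With no NaN left in 'company', the mask is fully boolean.
result = cleaned.loc[cleaned["company"].str.contains("TCS|Wipro|Infosys")]
print(result)
```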