I'm trying to search through a CSV file for keywords, but right now I can't even find one string. Here is what I'm doing. Any solutions? I converted the CSV to a pandas DataFrame. I would also like to add an indication to the DataFrame, like a check or a running count.
import pandas as pd
import numpy as np
df = pd.read_csv('InventoryValue-byItem-6-25-22.csv')
df.apply(lambda columns: columns.astype(str).str.contains("VODKA").any(), axis=1)
df.to_csv('file_name.csv', index=False)
CodePudding user response:
It depends on whether you want to search in only one column:
import pandas as pd
import numpy as np
data={'number':np.random.uniform(0,20,size=5),'text1':["test VODKA","test","Nothing","Something","VODKA Orange"], 'text2':["no","VODKA sprite","yes","yes","no"] }
df=pd.DataFrame(data)
# If you are only looking in one column:
# cast the column to string so .str.contains also works on non-string values
print(df[df['text1'].astype(str).str.contains("VODKA", case=False)])
Result:
      number         text1 text2
0  17.505103    test VODKA    no
4  17.585175  VODKA Orange    no
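Since the question mentions several keywords, the same single-column filter can be extended with a regular expression. This is only a sketch: the `keywords` list is a hypothetical example, and the data is the toy frame from above without the random column.

```python
import re
import pandas as pd

# Toy data mirroring the example above
df = pd.DataFrame({
    "text1": ["test VODKA", "test", "Nothing", "Something", "VODKA Orange"],
    "text2": ["no", "VODKA sprite", "yes", "yes", "no"],
})

# Hypothetical keyword list; replace with your own terms
keywords = ["VODKA", "Something"]

# Escape each keyword so it is matched literally, then join with "|" (regex OR)
pattern = "|".join(re.escape(k) for k in keywords)

# Boolean mask: True where text1 contains any of the keywords
mask = df["text1"].astype(str).str.contains(pattern, case=False, regex=True)
multi_matches = df[mask]
print(multi_matches)
```

Escaping with `re.escape` keeps keywords that contain characters such as `+` or `.` from being interpreted as regex syntax.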
Or in all columns:
import pandas as pd
import numpy as np
data={'number':np.random.uniform(0,20,size=5),'text1':["test VODKA","test","Nothing","Something","VODKA Orange"], 'text2':["no","VODKA sprite","yes","yes","VODKA Orange"] }
df=pd.DataFrame(data)
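# to look in all columns
matches = []
for column in df.columns:
    # keep the rows where this column, cast to string, contains the keyword
    matches.append(df[df[column].astype(str).str.contains("VODKA", case=False)])
# DataFrame.append was removed in pandas 2.0, so combine the pieces with pd.concat
dfFinal = pd.concat(matches).drop_duplicates()
print(dfFinal)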
Result:
      number         text1         text2
0   4.690792    test VODKA            no
4   8.835689  VODKA Orange  VODKA Orange
1  17.707329          test  VODKA sprite
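For the check or running count asked about in the question, one possible approach is to build a boolean mask over all columns and store it on the DataFrame. Note that the `apply` call in the question returns a new object rather than modifying `df`, so its result has to be assigned before writing the CSV. The column names `has_keyword` and `running_count` below are hypothetical, and the `number` values are fixed stand-ins for the random ones above.

```python
import pandas as pd

df = pd.DataFrame({
    "number": [1.0, 2.0, 3.0, 4.0, 5.0],
    "text1": ["test VODKA", "test", "Nothing", "Something", "VODKA Orange"],
    "text2": ["no", "VODKA sprite", "yes", "yes", "VODKA Orange"],
})

# For each column, test whether its string form contains the keyword,
# then mark a row True if any of its columns matched
mask = df.apply(
    lambda col: col.astype(str).str.contains("VODKA", case=False)
).any(axis=1)

flagged = df.copy()
flagged["has_keyword"] = mask              # the "check" (hypothetical name)
flagged["running_count"] = mask.cumsum()   # matches seen so far (hypothetical name)
print(flagged)
```

Writing `flagged` out with `to_csv` then keeps both indicator columns in the output file, and `mask.sum()` gives the total number of matching rows.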