I would like to upgrade my script for analyzing data. Instead of manually checking which row number is the header line, I need to find the index of the row that contains a specific string. Right now I read the CSV directly into a pandas DataFrame with the header line hard-coded like this:
df1 = pd.read_csv('sensor_1.csv', sep=',', header=101)
How can I read the CSV, find the line containing the text "Scan Number", and pass that line's index as the header argument?
I tried this:
FileList = (glob.glob("sensor_1.csv"))
for FileToProcess in FileList:
    with open(FileToProcess) as readfile:
        for cnt,line in enumerate(readfile):
            if "Scan Number" in line:
                cnt
        readfile.close
df1 = pd.read_csv('sensor_1.csv', sep=',', header= cnt)
But this gives me the highest index, and an error at the end :/ Could you please help?
Thanks, Paulina
CodePudding user response:
# Read all lines, then stop at the first one containing the search phrase.
with open('sensor_1.csv', 'r') as file_:
    lines = file_.readlines()
cnt = 0
for i, line in enumerate(lines):
    if 'Scan Number' in line:
        cnt = i
        break
print(cnt)
Once the search phrase is found in a line, the loop stores that line's index in cnt and stops; the index is then printed after the loop. If the phrase is never found, cnt stays 0.
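The same idea can be wrapped in a small helper that returns the index directly. This is only a minimal sketch: the function name find_header_row and its marker parameter are made up for illustration, and it assumes the header is the first line containing the phrase. It returns None when nothing matches, so a missing header is not confused with row 0.

```python
def find_header_row(path, marker="Scan Number"):
    """Return the 0-based index of the first line containing marker, or None."""
    with open(path, "r") as handle:
        for index, line in enumerate(handle):
            if marker in line:
                return index  # stop at the first match
    return None


# Hypothetical usage with the file from the question:
# header_row = find_header_row("sensor_1.csv")
# if header_row is not None:
#     df1 = pd.read_csv("sensor_1.csv", sep=",", skiprows=header_row)
```

The usage comment passes the index as skiprows rather than header. As far as I know, header= counts lines after blank lines have been dropped (with the default skip_blank_lines=True), while skiprows counts raw file lines, so skiprows should be the safer choice if the metadata block above the header contains empty lines.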
CodePudding user response:
You don't need the glob module here: it is for listing files that match a pattern in a directory, but you already have the filename.
When you use a with statement you don't need to close the file either; it is closed automatically when the indented block ends.
Try this:
import csv
import pandas as pd

rownumber_list = []
with open("sensor_1.csv", 'r') as file:
    csvreader = csv.reader(file)
    # each line is a list of cells, so this matches a cell equal to "Scan Number"
    for cnt, line in enumerate(csvreader):
        if "Scan Number" in line:
            rownumber_list.append(cnt)
# collect every matching row index in a DataFrame
df1 = pd.DataFrame(rownumber_list, columns=['row_number'])
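One thing to keep in mind with the csv.reader loop above: each row is a list of cells, so the in test looks for a cell that is exactly "Scan Number", not for the phrase anywhere in the line. The sketch below shows the difference on made-up sample data; the sample text and both function names are assumptions for illustration only.

```python
import csv
import io

# Hypothetical sample standing in for the top of a sensor file.
sample = "device,A1\nScan Number,Value\n1,0.5\nnote,Scan Number 2 skipped\n"


def rows_with_exact_cell(text, marker):
    """Indices of rows where some cell equals marker exactly."""
    reader = csv.reader(io.StringIO(text))
    return [i for i, row in enumerate(reader) if marker in row]


def rows_with_substring(text, marker):
    """Indices of rows where some cell contains marker as a substring."""
    reader = csv.reader(io.StringIO(text))
    return [i for i, row in enumerate(reader)
            if any(marker in cell for cell in row)]


exact = rows_with_exact_cell(sample, "Scan Number")
loose = rows_with_substring(sample, "Scan Number")
print(exact, loose)
```

If the header cell is exactly "Scan Number", the exact-cell test is the stricter one and avoids matching comment lines that merely mention the phrase. The first entry of the resulting list would then be the header row index.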