How to make sure Python reads to the end of each line in a file?

Time:04-12

Good morning. Because of memory constraints, I'm trying to write a mean function that reads my file line by line and calculates the mean of each column. My file has 5000 columns and 20000 rows. However, when I print the output of this function, the last part of my list is filled with zeros (the value I used to initialize it). I tried printing the results one by one, but it only goes up to 4482 instead of 5000. Is there a way to make sure it goes all the way to the end?

Here is my code:


def mean_by_line(file, size):
    calc=[]
    file=open(file,"r")
    line=file.readline()
    table_line=line.split(",")
    mean_vector=[0 for i in range(len(table_line)-1)]
    #We initialize the first one since we need it beforehand for the length
    for j in range(len(table_line)-1):
        calc.append(float(table_line[j]))
    #We get the other values
    for i in range(1,size):
        line = file.readline()
        table_line = line.split(",")
        for j in range(len(table_line)-1):
            calc[j] += float(table_line[j])
    #We calculate the average
    for j in range(len(table_line)-1):
        mean_vector[j]=calc[j]/size
        print(j, mean_vector[j])
    file.close()
    return mean_vector

Thanks in advance

CodePudding user response:

Let's assume that we have a file with comma-separated numbers. Also assume that the number count on each line is always the same (but unknown) and we want to find the arithmetic mean of each "column".

Then:

totals = None
line_count = 0

with open('atest.csv') as csv:
    for line in map(str.strip, csv):
        nums = line.split(',')
        if totals is None:
            totals = [0.0] * len(nums)
        for i, v in enumerate(map(float, nums)):
            totals[i] += v
        line_count += 1
for i in range(len(totals)):
    totals[i] /= line_count

print(totals)
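
If the assumption that every line holds the same number of values cannot be guaranteed, it may be safer to fail loudly than to end up with a partly filled result. The following is only a sketch of that idea: the function name column_means is made up for illustration, blank lines are skipped by assumption, and every field is assumed to parse as a float.

```python
def column_means(path):
    """Return the mean of each comma-separated column, reading one line at a time."""
    totals = None
    line_count = 0
    with open(path) as csv_file:
        for line_number, line in enumerate(csv_file, start=1):
            line = line.strip()
            if not line:
                continue  # assumption: blank lines carry no data and are skipped
            nums = [float(v) for v in line.split(',')]
            if totals is None:
                # The first data line fixes the expected number of columns.
                totals = [0.0] * len(nums)
            elif len(nums) != len(totals):
                # A short or long row is reported instead of silently accepted.
                raise ValueError(
                    f"line {line_number} has {len(nums)} values, expected {len(totals)}"
                )
            for i, v in enumerate(nums):
                totals[i] += v
            line_count += 1
    if totals is None:
        return []  # no data lines at all
    return [t / line_count for t in totals]
```

Calling column_means('atest.csv') would then either return one mean per column or raise a ValueError naming the first line whose length differs from the first data line.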

CodePudding user response:

Thank you very much for all your answers. I checked the input file again and, strangely, the last row didn't have 5k columns like the others, so this was an error made while measuring the data. Thank you, and sorry for such a silly mistake on my part.
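
For anyone who runs into the same symptom, a quick way to spot a row like that is to count the fields on every line before computing anything. This is a rough sketch, not part of the original code: the name field_count_report is hypothetical, and it assumes a plain comma-separated file with no quoted commas.

```python
from collections import Counter


def field_count_report(path):
    """Count how many lines have each number of comma-separated fields.

    Returns (counts, first_seen): counts maps a field count to the number
    of lines with that count, and first_seen maps a field count to the
    1-based number of the first line where it occurred.
    """
    counts = Counter()
    first_seen = {}
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            n = len(line.rstrip('\n').split(','))
            counts[n] += 1
            first_seen.setdefault(n, line_number)
    return counts, first_seen
```

If counts ends up with more than one key, the file is ragged, and first_seen shows where each unexpected row length first appears.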
