How to calculate the median of a list of numbers from a CSV file?

Time:10-21

I am writing a short function to find the median of a list of numbers.
Here is a portion of the CSV file:

Index,Height(Inches),Weight(Pounds)
1,65.78331,112.9925
2,71.51521,136.4873
3,69.39874,153.0269
4,68.2166,142.3354
5,67.78781,144.2971
6,68.69784,123.3024
7,69.80204,141.4947
8,70.01472,136.4623
9,67.90265,112.3723
10,66.78236,120.6672
11,66.48769,127.4516
12,67.62333,114.143
13,68.30248,125.6107
14,67.11656,122.4618
15,68.27967,116.0866
16,71.0916,139.9975
17,66.461,129.5023
18,68.64927,142.9733
19,71.23033,137.9025
20,67.13118,124.0449
21,67.83379,141.2807
22,68.87881,143.5392
23,63.48115,97.90191
24,68.42187,129.5027
25,67.62804,141.8501
26,67.20864,129.7244
27,70.84235,142.4235

Can someone help me with this? I have also tried using Counter to count the number of items. I want to find the median of the third column. My existing function is:

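def median():
    # file_data is assumed to be a list of the third-column values (weights in pounds)
    n = len(file_data)
    file_data.sort()
    if n % 2 == 0:
        # Even count: average the two middle values
        median1 = file_data[n//2]
        median2 = file_data[n//2 - 1]
        median = (median1 + median2) / 2
        mediankg = median / 2.2046
    else:
        # Odd count: take the single middle value
        median = file_data[n//2]
        mediankg = median / 2.2046
    print("MEDIAN")
    print("Median is " + str(median) + " pounds")
    print("OR")
    print("Median is " + str(mediankg) + " kilograms")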


median()

CodePudding user response:

Use pandas to read your CSV file, then call median() on the third column.

CodePudding user response:

You should use pandas for this, as it has built-in functions for these tasks. The installation documentation is here: https://pandas.pydata.org/docs/getting_started/install.html

The following code should be enough to get you started:

import pandas as pd

# The first line of the file is the header row, so let pandas use it for the column names
df = pd.read_csv('/path/to/file.csv')

median_col = df.median(axis=0) # Median of each column
median_row = df.median(axis=1) # Median of each row

median_col_3 = median_col.iloc[2] # Median of 3rd column, Weight(Pounds)

CodePudding user response:

This should work if you don't want to use pandas:

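import csv

csvfile = '/path/to/file.csv'  # placeholder path, replace with your own file
with open(csvfile, newline='') as f:
    reader = csv.reader(f, delimiter=',')
    next(reader)  # skip the header row
    file_data = [float(row[2]) for row in reader]  # third column

n = len(file_data)
file_data.sort()
median = file_data[n//2]
if n%2==0:
    # Even count: average the two middle values
    median = (median + file_data[n//2-1])/2
mediankg1 = median/2.2046
print("MEDIAN")
print("Median is " + str(median) + " pounds")
print("OR")
print("Median is " + str(mediankg1) + " kilograms")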
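
If you would rather not sort and index by hand, the standard library's statistics.median handles both the odd and even cases. The sketch below is only an illustration: the helper name median_weight and the in-memory sample (the first four rows from the question) are assumptions made for the example, and the column name is taken from the header shown in the question.

```python
import csv
import io
import statistics

POUNDS_PER_KG = 2.2046  # same conversion factor as in the question


def median_weight(csv_file, column="Weight(Pounds)"):
    """Return the median of one numeric column of an open CSV file.

    Hypothetical helper for illustration; `column` must match the header text.
    """
    reader = csv.DictReader(csv_file)  # the header row supplies the column names
    values = [float(row[column]) for row in reader]
    return statistics.median(values)  # averages the two middle values if the count is even


# In-memory stand-in for the real file: the first four rows from the question.
sample = io.StringIO(
    "Index,Height(Inches),Weight(Pounds)\n"
    "1,65.78331,112.9925\n"
    "2,71.51521,136.4873\n"
    "3,69.39874,153.0269\n"
    "4,68.2166,142.3354\n"
)

median_lb = median_weight(sample)
print("Median is " + str(median_lb) + " pounds")
print("Median is " + str(median_lb / POUNDS_PER_KG) + " kilograms")
```

With a real file, you would pass the result of open(path, newline='') instead of the StringIO object. Looking the column up by name rather than by position also keeps the code working if the column order changes.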