I'm having trouble selecting certain columns from a CSV file. Any suggestions? Here is my code:
import numpy
import csv

def load_metrics(filename):
    """A function to analyse and extract csv files"""
    data_list = []
    final_data = []
    with open(filename) as csvfile:
        file = csv.reader(csvfile)
        for row in file:
            data_list.append(row)
    for data in data_list:
        data_a = data[0:1]
        data_b = data[7:13]
        data_a.append(data_b)
        final_data.append(data_a)
    data = numpy.array(final_data)
    return data
CodePudding user response:
I suggest using pandas:
import pandas as pd
df = pd.read_csv('yourfile.csv')
data = df.iloc[:, [0] + list(range(7, 13))]  # column 0 plus columns 7 to 12, as in the question
data_numpy = data.to_numpy()  # convert to a NumPy array
You can also select by column name, for example:
df.loc[:, ['name', 'age']]
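Note that read_csv treats the first row as column names by default. If the file has no header row, a minimal sketch could look like the following. It uses made-up in-memory data in place of 'yourfile.csv', and the sample values are assumptions for illustration only:

```python
import io
import pandas as pd

# Hypothetical headerless CSV content with 13 columns (made-up values).
csv_text = (
    "0,1,2,3,4,5,6,7,8,9,10,11,12\n"
    "100,101,102,103,104,105,106,107,108,109,110,111,112\n"
)

wanted = [0] + list(range(7, 13))  # column 0 plus columns 7 to 12

# header=None: do not treat the first row as column names.
# usecols: only parse the columns we actually want.
df = pd.read_csv(io.StringIO(csv_text), header=None, usecols=wanted)
data_numpy = df.to_numpy()

print(data_numpy.shape)
```

With a real file, pass the filename instead of the io.StringIO object.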
CodePudding user response:
Your code has several problems. In this line:
data_a.append(data_b)
You probably expect data[0:1] and data[7:13] to be put together in a single flat list. What actually happens is that you get a nested list: [data[0], [data[7], data[8], ..., data[12]]]. What you wanted there is:
data_a.extend(data_b)
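To see the difference concretely, here is a small sketch with a made-up row of 13 values (the sample row is an assumption for illustration, not taken from the original file):

```python
# Hypothetical row with 13 fields, standing in for one line of the CSV.
row = [str(i) for i in range(13)]

nested = row[0:1]
nested.append(row[7:13])  # adds the whole slice as ONE element

flat = row[0:1]
flat.extend(row[7:13])    # adds each item of the slice separately

print(nested)  # ['0', ['7', '8', '9', '10', '11', '12']]
print(flat)    # ['0', '7', '8', '9', '10', '11', '12']
```

The nested version has two elements of different kinds (a string and a list), so numpy cannot turn a list of such rows into a regular 2-D array.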
You do this one row at a time, collecting the results in a list of lists, and then turn that into a numpy array.
That works, but it is overly complicated. You could also write:
import csv
import numpy

def load_metrics(filename):
    with open(filename) as csvfile:
        reader = csv.reader(csvfile)
        # keep column 0 and columns 7 to 12 of every row
        data = [[row[0], *row[7:13]] for row in reader]
    return numpy.array(data)

data = load_metrics('data.csv')
print(data)
Or just:
import numpy
data = numpy.genfromtxt('data.csv', delimiter=',')  # parses values as floats
data = numpy.delete(data, range(1, 7), 1)  # drop columns 1 to 6 (axis 1)
print(data)
(All of this assumes your file has no column header.)
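If the file does have a header row, genfromtxt can skip it with skip_header, and usecols can pick the wanted columns directly instead of deleting the others. A sketch, assuming all selected columns are numeric and using made-up in-memory data rather than a real 'data.csv':

```python
import io
import numpy

# Hypothetical CSV with a header row and 13 numeric columns (made-up values).
csv_text = (
    "c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12\n"
    "0,1,2,3,4,5,6,7,8,9,10,11,12\n"
    "100,101,102,103,104,105,106,107,108,109,110,111,112\n"
)

wanted = (0, *range(7, 13))  # column 0 plus columns 7 to 12

data = numpy.genfromtxt(
    io.StringIO(csv_text),
    delimiter=',',
    skip_header=1,   # ignore the first line (the column names)
    usecols=wanted,  # read only the selected columns
)

print(data.shape)
```

Keep in mind that genfromtxt converts values to floats by default, so a text column would come back as nan.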