Selecting certain columns from a CSV file

Time:05-20

I'm having trouble selecting certain columns from a CSV file. Any suggestions? Here is my code.

import numpy
import csv


def load_metrics(filename):
    
    """A function to analyse and extract csv files"""
    
    data_list = []
    final_data = []
    with open(filename) as csvfile:
        
        file = csv.reader(csvfile)
        for row in file:
            data_list.append(row)
            
    for data in data_list:
        data_a = data[0:1]
        data_b = data[7:13]
        data_a.append(data_b)
        final_data.append(data_a)
        
    data = numpy.array(final_data)
    return data

CodePudding user response:

I suggest using pandas:

import pandas as pd

df = pd.read_csv('yourfile.csv')

data = df.iloc[:, [0, 1] + list(range(7, 13))]  # columns 0, 1 and 7-12; use [0] alone to match data[0:1]

data_numpy = data.to_numpy()  # convert to a NumPy array

You can also select by column name, for example:

df.loc[:, ['name', 'age']]

CodePudding user response:

Your code has several problems. In this line:

        data_a.append(data_b)

You probably expect data[0:1] and data[7:13] to be combined into a single flat list. What actually happens is that append adds data_b as one nested element, producing [data[0], [data[7], data[8], ..., data[12]]]. What you wanted there is:

        data_a.extend(data_b)
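
To see the difference in isolation, here is a minimal sketch. The 13-element list is made-up sample data standing in for one row of the CSV file; it is not taken from the question.

```python
# Hypothetical row with 13 fields (indices 0-12), standing in for one CSV row.
data = ["id", "a", "b", "c", "d", "e", "f", "m1", "m2", "m3", "m4", "m5", "m6"]

appended = data[0:1]
appended.append(data[7:13])   # adds the whole slice as ONE nested element

extended = data[0:1]
extended.extend(data[7:13])   # adds each item of the slice individually

print(appended)  # ['id', ['m1', 'm2', 'm3', 'm4', 'm5', 'm6']]
print(extended)  # ['id', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6']
```

The nested result has only two elements per row, which is why the resulting numpy array does not have the expected shape.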

You do this one line at a time, collecting the results in a list of lists, and then turn this into a numpy array.

That works, but it is overly complicated. You could also write:

import csv
import numpy

def load_metrics(filename):
    with open(filename) as csvfile:
        file = csv.reader(csvfile)

        # keep column 0 and columns 7-12 of every row
        data = [[row[0], *row[7:13]] for row in file]

    return numpy.array(data)

data = load_metrics('data.csv')
print(data)

Or just:

import numpy

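# genfromtxt parses every field as a float; non-numeric fields become nan
data = numpy.genfromtxt('data.csv', delimiter=',')
# drop columns 1-6 (axis 1); column 0 and everything from column 7 on remain
data = numpy.delete(data, range(1, 7), 1)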

print(data)

(All of this assumes your file has no header row.)
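
If the file does have a header row, one option is to read that row separately before processing the data rows. The following is only a sketch using the standard csv module: the function name load_metrics_with_header, the has_header parameter, and the sample data are all hypothetical, and it assumes every kept field is numeric. Because csv.reader yields strings, the values are converted to float explicitly.

```python
import csv
import io

# Hypothetical sample data: a header row followed by two data rows.
SAMPLE = (
    "id,c1,c2,c3,c4,c5,c6,m1,m2,m3,m4,m5,m6\n"
    "1,0,0,0,0,0,0,1.5,2.5,3.5,4.5,5.5,6.5\n"
    "2,0,0,0,0,0,0,7.0,8.0,9.0,10.0,11.0,12.0\n"
)

def load_metrics_with_header(csvfile, has_header=True):
    """Return (header, rows), keeping column 0 and columns 7-12."""
    reader = csv.reader(csvfile)
    header = None
    if has_header:
        first = next(reader)               # consume the header row
        header = [first[0], *first[7:13]]
    # csv.reader yields strings, so convert each kept field to float
    rows = [[float(v) for v in (row[0], *row[7:13])] for row in reader]
    return header, rows

header, rows = load_metrics_with_header(io.StringIO(SAMPLE))
print(header)
print(rows)
```

With a real file, pass the object returned by open(filename, newline='') instead of the io.StringIO wrapper; the resulting rows can then be handed to numpy.array as before.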
