Assigning attributes from CSV

Time:06-16

I have a CSV file with the columns name, capital, continent, and difficulty, like so:

name,capital,continent,difficulty
Afghanistan,Kabul,Asia,3
China,Bejing,Asia,1

Here is my code:

import csv

class Country:
    def __init__(self, name, capital, continent, difficulty):
        self.name = name
        self.capital = capital
        self.continent = continent
        self.difficulty = difficulty

CountryArray = []
file = open('Countries.csv', 'r')
for row in file:
    SpecificCountry = Country(row['name'], row['capital'], row['continent'], row['difficulty'])
print(CountryArray)

The idea is to turn every country in my large CSV file into a Country object with these attributes, for later use in a quiz game. The error I get is:

TypeError: string indices must be integers

I am quite new to working with CSV files, so I don't understand the error.

CodePudding user response:

This line is the problem:

for row in file:

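Iterating over a file object yields one line of text at a time, so row is a string. Indexing a string with another string, as in row['name'], is what raises the TypeError.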
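
To see this for yourself, here is a minimal sketch that reproduces the error. It assumes an in-memory io.StringIO as a stand-in for open('Countries.csv'), purely so the snippet runs without the real file; the exact wording of the message varies slightly between Python versions.

```python
import io

# Stand-in for open('Countries.csv', 'r'); assumed here only to keep the sketch self-contained.
file = io.StringIO("name,capital,continent,difficulty\nAfghanistan,Kabul,Asia,3\n")

rows = list(file)              # iterating a text file yields one str per line
first = rows[0]
print(type(first).__name__)    # the row is a plain string
print(repr(first))             # note the trailing newline

error_message = None
try:
    first['name']              # indexing a str with a str, as in the question
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```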

You wanted something like this:

import csv

with open('Countries.csv', 'r') as f:
    for row in csv.reader(f):
        ...

The with statement is an improvement that ensures the file is closed once the with block completes. It's not part of the answer per se, but it is a good habit to develop.

csv.reader() takes a file object and returns an iterator that yields the records in the CSV file one at a time, each as a list of strings.
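
As a quick illustration of what the reader hands back, here is a small sketch using the two sample rows from the question. The io.StringIO object is an assumption standing in for the opened file.

```python
import csv
import io

# In-memory stand-in for the opened Countries.csv (assumed for illustration).
sample = io.StringIO(
    "name,capital,continent,difficulty\n"
    "Afghanistan,Kabul,Asia,3\n"
    "China,Bejing,Asia,1\n"
)

records = list(csv.reader(sample))  # each record is a list of strings

for record in records:
    print(record)
```

Note that the header comes back as an ordinary record, and that the difficulty values are strings, not numbers.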

You appear to want to access the rows by their column name, not by their index (and you're probably not interested in the header as a data row), so this is a bit more complete:

country_array = []
with open('Countries.csv', 'r', newline='') as f:
    cr = csv.reader(f)
    header = next(cr)  # the first record holds the column titles
    # position of each wanted column, in the order Country.__init__ expects
    indices = [header.index(col) for col in ('name', 'capital', 'continent', 'difficulty')]
    for row in cr:
        country_array.append(Country(*(row[i] for i in indices)))

But at that point things are getting complicated enough that you might consider using pandas' read_csv() instead. (As @mkrieger1 correctly points out, a simpler solution that stays within the csv module is csv.DictReader. There are many ways to skin a cat.)

If you know the positions of the columns (let's say they are in order), this is simpler:

with open('Countries.csv', 'r') as f:
    next(f)  # skip the header
    for row in csv.reader(f):
        country_array.append(Country(row[0], row[1], row[2], row[3]))

CodePudding user response:

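Here's a simple way of doing it using @mkrieger1's suggestion of a csv.DictReader. Each row comes back as a dict keyed by column title, and since the titles match the parameter names of __init__, Country(**row) works directly. Note that the __repr__() method I added to your class isn't required; it only makes printing an instance show the values of its attributes.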

import csv

class Country:
    def __init__(self, name, capital, continent, difficulty):
        self.name = name
        self.capital = capital
        self.continent = continent
        self.difficulty = difficulty

    def __repr__(self):
        classname, name, capital, continent, difficulty = (
            type(self).__name__, self.name, self.capital, self.continent,
            int(self.difficulty))
        return f'{classname}({name=!r}, {capital=!r}, {continent=!r}, {difficulty=!r})'

CountryArray = []
with open('Countries.csv', 'r', newline='') as file:
    for row in csv.DictReader(file):
        CountryArray.append(Country(**row))

for country in CountryArray:
    print(country)

Sample output:

Country(name='Afghanistan', capital='Kabul', continent='Asia', difficulty=3)
Country(name='China', capital='Bejing', continent='Asia', difficulty=1)
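
One thing to watch for in the quiz game: the __repr__() above converts difficulty to int only for display, so the attribute itself is still a string. A possible variation, sketched here under the assumption that you want to compare difficulties numerically, converts it once in __init__. The load_countries() helper and the in-memory sample are hypothetical, added only for illustration.

```python
import csv
import io

class Country:
    def __init__(self, name, capital, continent, difficulty):
        self.name = name
        self.capital = capital
        self.continent = continent
        self.difficulty = int(difficulty)  # csv values arrive as strings

def load_countries(file):
    """Hypothetical helper: build one Country per data row of an open CSV file."""
    return [Country(**row) for row in csv.DictReader(file)]

# In-memory stand-in for open('Countries.csv', 'r', newline='').
sample = io.StringIO(
    "name,capital,continent,difficulty\n"
    "Afghanistan,Kabul,Asia,3\n"
    "China,Bejing,Asia,1\n"
)

countries = load_countries(sample)
easy = [c.name for c in countries if c.difficulty <= 1]
print(easy)
```

Keep in mind that Country(**row) only works while the CSV has exactly the columns that __init__ accepts; an extra column would raise a TypeError about an unexpected keyword argument.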