Import .txt file row-wise but only import given columns

Time:05-05

I want to import a .txt file with n rows and 25 columns. Since there are 10,000,000 rows, I want to read the file row by row, keep only the first 7 of the 25 columns, and append each 7-column row to a new list. This is what I tried so far, but it did not work:

results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        results.append(row[:,[0,1,2,3,4,5,6,7]])
print(results)

Error:

TypeError: list indices must be integers or slices, not tuple

The data is available at the following link, but the dataset is 2 GB: http://download.geonames.org/export/dump/allCountries.zip

Thank you for the help!

CodePudding user response:

You're trying to slice, but using the wrong syntax.

From the docs:

While indexing is used to obtain individual characters, slicing allows you to obtain substring: [example omitted] Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.
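To make the difference concrete, here is a small sketch using a made-up list in place of a real row (the values are placeholders, not taken from the dataset). It shows that the NumPy-style index `row[:, [...]]` is passed to the list as a tuple, which a plain list rejects, while an ordinary slice works:

```python
# Hypothetical row standing in for one parsed line of the file.
row = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]

# row[:, [0, 1, 2]] hands the list a tuple (slice, list) as its index.
# Plain Python lists accept only integers or slices, so this raises TypeError.
try:
    row[:, [0, 1, 2]]
except TypeError as exc:
    error_message = str(exc)

# A plain slice returns the first seven items (indices 0 through 6).
first_seven = row[:7]
print(error_message)
print(first_seven)
```

Note that the list `[0,1,2,3,4,5,6,7]` in the question names eight indices, whereas `row[:7]` keeps seven items, which matches the stated goal of keeping the first 7 columns.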

I didn't download your 2 GB dataset to check, and you didn't provide a sample row, but from the error it looks like each row is a list in which each item represents one column. If that is the case, slice each row with row[:7] to keep its first 7 items:

import csv

results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        # Keep only the first 7 columns of each row
        results.append(row[:7])
print(results)
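One caveat: csv.reader splits on commas by default. The sketch below assumes the file is tab-delimited and that quote characters in the data should be treated literally; both are assumptions about the dataset's format, not something confirmed in this thread. The helper name `first_columns` and the sample rows are made up for illustration, and an in-memory `io.StringIO` stands in for the real file so the example runs on its own:

```python
import csv
import io


def first_columns(lines, n=7, delimiter="\t"):
    """Yield the first n fields of each row, one row at a time.

    Assumes tab-separated input with no quoting (an assumption about the data).
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row[:n]


# Made-up sample with 9 tab-separated columns per row, standing in for the file.
sample = io.StringIO(
    "1\ta\tb\tc\td\te\tf\tg\th\n"
    "2\tA\tB\tC\tD\tE\tF\tG\tH\n",
    newline="",
)

results = list(first_columns(sample))
print(results)
```

With the real file, the same generator could be fed from `open('allCountries.txt', newline='', encoding='utf-8')`. Because it yields one row at a time, each row can be processed or written out immediately; collecting all 10,000,000 rows in a list, as in the original code, may use a lot of memory.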