I want to import a .txt file with n rows and 25 columns. Because there are 10,000,000 rows, I want to read the file row by row, keep only the first 7 of the 25 columns, and append each 7-column row to a new list. This is what I tried so far, but it did not work:
results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        results.append(row[:,[0,1,2,3,4,5,6,7]])
print(results)
Error:
TypeError: list indices must be integers or slices, not tuple
The data is available at the link below, but note that the dataset is 2 GB: http://download.geonames.org/export/dump/allCountries.zip
Thank you for the help!
CodePudding user response:
You're trying to slice, but using the wrong syntax.
From the docs:
While indexing is used to obtain individual characters, slicing allows you to obtain substring: [example omitted] Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.
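The quoted passage talks about strings, but lists slice the same way. Here is a minimal sketch, using a made-up row rather than real data from the file, of why `row[:, [...]]` fails on a plain list while `row[:7]` works:

```python
# Hypothetical 10-column row standing in for one parsed line of the file.
row = [f"col{i}" for i in range(10)]

# A slice with an omitted start returns the items at index 0 through 6.
first_seven = row[:7]

try:
    # NumPy-style indexing hands the list a tuple (slice, list) as its index.
    row[:, [0, 1, 2]]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first_seven)
print(error_message)
```

The comma inside the brackets is what turns the index into a tuple; that syntax is valid for NumPy arrays and pandas objects, not for built-in lists.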
I didn't download your 2 GB dataset to check, and you didn't provide a sample row, but from the error it looks like each row is a list in which each item represents a column. If that is the case, try:
import csv

results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        # Keep only the first 7 columns (indices 0-6) of each row.
        results.append(row[:7])
print(results)
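One more thing to check: `csv.reader` splits on commas by default. If the file turns out to be tab-delimited (an assumption on my part, since no sample row was posted), each row would come back as a single item unless you pass `delimiter='\t'`. A rough sketch along those lines, where the helper name `first_columns`, the tab delimiter, and the UTF-8 encoding are all my assumptions and not something confirmed by the question:

```python
import csv


def first_columns(path, n=7, delimiter="\t"):
    """Yield the first n fields of each row (hypothetical helper)."""
    # Assumes the file is UTF-8 encoded and delimited by `delimiter`.
    with open(path, newline="", encoding="utf-8") as inputfile:
        for row in csv.reader(inputfile, delimiter=delimiter):
            # Rows shorter than n fields are returned as they are.
            yield row[:n]
```

Calling `results = list(first_columns('allCountries.txt'))` would build the same list as above. Because the function is a generator, you could also loop over it directly and process one row at a time without holding all 10,000,000 rows in memory.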