Multiple list comprehensions return empty list with context manager

Time:11-18

I'm reading a gzip-compressed CSV file and would like to extract only specific columns without using pandas. My current code returns a populated list only for the first list comprehension; the following ones come back empty. How can I extract multiple columns while using a context manager?

Input file:

col1,col2,col3
1,2,3
a,b,c

My code:

import gzip
import csv
import codecs

with gzip.open(r"myfile.csv.gz", "r") as f:
    content = csv.reader(codecs.iterdecode(f, "utf-8"))

    col_2 = [row[1] for row in content] # Returns [2, "b"]
    col_3 = [row[2] for row in content] # Returns []

Expected output:

col_2: [2, "b"]
col_3: [3, "c"]

Answer:

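The issue is not the context manager. csv.reader returns an iterator, and an iterator can only be consumed once: the first list comprehension reads every row, so the second one finds nothing left.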
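To see the exhaustion in isolation, here is a minimal sketch that uses an in-memory string in place of the gzipped file (the sample data is copied from the question; io.StringIO is only a stand-in for illustration):

```python
import csv
import io

# In-memory stand-in for the file in the question (assumed sample data).
sample = io.StringIO("col1,col2,col3\n1,2,3\na,b,c\n")
content = csv.reader(sample)

first_pass = [row[1] for row in content]   # consumes every row
second_pass = [row[2] for row in content]  # nothing left to read

print(first_pass)   # ['col2', '2', 'b']
print(second_pass)  # []
```

The same thing happens with or without a with block, which is why the context manager is not the culprit.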

You can duplicate it using itertools.tee:

import gzip
import csv
import codecs
from itertools import tee

with gzip.open(r"myfile.csv.gz", "r") as f:
    content = csv.reader(codecs.iterdecode(f, "utf-8"))

    c1, c2 = tee(content) # from now on, do not use content anymore

    col_2 = [row[1] for row in c1]
    col_3 = [row[2] for row in c2]

Output:

>>> col_2
['col2', '2', 'b']

>>> col_3
['col3', '3', 'c']
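Note that the output above includes the header row, and every value is a string, whereas the question's expected output has no header. One way to drop the header is to consume it with next() before calling tee. The sketch below assumes the same sample data and builds the gzip payload in memory so it runs without myfile.csv.gz; opening in text mode ("rt") is a swapped-in alternative to codecs.iterdecode, not what the original answer uses:

```python
import csv
import gzip
import io
from itertools import tee

# Build a small gzip payload in memory; it stands in for myfile.csv.gz.
payload = gzip.compress(b"col1,col2,col3\n1,2,3\na,b,c\n")

with gzip.open(io.BytesIO(payload), "rt", encoding="utf-8", newline="") as f:
    content = csv.reader(f)
    header = next(content)   # consume the header row before duplicating
    c1, c2 = tee(content)    # from here on, use c1 and c2, not content

    col_2 = [row[1] for row in c1]
    col_3 = [row[2] for row in c2]

print(header)  # ['col1', 'col2', 'col3']
print(col_2)   # ['2', 'b']
print(col_3)   # ['3', 'c']
```

If numeric values are needed, they have to be converted explicitly, since csv.reader yields strings.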

Using a classical loop

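A better approach, however, is a plain for loop that fills both lists in a single pass. This avoids iterating over the rows twice, and it avoids the internal buffering that tee needs when one copy is consumed completely before the other: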

import gzip
import csv
import codecs

with gzip.open(r"myfile.csv.gz", "r") as f:
    content = csv.reader(codecs.iterdecode(f, "utf-8"))

    col_2 = []
    col_3 = []
    for row in content: # single pass over the reader
        col_2.append(row[1])
        col_3.append(row[2])
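If keeping the list comprehensions matters more than streaming, another option (not part of the original answer) is to materialize the reader into a list first; a list can be iterated any number of times. This is a minimal sketch with the question's sample data held in memory, and it assumes the file is small enough to fit in memory:

```python
import csv
import io

# In-memory stand-in for the decoded file contents (assumed sample data).
sample = io.StringIO("col1,col2,col3\n1,2,3\na,b,c\n")

rows = list(csv.reader(sample))  # read every row once and keep them

col_2 = [row[1] for row in rows]
col_3 = [row[2] for row in rows]

print(col_2)  # ['col2', '2', 'b']
print(col_3)  # ['col3', '3', 'c']
```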