python itertools dont load files into memory

Time:06-23

I have somewhat big files and I'm trying to get all combinations with this code:

import itertools
import time

for text1, text2 in itertools.product(open('text1.txt'), open('text2.txt')):
    t3 = text1.strip() + text2.strip()
    time.sleep(1)
    print(t3)

Testing with small files it worked fine, but with big files nothing happens. I'm guessing it's loading the files into memory anyway. How can I do this so it doesn't load the whole file into memory?

CodePudding user response:

This is documented:

Before product() runs, it completely consumes the input iterables, keeping pools of values in memory to generate the products. Accordingly, it is only useful with finite inputs.
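You can see this eager consumption directly: wrap the inputs in generators that log each value they yield, and both are drained as soon as `product()` is constructed, before the first pair is requested. (The `gen` helper and `log` list here are illustrative names, not part of itertools.)

```python
import itertools

log = []  # records the order in which values are pulled from the generators

def gen(label, items):
    """A generator that logs each value as it is yielded."""
    for x in items:
        log.append((label, x))
        yield x

# product() pulls every value from both inputs immediately on
# construction, before the first pair is even requested:
pairs = itertools.product(gen("a", [1, 2]), gen("b", [3, 4]))
print(log)          # both generators have already been fully consumed
print(list(pairs))  # the pairs themselves are generated lazily from the pools
```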

Note, in this particular case, you may be able to do something like:

with open("text1.txt") as f1, open("text2.txt") as f2:
    for text1 in f1:
        for text2 in f2:
            # do some stuff
            t3 = text1.strip() + text2.strip()
        f2.seek(0) # reset inner file cursor

This is possible due to the nature of file iterators - you can just seek to the beginning and the iterator is effectively reset (and this is nice and efficient too!). But this won't work with iterables or iterators in general, so itertools.product handles the general case by simply reifying the iterators into in-memory pools.
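For inputs that aren't seekable files, the same nested-loop idea still works if you can recreate the inner iterable for each pass. Here is a sketch of that pattern; `lazy_product` and `inner_factory` are names invented for this example, not itertools APIs:

```python
def lazy_product(outer, inner_factory):
    """Yield the same pairs as itertools.product(outer, inner_factory()),
    without materializing either input in memory.

    inner_factory must be a zero-argument callable returning a *fresh*
    iterable on each call (e.g. lambda: open("text2.txt")). Note that if
    the factory opens a file, each handle is only closed when garbage
    collected; wrap the calls in context managers for strict cleanup.
    """
    for a in outer:
        for b in inner_factory():
            yield a, b

# Example with plain in-memory iterables:
print(list(lazy_product(iter([1, 2]), lambda: [3, 4])))
```

The trade-off is that the inner input is re-produced once per outer item, so this only pays off when recreating it (reopening a file, re-running a query) is cheaper than holding it all in memory.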
