Implementing a progress bar with alive_progress while reading a CSV file

Time:10-02

I am trying to add a progress bar to my code while it reads my CSV file (and I would like to add one to my other functions too).

However, I am not sure how to apply it to my reading code: the bar keeps animating and never finishes.

import pandas as pd
from alive_progress import alive_bar
import time

with alive_bar(100, theme='ascii') as bar:

    file = pd.read_csv('file.csv', 
                        sep = ';', 
                        skiprows = 56,
                        parse_dates = [['Date','Time']])
    bar()

Also, how would I apply a progress bar to a for loop?

CodePudding user response:

How do I add a progress bar to this?

In general with progress bars you need some way of adding a hook to the actual read loop. In this case I would simply not bother: if you're going to use a high-level library like pandas, presumably it's because you don't want to manage the whole reading-parsing loop yourself.
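
To make "a hook into the read loop" concrete, here is a minimal sketch that reads a delimited file with the standard-library csv module and calls a callback after every row. The names read_rows and on_row are made up for this illustration; with alive_progress, on_row could be the bar function itself. A single pd.read_csv call does not give you an equivalent point at which to advance a bar, which is why the code in the question cannot drive one.

```python
import csv
import os
import tempfile


def read_rows(path, on_row, delimiter=";"):
    """Read a delimited file row by row, calling on_row() after each row.

    read_rows and on_row are hypothetical names used only for this sketch.
    """
    rows = []
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            rows.append(row)
            on_row()  # the hook: a progress bar would be advanced here
    return rows


# Demo on a small temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("a;b;c\n1;2;3\n4;5;6\n")
    demo_path = tmp.name

calls = []
demo_rows = read_rows(demo_path, on_row=lambda: calls.append(1))
os.remove(demo_path)
print(f"read {len(demo_rows)} rows, hook called {len(calls)} times")
```

The price is that you now own the parsing loop, which is exactly what a high-level library is meant to spare you.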

How do I use a for loop?

This is much easier. From the docs:

from alive_progress import alive_it

for item in alive_it(items):   # <<-- wrapped items
    print(item)                # process each item

Why doesn't my bar update?

Because you only call bar() once. bar() is the function that advances the bar, and alive_progress isn't magic: if you tell it to expect 100 iterations, it expects you to call bar() 100 times. Each call moves the bar forward by 1/100th, and from the time between calls it estimates how fast you are going and how long you still have to wait.
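
The bookkeeping described here can be sketched with a toy class. This is only a model of the idea, not alive_progress's actual implementation; ToyBar and its methods are hypothetical names, and timestamps are passed in explicitly so the example is deterministic.

```python
class ToyBar:
    """Toy model of progress bookkeeping; not alive_progress's real code."""

    def __init__(self, total, start_time):
        self.total = total
        self.start_time = start_time
        self.count = 0
        self.last_time = start_time

    def __call__(self, now):
        # One call means one unit of work has finished.
        self.count += 1
        self.last_time = now

    @property
    def fraction(self):
        return self.count / self.total

    def eta_seconds(self):
        # Rate = completed units per second so far; ETA = remaining / rate.
        elapsed = self.last_time - self.start_time
        if self.count == 0 or elapsed <= 0:
            return None
        rate = self.count / elapsed
        return (self.total - self.count) / rate


# The situation in the question: 100 units expected, bar called once.
once = ToyBar(total=100, start_time=0.0)
once(now=2.0)
print(once.fraction)  # 0.01

# Calling the bar once per unit of work fills it completely.
full = ToyBar(total=100, start_time=0.0)
for i in range(100):
    full(now=(i + 1) * 0.5)
print(full.fraction)  # 1.0
```

With a single call the bar sits at 1% and still expects 99 more calls, which matches a bar that never ends.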

CodePudding user response:

You'd have to parse the file in chunks and get the number of lines beforehand to calculate the total number of chunks:

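import math
import pandas as pd
from alive_progress import alive_bar

filepath = "file.csv"
chunksize = 5000

# Count the data rows: every line except the header.
# Assumes one record per line (no quoted fields containing newlines).
with open(filepath, 'r') as f:
    num_rows = sum(1 for _ in f) - 1

reader = pd.read_csv(filepath, chunksize=chunksize)

# Round up so the final, partial chunk is counted too.
with alive_bar(math.ceil(num_rows / chunksize)) as bar:
    for chunk in reader:
        process_chunk(chunk)  # placeholder: replace with your own processing
        bar()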

Counting the rows means reading the file twice, so I'd only recommend this if the processing takes much longer than the reading itself and you absolutely have to have a progress bar.
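
The total passed to the bar is the number of data rows divided by the chunk size, rounded up; truncating the division would leave the bar one call short whenever the last chunk is only partly filled. A minimal sketch of that arithmetic, using only the standard library (count_chunks is a hypothetical helper, and it assumes one header line and one record per physical line):

```python
import math
import os
import tempfile


def count_chunks(path, chunksize, header_lines=1):
    """Return how many chunks of `chunksize` data rows the file holds.

    count_chunks is a hypothetical helper; it assumes one record per
    physical line (no quoted fields containing newlines).
    """
    with open(path, "r") as f:
        data_rows = sum(1 for _ in f) - header_lines
    return math.ceil(max(data_rows, 0) / chunksize)


# Demo: 1 header line + 12,001 data rows, read 5,000 rows at a time.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("value\n")
    tmp.writelines(f"{i}\n" for i in range(12001))
    demo_path = tmp.name

n_chunks = count_chunks(demo_path, chunksize=5000)
os.remove(demo_path)
print(n_chunks)            # 3
print(int(12002 / 5000))   # 2, the truncating estimate falls one short
```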
