Troubles creating an if loop inside a for loop to create multiple excel sheets with openpyxl-CodePudding

I have been working on a automation program to help me automate repetitive stuff i have to do on excel.

Here is the excel file I am extracting data from :

https://i.stack.imgur.com/f57sf.png

And here is a part of my actual code :

for row in range (4,ws_1.max_row):

        B = ws_1["B"   str(row)].value #NORMAL
        ws_2["B"   str(row)].value = B 
        C = ws_1["C"   str(row)].value 
        ws_2["C"   str(row)].value = C 
        D = ws_1["D"   str(row)].value 
        ws_2["D"   str(row)].value = D

Everything is working fine, but my next step is the following : I need to read the values on the column E, in a way that I will iterate through all the row as I am doing now but anytime I come across a 4 on the E column I need to save the actual file where I extract my values to, create a new workbook and keep going from the following row etc until I'm done with all the rows of the excel file.

I hope it makes sense.

I think I should use a If ws_1["N" str(row)].value != 4 : (my previous code)

else :

wb2.save(filename = 'whatever.xlsx')

But by doing so I am not iterating through all the cells of my worksheet, so any help would be really appreciated.

Thanks in advance !

CodePudding user response：

The correct (or at least most complete) approach is the read the data into a dataframe.

import pandas as pd

file = "path/to/file.xlsx"
df = pd.read_excel(...)

Then process the data.

# do things to the dataframe
...

Then write it back out to a file.

pd.to_excel()

CodePudding user response：

we use the pandas to read dataframe, it's easier, you can just read the data frame like

import pandas as pd
data = pd.read_excel('path to the file')

You can put some treatments that you like

data['D'] = 'You can put here the value that you want to'

After that you can just

data.to_excel('path you'll export to')