I have been working on a automation program to help me automate repetitive stuff i have to do on excel.
Here is the excel file I am extracting data from :
https://i.stack.imgur.com/f57sf.png
And here is a part of my actual code :
for row in range (4,ws_1.max_row):
B = ws_1["B" str(row)].value #NORMAL
ws_2["B" str(row)].value = B
C = ws_1["C" str(row)].value
ws_2["C" str(row)].value = C
D = ws_1["D" str(row)].value
ws_2["D" str(row)].value = D
Everything is working fine, but my next step is the following : I need to read the values on the column E, in a way that I will iterate through all the row as I am doing now but anytime I come across a 4 on the E column I need to save the actual file where I extract my values to, create a new workbook and keep going from the following row etc until I'm done with all the rows of the excel file.
I hope it makes sense.
I think I should use a If ws_1["N" str(row)].value != 4 : (my previous code)
else :
wb2.save(filename = 'whatever.xlsx')
But by doing so I am not iterating through all the cells of my worksheet, so any help would be really appreciated.
Thanks in advance !
CodePudding user response:
The correct (or at least most complete) approach is the read the data into a dataframe.
import pandas as pd
file = "path/to/file.xlsx"
df = pd.read_excel(...)
Then process the data.
# do things to the dataframe
...
Then write it back out to a file.
pd.to_excel()
CodePudding user response:
we use the pandas to read dataframe, it's easier, you can just read the data frame like
import pandas as pd
data = pd.read_excel('path to the file')
You can put some treatments that you like
data['D'] = 'You can put here the value that you want to'
After that you can just
data.to_excel('path you'll export to')