Every department completes an annual budget in Excel and submits the budget. The individual budgets get rolled up into a single master budget.
I've used file linking Excel formulas in the past but, this can be very time-consuming and prone to human error.
I think this is a perfect job for Python with Pandas (and/or other libraries).
Here is a picture sample data:
Here is what I have tried so far: (edited/cleaned-up a little from the original)
#import libraries
import pandas as pd
import glob
# import excel files
path = '*.xlsx'
files = glob.glob(path)
# loop thru
combined_files = pd.DataFrame()
for i in files:
df = pd.read_excel(i, index_col=None, skiprows=11,
nrows=147, usecols='D:P')
combined_files = combined_files.concat(df)
combined_files.to_excel('output4.xlsx', index=False)
If I run print(files) the files are listed
I've also played around with variations of the "combined_excels" variable but no joy.
Desired output:
A spreadsheet or .csv that has the general ledger description, (ie, "supplies"), in the first column, followed by combined amounts from all files under; Jan, Feb, Mar, etc.
So if dept1 budgets for $100 in supplies in January, dept2 budgets $200 in supplies in January, and dept3 budgets for $400 in supplies in January then the result will say: Supplies: Under January will be: $700.
I will have approximately 65 different Excel files and will need to iterate over the list. Most workbooks have multiple sheets. All of the worksheets have a sheet called, "Budget" and that is where we pulled from.
I removed all supporting sheets from my three sample files so I wouldn't have to deal with that aspect yet, but I will need to add that filter back soon.
I appreciate any help you can provide!
John
CodePudding user response:
Try this code ad your loop thru and concat:
combined_excels = pd.DataFrame()
for i in excels:
df = pd.read_excel(i, index_col=None, skiprows=11, nrows=147, usecols='D:P')
combined_excels = combined_excels.concat(df)
CodePudding user response:
Use the below function after you have read these excel files in pandas:
combined_excels = pd.concat((df1, df2), axis = 0)
if you want to concat it vertically.