Split a large xlsx file into multiple csv files using Python

Time:10-20

I am trying to split a single large xls file into multiple csv files using pandas, but only one small file is generated; the remaining files are never created. Here is a sample of the function:

def xls_split_to_csv(sourceFilePath):
    curr_date = datetime.now().strftime("%d-%m-%Y")
    curr_time = datetime.now().strftime("%H-%M-%S")
    try:
        logMessage("--------------------------------------------------------")
        logMessage(f'######### Split engine started - {curr_date}  ##########')
        logMessage("--------------------------------------------------------")
        chunk_size = config_params['split_size']
        batch = 0
        df = pd.read_excel(sourceFilePath)
        o_filename = f'file_{curr_date}_{curr_time}_{batch + 1}.csv'
        file_count = math.ceil(len(df) / chunk_size)
        for chunk in np.array_split(df, file_count):
            logMessage(f'Splitting file ----> ({batch + 1} of {file_count})')
            output_path = os.path.join('../TruncatedFile', o_filename)
            chunk.to_csv(output_path, index=False, header=True)
            batch += 1

    except ZeroDivisionError as zeroEx:
        logError("Exception: ")
    except Exception as ex:
        logError("Exception: ")

CodePudding user response:

Move the line

o_filename = f'file_{curr_date}_{curr_time}_{batch + 1}.csv'

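into your for loop. At the moment the file name is built once, before the loop starts, so it never changes: every chunk is written to the same path and overwrites the previous one, which is why you end up with a single small file.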
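As a minimal sketch of that fix, the function below builds the file name inside the loop so each chunk gets its own path. The name split_dataframe_to_csv and its out_dir and prefix parameters are hypothetical, introduced only for illustration, and the logging and config lookup from the question are left out. It also slices with df.iloc in fixed-size steps rather than calling np.array_split, which is a swapped-in technique: every file except possibly the last then holds exactly chunk_size rows, whereas np.array_split produces roughly equal parts.

```python
import math
import os

import pandas as pd


def split_dataframe_to_csv(df, out_dir, chunk_size, prefix="file"):
    """Write df to several CSV files of at most chunk_size rows each.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    # Raises ZeroDivisionError if chunk_size is 0, as in the question's code.
    file_count = math.ceil(len(df) / chunk_size)
    written = []
    for batch in range(file_count):
        start = batch * chunk_size
        chunk = df.iloc[start:start + chunk_size]
        # The file name is rebuilt on every iteration, so it changes per chunk.
        o_filename = f"{prefix}_{batch + 1}.csv"
        output_path = os.path.join(out_dir, o_filename)
        chunk.to_csv(output_path, index=False, header=True)
        written.append(output_path)
    return written
```

With the question's variables, a call might look like split_dataframe_to_csv(pd.read_excel(sourceFilePath), '../TruncatedFile', chunk_size, prefix=f'file_{curr_date}_{curr_time}').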
