Home > Back-end >  Having issues with my excel to csv conversion in aws lambda using the AWSDatawrangler layer
Having issues with my excel to csv conversion in aws lambda using the AWSDatawrangler layer

Time:08-27

I have a function that reads an excel file into a dataframe and then I save that dataframe in an s3 bucket using the awswrangler api to_csv function. The excel file has data starting from different rows and columns.

My conversion code looks something like this:

df = wr.s3.read_excel(
              path = 's3://bucket/filepath/', 
              sheet_name = 'sheetname', 
              header = 5, 
              index_col = False).iloc[:, 3:]

wr.s3.to_csv(df, path = "s3://bucket/filepath/', 
                 dataset = True, 
                 filename_prefix = 'sheetname')

The code works properly but I am having 2 issues:

  1. The csv outputs an additional index column even though I have specified index_col = False
  2. The file csv output name comes with additional serial numbers which I dont want. I just want something like sheetname.csv. I looked at the documentation and didn't find any good leads on how I can do that.

The current output looks something like sheetname8f1e8wefdf15f4wgfe5fef46we54f.csv

CodePudding user response:

You can specify the full path in the path parameter. You don't need to use filename_prefix at all. Using filename_prefix is why you are getting those extra characters in the filename.

Also, you can disable the index in the output by setting index = False.

wr.s3.to_csv(df, path = "s3://bucket/filepath/sheetname.csv", 
                 index = False)
  • Related