Convert JSON to CSV and then upload the same file to an S3 bucket using Python

Time:03-08

I have some JSON that I converted to a CSV file. Now that same file should be saved to S3 rather than to my local folder.

import boto3

logs = {
    "testing1": "testing1_value",
    "testing2": "testing2_value",
    "testing3": {"testing3a": "testing1_value3a"},
    "testing4": {"testing4a": {"testing4a1": "testing_value4a1"}}
}
file_name = "testing_file.csv"
bucket_name = "testing_bucket"
file_to_save_in_path = "path_in_s3/testing_file.csv"
client = boto3.client("s3")


from fastapi.responses import StreamingResponse
stream = await create_csv_for_download(logs, file_name)
    
response = StreamingResponse(iter([stream]), media_type="text/csv")
response.headers["Content-Disposition"] = f"attachment; filename={file_name}"


client.put_object(Bucket=bucket_name, Key=file_to_save_in_path, Body=response)

client.upload_file(response, bucket_name, file_to_save_in_path)

Now the response prints as something like: <starlette.responses.StreamingResponse object at 0x7fe084e75fd0>

How do I save that response to S3 as a proper CSV file?

When I use client.put_object, I get the error below:

Parameter validation failed:
Invalid type for parameter Body, value: <starlette.responses.StreamingResponse object at 0x7fe084e75fd0>, type: <class 'starlette.responses.StreamingResponse'>, valid types: <class 'bytes'>, <class 'bytearray'>, file-like object

CodePudding user response:

The error message is fairly clear:

Invalid type for parameter Body, value: <starlette.responses.StreamingResponse object at 0x7fe084e75fd0>, 
  type: <class 'starlette.responses.StreamingResponse'>, 
  valid types: <class 'bytes'>, <class 'bytearray'>, file-like object

It's telling you that you can't pass a StreamingResponse object to put_object(); the Body must be bytes, a bytearray, or a file-like object. Assuming your create_csv_for_download() function returns a stream object, you can read the bytes from it and pass those to put_object().
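As a rough illustration of the types that Body accepts, here is a minimal sketch of a helper that coerces text, bytes, or an in-memory stream into bytes. The name to_s3_body is hypothetical and not part of boto3; it only assumes that create_csv_for_download() returns a str, bytes, io.StringIO, or io.BytesIO.

```python
import io


def to_s3_body(data):
    """Coerce text, bytes, or an in-memory stream into bytes for Body=.

    Hypothetical helper for illustration; not part of boto3.
    """
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    if isinstance(data, str):
        return data.encode("utf-8")
    if isinstance(data, (io.StringIO, io.BytesIO)):
        # getvalue() returns the whole buffer regardless of cursor position
        return to_s3_body(data.getvalue())
    raise TypeError(f"unsupported body type: {type(data).__name__}")
```

Using getvalue() rather than read() means the result does not depend on where the stream's cursor was left after writing.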

Further, the HTTP headers that you're setting in StreamingResponse should be passed along to put_object() directly:

import boto3
import csv
import io

def create_csv_for_download(logs, filename):
    # Just a stub so this is a self-contained example
    ret = io.StringIO()
    cw = csv.writer(ret)
    for key, value in logs.items():
        cw.writerow([key, str(value)])
    return ret

logs = {
    "testing1": "testing1_value", 
    "testing2": "testing2_value",
    "testing3": {"testing3a": "testing1_value3a"}, 
    "testing4": {"testing4a": {"testing4a1": "testing_value4a1"}}
}

file_name = "testing_file.csv"
bucket_name = "testing_bucket"
file_to_save_in_path = "path_in_s3/testing_file.csv"
client = boto3.client("s3")

stream = create_csv_for_download(logs, file_name)

# Read the body out of the returned stream. Rewind first: after writing,
# the cursor sits at the end and read() would return an empty string.
stream.seek(0)
body = stream.read()
if isinstance(body, str):
    # If the stream yields text, encode it to bytes
    body = body.encode("utf-8")

client.put_object(
    Bucket=bucket_name, 
    Key=file_to_save_in_path, 
    Body=body, 
    ContentDisposition=f"attachment; filename={file_name}", 
    ContentType="text/csv",
    # Uncomment the following line if you want the link to be 
    # publicly  downloadable from S3 without credentials:
    # ACL="public-read",
)
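
The question also tried client.upload_file(), which expects a path to a file on disk. If you would rather hand S3 a file-like object, upload_fileobj() is a possible alternative; it needs a binary stream, and the headers go in ExtraArgs. The sketch below assumes the CSV is already available as a string, and upload_csv_text is a hypothetical wrapper name, not a boto3 function.

```python
import io


def upload_csv_text(client, csv_text, bucket, key, file_name):
    """Upload CSV text to S3 from memory, without touching local disk.

    Hypothetical wrapper; `client` is expected to be boto3.client("s3").
    """
    # upload_fileobj() needs a binary file-like object, so encode the text
    buffer = io.BytesIO(csv_text.encode("utf-8"))
    client.upload_fileobj(
        buffer,
        bucket,
        key,
        ExtraArgs={
            "ContentType": "text/csv",
            "ContentDisposition": f"attachment; filename={file_name}",
        },
    )
```

With the names from the example above, a call could look like upload_csv_text(client, stream.getvalue(), bucket_name, file_to_save_in_path, file_name), assuming stream is the io.StringIO returned by the stub.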