I have built a Streamlit app that takes a PDF as input. After all the processing is done, I want to save/upload the original PDF file to an S3 bucket so it can be checked in the future.
st.markdown('# Import your PDF file')
pdf = st.file_uploader(label='Drag the PDF file here. Limit 100MB')
if pdf is not None:
    text = TextExtraction.extract_text(pdf)
    # ... the rest of the script, which does nothing else with pdf ...
In the end I have:
s3 = boto3.resource(
    service_name='s3',
    region_name='ams3',
    aws_access_key_id='xxx',
    aws_secret_access_key='xxx',
)
bucket_name = 'mirai-pdf-private-stage'
print(pdf)
print(type(pdf))
pdf.seek(0)
name = 'pdf_' + str(id) + '.pdf'
print(name)
s3.Bucket(bucket_name).upload_fileobj(pdf, 'pdf_storage', name)
I get the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 563, in _run_script
    exec(code, module.__dict__)
  File "/app/main.py", line 124, in <module>
    s3.Bucket(bucket_name).upload_fileobj(pdf, 'pdf_storage', name)
  File "/usr/local/lib/python3.9/site-packages/boto3/s3/inject.py", line 678, in bucket_upload_fileobj
    return self.meta.client.upload_fileobj(
  File "/usr/local/lib/python3.9/site-packages/boto3/s3/inject.py", line 629, in upload_fileobj
    future = manager.upload(
  File "/usr/local/lib/python3.9/site-packages/s3transfer/manager.py", line 321, in upload
    self._validate_all_known_args(extra_args, self.ALLOWED_UPLOAD_ARGS)
  File "/usr/local/lib/python3.9/site-packages/s3transfer/manager.py", line 500, in _validate_all_known_args
    raise ValueError(
ValueError: Invalid extra_args key 'p', must be one of: ACL, CacheControl, ChecksumAlgorithm, ContentDisposition, ContentEncoding, ContentLanguage, ContentType, ExpectedBucketOwner, Expires, GrantFullControl, GrantRead, GrantReadACP, GrantWriteACP, Metadata, ObjectLockLegalHoldStatus, ObjectLockMode, ObjectLockRetainUntilDate, RequestPayer, ServerSideEncryption, StorageClass, SSECustomerAlgorithm, SSECustomerKey, SSECustomerKeyMD5, SSEKMSKeyId, SSEKMSEncryptionContext, Tagging, WebsiteRedirectLocation
I couldn't find anything online that does something similar or fixes my issue.
I tried changing:
s3.Bucket(bucket_name).upload_fileobj(pdf, 'pdf_storage', name)
to
s3.Bucket(bucket_name).upload_fileobj(pdf, 'pdf_storage', name, extra_arg=None)
and got an unexpected-argument error.
Thanks in advance!
CodePudding user response:
Looks like you posted this on the Streamlit forum and it was answered there. Sharing the response below.
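If you use boto3.client instead of boto3.resource, and s3.upload_fileobj instead of s3.Bucket(...).upload_fileobj, it should work. Bucket.upload_fileobj takes (Fileobj, Key, ExtraArgs), so in your call the third positional argument, name, is treated as ExtraArgs; validating a string iterates over its characters, which is why the error complains about the key 'p'. The client method takes (Fileobj, Bucket, Key) instead, so there the second argument is the bucket and the third is the object key.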
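import boto3
import streamlit as st

pdf = st.file_uploader(label="Drag the PDF file here. Limit 100MB")
if pdf is not None:
    s3 = boto3.client(
        service_name="s3",
        region_name="xxx",
        aws_access_key_id="xxx",
        aws_secret_access_key="xxx",
    )
    id = 123
    bucket_name = "xxx"
    print(pdf)
    print(type(pdf))
    pdf.seek(0)  # rewind in case the file was already read
    name = "pdf_" + str(id) + ".pdf"
    print(name)
    # client.upload_fileobj takes (Fileobj, Bucket, Key); "pdf_storage" is
    # assumed here to be a folder-like key prefix inside bucket_name.
    s3.upload_fileobj(pdf, bucket_name, "pdf_storage/" + name)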
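If you would rather keep boto3.resource, passing the arguments by keyword should avoid the mix-up as well. Below is a rough sketch that does not touch AWS at all: FakeBucket, validate_extra_args and upload_pdf are hypothetical names made up for illustration, the allow-list is shortened, and the check is only a simplified imitation of what s3transfer does. It also assumes pdf_storage was meant as a folder-like key prefix inside the bucket. It shows why a string in the ExtraArgs position fails on the letter 'p', and how naming Fileobj and Key explicitly sidesteps that.

```python
import io

# Shortened stand-in for the allow-list that s3transfer checks ExtraArgs against.
ALLOWED_UPLOAD_ARGS = ("ACL", "ContentType", "Metadata")


def validate_extra_args(extra_args, allowed=ALLOWED_UPLOAD_ARGS):
    """Simplified imitation of the key check that raised the ValueError."""
    # Iterating a dict yields its keys, but iterating a str yields single
    # characters, so a string passed as ExtraArgs fails on its first letter.
    for key in extra_args:
        if key not in allowed:
            raise ValueError(f"Invalid extra_args key '{key}'")


class FakeBucket:
    """Hypothetical in-memory stub using the parameter order of Bucket.upload_fileobj."""

    def __init__(self):
        self.objects = {}

    def upload_fileobj(self, Fileobj, Key, ExtraArgs=None):
        if ExtraArgs is not None:
            validate_extra_args(ExtraArgs)
        self.objects[Key] = Fileobj.read()


def upload_pdf(bucket, fileobj, name, prefix="pdf_storage"):
    """Store fileobj under '<prefix>/<name>', naming every argument explicitly."""
    fileobj.seek(0)  # rewind in case the file was already read
    key = f"{prefix}/{name}"
    bucket.upload_fileobj(Fileobj=fileobj, Key=key)
    return key


if __name__ == "__main__":
    bucket = FakeBucket()
    pdf = io.BytesIO(b"%PDF-1.4 dummy")
    try:
        # Same shape as the failing call: the file name lands in ExtraArgs.
        bucket.upload_fileobj(pdf, "pdf_storage", "pdf_123.pdf")
    except ValueError as err:
        print(err)
    print(upload_pdf(bucket, pdf, "pdf_123.pdf"))
```

With a real resource, the same helper could presumably be called as upload_pdf(s3.Bucket(bucket_name), pdf, name), since Bucket.upload_fileobj accepts Fileobj and Key as keyword arguments.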