Hive-Standalone-metastore = v3.1.3
Hadoop jars = v3.3.4
I have set up Hive Metastore with the eventual goal of connecting it to Trino so I can query my Parquet files in S3. I am in the Trino
CLI now and can see my hive.<schema_name>.
I now want to create a simple table so I can query it, but I am getting an exception:
trino:<MY_SCHEMA>> CREATE TABLE IF NOT EXISTS hive.<MY_SCHEMA>.<MY_TABLE> (
-> column_one VARCHAR,
-> column_two VARCHAR,
-> column_three VARCHAR,
-> column_four DOUBLE,
-> column_five VARCHAR,
-> column_six VARCHAR,
-> query_start_time TIMESTAMP)
-> WITH (
-> external_location = 's3a://<MY_S3_BUCKET_NAME>/dir_one/dir_two',
-> format = 'PARQUET'
-> );
CREATE TABLE
Query 20220924_181001_00019_bvs42 failed: Got exception: java.io.FileNotFoundException PUT 0-byte object on dir_one/dir_two: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: IDNUM123; S3 Extended Request ID: soMeLongID123=; Proxy: null), S3 Extended Request ID: soMeLongID123:404 Not Found
I tested my AWS credentials manually: I can connect to the bucket and read it, and it contains
Parquet files.
What should I check, or what could I be doing wrong? Thanks.
EDIT: adding my hive.properties
connector.name=hive
hive.metastore.uri=thrift://$HIVE_IP_ADDR:9083
hive.s3.path-style-access=true
hive.s3.endpoint=$AWS_S3_ENDPOINT
hive.s3.aws-access-key=$AWS_ACCESS_ID
hive.s3.aws-secret-key=$AWS_SECRET
hive.s3.ssl.enabled=false
Answer:
I ended up deleting the endpoint entry (hive.s3.endpoint) altogether, and it started to come to life.
The lesson: if you're following tutorials for S3 integration, most of them are NOT using S3 itself but an S3-compatible alternative like MinIO. If you're using the real S3, you do NOT set hive.s3.endpoint
at all.
Do this instead:
connector.name=hive
hive.metastore.uri=thrift://$HIVE_IP_ADDR:9083
hive.s3.path-style-access=true
hive.s3.aws-access-key=$AWS_ACCESS_ID
hive.s3.aws-secret-key=$AWS_SECRET
hive.s3.ssl.enabled=false
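
If you want a quick way to spot a leftover endpoint entry in a catalog file, here is a minimal sketch in Python. It assumes the file uses plain key=value lines like the ones above. The helper names parse_properties and find_custom_endpoint are made up for this example and are not part of Trino or Hive.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def find_custom_endpoint(props):
    """Return the hive.s3.endpoint value if one is set, else None."""
    return props.get("hive.s3.endpoint")


# Sample content mirroring the hive.properties from the question.
sample = """\
connector.name=hive
hive.metastore.uri=thrift://$HIVE_IP_ADDR:9083
hive.s3.endpoint=$AWS_S3_ENDPOINT
"""

endpoint = find_custom_endpoint(parse_properties(sample))
if endpoint is not None:
    print("hive.s3.endpoint is set to", endpoint)
```

If this reports an endpoint and you are on the real S3, that entry is the first thing I would try removing, as described above.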