I am using python to send a query to Athena and get table DDL. I am using start_query_execution
and get_query_execution
functions in the awswrangler package.
import boto3
import awswrangler as wr
import time
import pandas as pd
boto3.setup_default_session(region_name="us-east-1")
sql="show create table 'table-name'"
query_exec_id = wr.athena.start_query_execution(sql=sql, database='database-name')
time.sleep(20)
res=wr.athena.get_query_execution(query_execution_id=query_exec_id)
The code above creates a dict object that stores query results in an s3 link. The link can be accessed by
res['ResultConfiguration']['OutputLocation']
. It's a text link: s3://.....txt
Can someone help me figure how to access the output in the link. I tried using readlines() but it seemes to error out.
Here is what I did
import urllib3
target_url = res['ResultConfiguration']['OutputLocation']
f = urllib3.urlopen(target_url)
for l in f.readlines():
print (l)
Or if someone can suggest an easier way to get table DDL in python.
CodePudding user response:
Keep in mind that the returned link will time out after a short while... and make sure your credentials allow you to get the data from the URL specified. If you drop the error message here we can help you better. –
Oh... "It's a text link: s3://.....txt" is not a standard URL. You cant read that with urllib3. You can use awswrangler to read the bucket. –
I think the form is wr.s3.read_fwf(...)