How to mock Athena query results values with Moto3 for a specific table?

Time:02-03

I am using pytest and moto3 to test some code similar to this:

response = athena_client.start_query_execution(
    QueryString='SELECT * FROM xyz',
    QueryExecutionContext={'Database': myDb},
    ResultConfiguration={'OutputLocation': someLocation},
    WorkGroup=myWG
)

execution_id = response['QueryExecutionId']

# start_query_execution only returns the id; the state comes from get_query_execution
execution = athena_client.get_query_execution(QueryExecutionId=execution_id)

if execution['QueryExecution']['Status']['State'] == 'SUCCEEDED':
    response = athena_client.get_query_results(
        QueryExecutionId=execution_id
    )

    results = response['ResultSet']['Rows']
    ...etc

In my test, I need the values in results = response['ResultSet']['Rows'] to be controlled by the test. I am using code like this:

from moto.athena.models import athena_backends, QueryResults
from moto.core import DEFAULT_ACCOUNT_ID

backend = athena_backends[DEFAULT_ACCOUNT_ID]["us-east-1"]
rows = [{"Data": [{"VarCharValue": "xyz"}]}, {"Data": [{"VarCharValue": ...}, etc]}]
column_info = [
    {
        "CatalogName": "string",
        "SchemaName": "string",
        "TableName": "xyz",
        "Name": "string",
        "Label": "string",
        "Type": "string",
        "Precision": 123,
        "Scale": 123,
        "Nullable": "NOT_NULL",
        "CaseSensitive": True,
    }
]
results = QueryResults(rows=rows, column_info=column_info)
backend.query_results[NEEDED_QUERY_EXECUTION_ID] = results

but that does not work, I guess because NEEDED_QUERY_EXECUTION_ID is not known to the test in advance. How can I control it?

UPDATE

Based on a suggestion, I tried:

results = QueryResults(rows=rows, column_info=column_info)
d = defaultdict(lambda: results.to_dict())
backend.query_results = d

to force the values to be returned, but it does not seem to work, because moto3's models.AthenaBackend.get_query_results contains this code:

    results = (
        self.query_results[exec_id]
        if exec_id in self.query_results
        else QueryResults(rows=[], column_info=[])
    )
    return results

which fails because the if condition is never satisfied: the in check on a defaultdict does not invoke the default factory.
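
The behaviour is easy to confirm outside Moto, since membership tests on a defaultdict never call the default factory. A minimal sketch using only the standard library (the key and the placeholder value are made up for illustration):

```python
from collections import defaultdict

# Stand-in for the patched backend.query_results; the value is a placeholder.
query_results = defaultdict(lambda: "canned results")

# `in` calls __contains__, which does not trigger default_factory.
missing_before = "some-execution-id" in query_results

# Only subscript access (__getitem__ -> __missing__) creates the entry.
value = query_results["some-execution-id"]
present_after = "some-execution-id" in query_results
```

So the key only appears after something has indexed the dictionary, which the quoted Moto code never does for an unknown id.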

CodePudding user response:

Extending the defaultdict idea, you could create a custom dictionary that reports every execution id as present and always returns the same object:

class QueryDict(dict):
    def __contains__(self, item):
        return True

    def __getitem__(self, item):
        rows = [{"Data": [{"VarCharValue": "xyz"}]}, {"Data": [{"VarCharValue": "..."}]}]
        column_info = [
            {
                "CatalogName": "string",
                "SchemaName": "string",
                "TableName": "xyz",
                "Name": "string",
                "Label": "string",
                "Type": "string",
                "Precision": 123,
                "Scale": 123,
                "Nullable": "NOT_NULL",
                "CaseSensitive": True,
            }
        ]
        return QueryResults(rows=rows, column_info=column_info)

backend = athena_backends[DEFAULT_ACCOUNT_ID]["us-east-1"]

backend.query_results = QueryDict()
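
A variation on the same idea, sketched without Moto so it runs on its own: pass the canned results into the dictionary once instead of rebuilding them on every lookup. FixedResultsDict and lookup are hypothetical names; lookup only mimics the membership check quoted from get_query_results earlier in the thread, and a plain dict stands in for QueryResults.

```python
class FixedResultsDict(dict):
    """Hypothetical helper: reports every key as present and returns one fixed value."""

    def __init__(self, results):
        super().__init__()
        self._results = results

    def __contains__(self, key):
        # Claim to hold every execution id.
        return True

    def __getitem__(self, key):
        # Ignore the key and hand back the canned results.
        return self._results


def lookup(query_results, exec_id):
    # Mimics the check-then-index pattern quoted from Moto in the question.
    return query_results[exec_id] if exec_id in query_results else None


canned = {"rows": [{"Data": [{"VarCharValue": "xyz"}]}]}  # stand-in for QueryResults
store = FixedResultsDict(canned)
found = lookup(store, "any-execution-id")
```

Under these assumptions, the Moto version would presumably be backend.query_results = FixedResultsDict(QueryResults(rows=rows, column_info=column_info)). Note that every query then gets the same rows, regardless of which table it targets.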

CodePudding user response:

An alternative to using custom dictionaries would be to seed Moto.

Seeding Moto ensures that it will always generate the same 'random' identifiers, which means you always know what the value of NEEDED_QUERY_EXECUTION_ID is going to be.

backend = athena_backends[DEFAULT_ACCOUNT_ID]["us-east-1"]
rows = [{"Data": [{"VarCharValue": "xyz"}]}, {"Data": [{"VarCharValue": "..."}]}]
column_info = [...]
results = QueryResults(rows=rows, column_info=column_info)
backend.query_results["bdd640fb-0667-4ad1-9c80-317fa3b1799d"] = results

import requests
requests.post("http://motoapi.amazonaws.com/moto-api/seed?a=42")

# Test - the execution id will always be the same because we just seeded Moto

execution_id = athena_client.start_query_execution(...)["QueryExecutionId"]

Documentation on seeding Moto can be found here: http://docs.getmoto.org/en/latest/docs/configuration/recorder/index.html#deterministic-identifiers (It only talks about seeding Moto in the context of recording/replaying requests, but the functionality can be used on its own.)
