I am copying an entire snowflake DB into S3 to be viewed through Athena. I would like to preserve the schema/hierarchy so that the corresponding queries do not change. All the files are organized properly for this in S3 as follows
DataBase/Schema/Folder/Table/{parquet files}
When I crawl with Glue they all end up in one DB at the same level. Is it possible to have a similar folder structure in Athena?
Right now all queries in Athena are like
Select *
FROM database.table
I would like to have
Select *
FROM database.schema.folder.table
CodePudding user response:
The only logical grouping of tables available in Athena is a database
, and as you have indicated, there is no concept of hierarchy, schemas, or folders in Athena.
Database and schema comprise a namespace in Snowflake. If your intention is to simply have a similar namespace, what you can do is combine the Snowflake database d1
and schema name s1
to create a flattened logical grouping in Athena d1_s1
. Then you can do:
SELECT * FROM d1_s1.table
Also, the only special character you can have in the database name is an underscore, so there really is no other way to preserve the structure or the existing queries. At least, this way the format is close enough that it should be easy enough to programmatically fix the existing queries (e.g., using regex to replace a.b.c
with a_b.c
).
However, there will still be differences. For example, grants are managed differently for Snowflake databases and schemas. Schemas also have a concept of managed access
. This will not be possible in Athena.