Deleting a file from S3 does not delete it from Athena?

Time:01-03

When I add a file to S3 and run a query against Athena, Athena returns the expected result, including the data from this file.

If I then delete that same file from S3 and run the same query, Athena still returns the same data even though the file is no longer in S3.

Is this the expected behaviour? I thought Athena calls out to S3 on every query, but I'm now starting to think there is some sort of caching going on?

Does anyone have any ideas? I can't find any information online about this.

Thanks for the help in advance!

CodePudding user response:

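Athena does read the data files from S3 on every query, but partition metadata is kept in the Glue Data Catalog (Hive metastore) and is not refreshed automatically. If the table is partitioned and you want queries to see partitions added since the table was created, you need to run

MSCK REPAIR TABLE table_name;

to load them into the catalog. Note that this only adds partitions; it does not drop partitions whose files have been deleted.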

CodePudding user response:

Thanks for the help guys.

I was actually looking at the wrong files in S3, and the files I thought I had removed were still present. Once I deleted them from S3, the query against Athena returned the expected results immediately.

Thanks!
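For anyone who hits the same thing, it may help to check what is really under the table's S3 location before suspecting a cache. Below is a minimal sketch, assuming Python with boto3; the helper names `list_keys` and `still_present`, and the bucket, prefix and key in the usage comment, are made up for illustration and are not from the thread.

```python
def list_keys(s3_client, bucket, prefix=""):
    """Return every object key currently stored under s3://bucket/prefix."""
    keys = []
    # list_objects_v2 returns at most 1000 keys per call, so use the paginator.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when no object matches the prefix.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def still_present(s3_client, bucket, prefix, deleted_keys):
    """Return the keys from deleted_keys that still exist under the prefix."""
    existing = set(list_keys(s3_client, bucket, prefix))
    return sorted(key for key in deleted_keys if key in existing)


# Example usage (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   print(still_present(s3, "my-bucket", "data/my_table/",
#                       ["data/my_table/file-to-delete.csv"]))
```

If `still_present` returns anything, those files are still in S3 and Athena will keep reading them, which is what happened here.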
