Home > Enterprise >  Can't view Staged Parquet File in S3 from Snowflake
Can't view Staged Parquet File in S3 from Snowflake

Time:09-09

I'm working on moving some Parquet files in S3 over to Snowflake. The Storage Integration & External Stage were created just fine, and when I run the list @mystage command I can see the file that I want to check out in S3 so I know it exists & that my connection works.

When I go to do any kind of querying on the columns or table metadata, nothing is returned.

I'm following the basic documentation from this enter image description here

create stage manual_test_stage
  url = 's3://cs-snowflake-poc/manual-loads/'
  storage_integration = cs_snowflake_poc_s3;

create or replace file format parquet_format
  type = 'parquet';

select *
  from table(
    infer_schema(
      location=>'@manual_test_stage/campspot_production.Campsite/test.parquet',
      file_format=>'parquet_format'
      )
    );

-- neither query works
select *
  from table(
    infer_schema(
      location=>'@manual_test_stage',
      file_format=>'parquet_format'
      )
    );

-- end goal is to use this to just create the table for me by automatically reading the schema, but i can't do that when select array_agg returns an empty array.
create table mytable
  using template (
    select array_agg(object_construct(*))
      from table(
        infer_schema(
          location=>'@manual_test_stage/campspot_production.Campsite/test.parquet',
          file_format=>'my_parquet_format'
        )
      ));

Tried all sorts of things: I recreated the stage to not include a / at the end, I tried like 5 different file format options in case that was screwing it up, and I loaded the parquet into Python to make sure it wasn't some issue with the actual file.

If anyone can help show me what I'm missing to get this to work it'd be much appreciated !

CodePudding user response:

INFER_SCHEMA allows to define FILES parameter.

select *
  from table(
    infer_schema(
      location=>'@manual_test_stage/campspot_production.Campsite',
      file_format=>'parquet_format',
      files=>'test.parquet'
      )
    );

Additionally checking the direct access is recommeded to confirm it is not a permission issue:

SELECT t.$1 
FROM "@manual_test_stage/campspot_production.Campsite/test.parquet"
(file_format=>'parquet_format') AS t;
  • Related