Home > Software engineering >  Spark persistent view on a partition parquet file
Spark persistent view on a partition parquet file

Time:07-09

In Spark, is it possible to create a persistent view on a partitioned parquet file in Azure BLOB? The view must be available when the cluster restarted, without having to re-create that view, hence it cannot be a temp view.

I can create a temp view, but not the persistent view. Following code returns an exception.

spark.sql("CREATE VIEW test USING parquet OPTIONS (path \"/mnt/folder/file.c000.snappy.parquet\")")

ParseException: mismatched input 'USING' expecting {'(', 'UP_TO_DATE', 'AS', 'COMMENT', 'PARTITIONED', 'TBLPROPERTIES'}(line 1, pos 23)

Big thank you for taking a look :)

CodePudding user response:

The syntax you used is working for temporary views but not persistent views. I faced the same error ParseException: mismatched input 'USING' expecting {'(', 'UP_TO_DATE', 'AS', 'COMMENT', 'PARTITIONED', 'TBLPROPERTIES'}, when I tried to reproduce this using a similar syntax.

  • Using the syntax provided for the CREATE VIEW in enter image description here

    • So, you can modify your query to the following statement:
    CREATE VIEW test as select * from parquet.`/mnt/folder/file.c000.snappy.parquet\`
    
  • Related