Home > Blockchain >  Mismatched input 'x' expecting {<EOF>, ';'} when using GROUPING SETS
Mismatched input 'x' expecting {<EOF>, ';'} when using GROUPING SETS

Time:09-13

I am using Spark SQL to create grouping sets with total columns to reduce the size of a file which I have previously cubed.

My code is as follows

final_sum = sqlContext.sql('''
  SELECT case when GROUPING(x_flag)=1 then 'total' else x_flag END AS x_flag,
  ...
  from trans,
  group by x_flag,...,
  GROUPING SETS(
    (x_flag,y_name)
    ...
  )
''')

My column x_flag is indexed 0,1 (as integers) and I am trying to get grouping sets based on that column mainly.

I currently get the following:

mismatched input x_flag expecting {, ';'}(line 14, pos 11)

Is there a method to fix this without having to reflag the column as y and n or is there some error which I am missing with using integer in case when?

CodePudding user response:

I could not reproduce the error, but I have created a working example for you.

df = spark.createDataFrame(
    [(0, 0),
     (0, 1),
     (1, 0),
     (1, 1)],
    ['x_flag', 'y_name'])
df.createOrReplaceTempView("trans")

spark.sql("""
    SELECT
        CASE when GROUPING(x_flag)=1 then 'total' else x_flag END AS x_flag,
        CASE when GROUPING(y_name)=1 then 'total' else y_name END AS y_name,
        count(*)
    FROM trans
    GROUP BY
        GROUPING SETS(
            (x_flag, y_name),
            (x_flag),
            (y_name)
        )
""").show()
#  ------ ------ -------- 
# |x_flag|y_name|count(1)|
#  ------ ------ -------- 
# |     0|     0|       1|
# |     0| total|       2|
# | total|     0|       2|
# | total|     1|       2|
# |     0|     1|       1|
# |     1| total|       2|
# |     1|     0|       1|
# |     1|     1|       1|
#  ------ ------ -------- 

Make sure you use correct syntax: in your question, I have noticed a comma after the from clause.
Also, group by doesn't seem to require additional arguments when using grouping sets.

  • Related