Home > Software engineering >  Snowflake qualify query error 'Invalid data type [NUMBER(18,0)] for predicate [RANK()'
Snowflake qualify query error 'Invalid data type [NUMBER(18,0)] for predicate [RANK()'

Time:04-26

I'm trying to run a query in Snowflake that uses the rank() function along with qualify to try and take the best match between two columns (using editdistance) and I'm getting the following error:

Invalid data type [NUMBER(18,0)] for predicate [RANK() OVER (PARTITION BY key1 ORDER BY editdistance(name1, name2) ASC NULLS LAST)]

Query looks like this:

select
  key1,
  key2
from
  table
qualify
  rank() over ( 
    partition by key1 order by editdistance(name1, name2) asc
  )

I'm really not clear on what number the error is referring to. name1/name2 are both varchar, key1 is a varchar. Hoping someone might have a pointer for me.

Thanks, Jeff

CodePudding user response:

The QUALIFY works similar to WHERE,but it is evaluated at different point during query execution. QUALIFY requires a predicate, which should evaluate to boolean.

QUALIFY:

QUALIFY <predicate>

The expression:

rank() over (partition by key1 order by editdistance(name1, name2))

returns NUMBER.

Comparing the result to integer fixes the error:

rank() over (partition by key1 order by editdistance(name1, name2)) = 1 

It is worth noting that there may be tie in the data, i.e, same distance per key for more than one row, for that situation it is safer to use DENSE_RANK:

dense_rank() over (partition by key1 order by editdistance(name1, name2)) = 1 

CodePudding user response:

Dean identified the problem in his comment above so just moving it to an answer to close the loop on this:

You need to provide a value for the QUALIFY function so it should be something like: qualify rank() over ( partition by key1 order by editdistance(name1, name2) asc ) = 1

  • Related