Home > Enterprise >  Snowflake SQL Javascript possible?
Snowflake SQL Javascript possible?

Time:10-07

Say I have the following query:

Select 
 COLUMN as RAW_COLUMN
,REGEX_REPLACE(COLUMN,'REGEX_PATTERN_HERE','REPLACEMENT_STRING_HERE') as CLEAN_COLUMN
FROM TABLE

As you can see above I created a Javascript function that allows me to replace a column value using regex patterns. Below is the function:

CREATE OR REPLACE FUNCTION "REGEX_REPLACE"("SUBJECT" VARCHAR(16777216), "PATTERN" VARCHAR(16777216), "REPLACEMENT" VARCHAR(16777216))
RETURNS VARCHAR(16777216)
LANGUAGE JAVASCRIPT
AS '
  
    const p = SUBJECT;
    let regex = new RegExp(PATTERN, ''i'') 
    return p.replace(regex, REPLACEMENT);
  ';

I need to clean the data with several regex replacements. What I'm trying to do is create a dictionary that will hold the regex pattern and replacement string that will be used to loop through and execute the regex_replace() function. In other words, run REGEX_REPLACE() function inside the SQL query itself over and over again until it goes through the entire dictionary.

//Key:value rule for regex_pattern_dict={'THE PATTERN HERE':'WHAT YOU WANT TO REPLACE IT WITH'}

regex_pattern_dict ={
     '[0-9]{2}[\/\.][0-9]{2}':''                 -- remove dates i.e. "08/02"
    ,'[0-9]{3}[-][0-9]{3}[-][0-9]{4}':''         -- remove phone numbers i.e. "888-957-4675"
    ,'[ ][0-9]{11}': ''                          -- remove phone numbers i.e. " 18882467822"
     
};

In other words, in the SQL query itself I'm trying to create a loop with the Javascript REGEX_REPLACE function and dictionary (dictionary is also in javascript).

I guess what I'm trying to do is something like this:


Select 
 COLUMN as RAW_COLUMN
--,REGEX_REPLACE(COLUMN,'REGEX_PATTERN_HERE','REPLACEMENT_STRING_HERE') as CLEAN_COLUMN
,for (const [PATTERN, REPLACEMENT] of Object.entries(regex_pattern_dict)) {
  REGEX_REPLACE(COLUMN,PATTERN,REPLACEMENT);
} AS CLEAN_COLUMN
FROM TABLE




Any ideas how to approach this? Not sure how to put a javascript loop in SQL.


With the help of @David Garrison the below code was the solution:



CREATE or replace PROCEDURE TABLE_CLEAN()
RETURNS VARCHAR
LANGUAGE javascript
AS
$$

//Regex Cleaning Dictionary
const regex_pattern_dict ={

    "[0-9]{2}[\/\.][0-9]{2}":""                 // remove dates i.e. "08/02"
    ,"[0-9]{3}[-][0-9]{3}[-][0-9]{4}":""         // remove phone numbers i.e. "888-957-4675"
    ,"[ ][0-9]{11}": ""                          // remove phone numbers i.e. " 18882467822"
};

//Column to Clean
var RegexStr = "COLUMN";

//Key = Regex Pattern, Value = Replacement
for (const [key, value] of Object.entries(regex_pattern_dict)) {
   RegexStr = "REXP_REPLACE_ME("   RegexStr   ",'"   key   "','"   value   "')"

}


var rs = snowflake.execute( { sqlText:
`
  CREATE OR REPLACE VIEW  DATABASE.SCHEMA.TABLE AS
  SELECT 
    COLUMN
    ,`   RegexStr   ` as clean_column

  FROM  DATABASE.SCHEMA.TABLE_ORIGINAL
`

                            } );

  $$;


  call TABLE_CLEAN();



select * from  DATABASE.SCHEMA.TABLE




CodePudding user response:

updated with more correct syntax. Full solution is above in the edited question

var RegexStr = "COLUMN";

for (const [key, value] of Object.entries(regex_pattern_dict)) {
   RegexStr = "REXP_REPLACE_ME("   RegexStr   ",'"   key   "','"   value   "')"

}


sql = `select 
           raw_column,
           `   RegexStr   `as clean_column
       from table`

sql_statement = snowflake.createStatement({sqlText: sql });
result_set = sql_statement.execute();

Note: This may require using a procedure instead of a function.

  • Related