Home > Software design >  Alberti or Caesar cipher in SQL on VARCHAR field to mask number
Alberti or Caesar cipher in SQL on VARCHAR field to mask number

Time:05-06

I've used an Alberti cipher in SAS to mask numbers which works great on varying lengths. However, I'm having trouble doing the same task in Redshift SQL.

If I have a number all the same length, I could simply use a combination of substrings and to pick apart and Frankenstein a masked number. If my COL1 is 123456789, it would be something like:

substring(col1,9,1)||substring(col1,2,2)||substring(acct_num_full,3,1)||...

However, the column I'm trying to mask could be anywhere from 5-40 characters and so the example above would not work. I've tried lpad and rpad to standardize the lengths but it doesn't work for the shorter numbers.

Does anyone have any ideas or have done Alberti or Caesar ciphering in SQL on a varying character column? I would need an algorithm that could be reversed as well so I could decrypt.

CodePudding user response:

The translate() function can perform character substitutions - https://docs.aws.amazon.com/redshift/latest/dg/r_TRANSLATE.html

Just provide the caesar cypher as the chars_to_replace and chars_to_substitute strings.

===========================================

For example performing the caesar cypher example from wikipedia - https://en.wikipedia.org/wiki/Caesar_cipher

select 'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG'::text as pt
  into foo;

select pt,
 translate(pt,
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
           'XYZABCDEFGHIJKLMNOPQRSTUVW') as ct;

CodePudding user response:

I was able to create a work-around using SAS which allows me to copy and paste the output into my SQL statement using the technique I described above with a bunch of substring statements:

/* create scaffold of column lengths */
data temp1; 
    do j=5 to 40;
        output;
    end;
run;

/* create SQL case text and randomize order of substring syntax */
data temp2; set temp1;
    do i=1 to j;
        str="substring(col1,"||strip(put(i,best.))||",1)";
        u = rand("Uniform");
        output;
    end;
run;

/* sort of column length of randomized order var */
proc sort data = temp2;
    by j u;
run;

/* put into wide format */
proc transpose data = temp2 out=wide;
    by j;
    var str;
run;

/* concatenate all SQL substring statements */
data wide1; set wide;
    format str $2000.;
    str="when length(col1)="||strip(put(j,best.))||" then "||catx("||", of col:);
    keep str;
run;

/* output as a text file and copy and paste into your SQL select staement between case ... end */
  • Related