Home > other >  Looking to replace a string that matches a regular expression with another string that cannot be a c
Looking to replace a string that matches a regular expression with another string that cannot be a c

Time:01-11

I have a string where multiple occurrences of a substrings that match a regular expression should be replaced with another string, but it cannot be a constant string.

Here is my input string: select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable

Expected output: select col1,col2,from_unixtime(col3/1000) as col3_ts,col4,from_unixtime(col5/1000) as col5_ts,cast(col7 as string) as col7_str from mytable

My objective is only to replace cast to timestamp expression without touching other cast expressions.

Created a regular expression as below that successfully replaced two occurrences of cast(/1000 as timestamp) with the constant value "from_unixtime(abc/1000)"

re.sub(r'(cast|CAST)\s*\(\S \/1000 (as|AS) (timestamp|TIMESTAMP)\)', 'from_unixtime(abc/1000)',str)

Above function produces this result: select col1,col2,from_unixtime(abc/1000) as col3_ts,col4,from_unixtime(abc/1000) as col5_ts,cast(col7 as string) as col7_str from mytable

Is there a way to ensure that "abc" is replaced by the right column name that is within CAST function occurrences ?

Thanks!

CodePudding user response:

You may try the following find and replace, in regex mode (case insensitive):

Find:    \bcast\((\S ) as timestamp\)
Replace: from_unixtime($1)

Demo

CodePudding user response:

I used perl to illustrate:

my $sql = 'select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable';

print "Original: \n", $sql, "\n";

$sql =~ s/(?:cast|CAST)(\s*\(\S )\/1000 (?:as|AS) (?:timestamp|TIMESTAMP)\)/from_unixtime($1\/1000)/g;


print "New: \n", $sql, "\n";

Essentially you need to use a capture group to get the part you care about (\s*(\S ). Perl assigns that to $1 and you use that in the replacement text. Your chosen language my differ.

perl tim.pl

Original: 
select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable
New: 
select col1,col2,from_unixtime((col3/1000) as col3_ts,col4,from_unixtime((col5/1000) as col5_ts,cast(col7 as string) as col7_str from mytable

I may have an extra ( in there but you get the idea.

I also changed your groupings to non-capturing groupings (?:) as I wanted to ensure the replacement was $1. First time I have used them though so...

  •  Tags:  
  • Related