I have a string where multiple occurrences of a substrings that match a regular expression should be replaced with another string, but it cannot be a constant string.
Here is my input string:
select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable
Expected output:
select col1,col2,from_unixtime(col3/1000) as col3_ts,col4,from_unixtime(col5/1000) as col5_ts,cast(col7 as string) as col7_str from mytable
My objective is only to replace cast to timestamp expression without touching other cast expressions.
Created a regular expression as below that successfully replaced two occurrences of cast(/1000 as timestamp) with the constant value "from_unixtime(abc/1000)"
re.sub(r'(cast|CAST)\s*\(\S \/1000 (as|AS) (timestamp|TIMESTAMP)\)', 'from_unixtime(abc/1000)',str)
Above function produces this result:
select col1,col2,from_unixtime(abc/1000) as col3_ts,col4,from_unixtime(abc/1000) as col5_ts,cast(col7 as string) as col7_str from mytable
Is there a way to ensure that "abc" is replaced by the right column name that is within CAST function occurrences ?
Thanks!
CodePudding user response:
You may try the following find and replace, in regex mode (case insensitive):
Find: \bcast\((\S ) as timestamp\)
Replace: from_unixtime($1)
Demo
CodePudding user response:
I used perl to illustrate:
my $sql = 'select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable';
print "Original: \n", $sql, "\n";
$sql =~ s/(?:cast|CAST)(\s*\(\S )\/1000 (?:as|AS) (?:timestamp|TIMESTAMP)\)/from_unixtime($1\/1000)/g;
print "New: \n", $sql, "\n";
Essentially you need to use a capture group to get the part you care about (\s*(\S ). Perl assigns that to $1 and you use that in the replacement text. Your chosen language my differ.
perl tim.pl
Original:
select col1,col2,cast(col3/1000 AS timestamp) as col3_ts,col4,CAST(col5/1000 as TIMESTAMP) as col5_ts,cast(col7 as string) as col7_str from mytable
New:
select col1,col2,from_unixtime((col3/1000) as col3_ts,col4,from_unixtime((col5/1000) as col5_ts,cast(col7 as string) as col7_str from mytable
I may have an extra ( in there but you get the idea.
I also changed your groupings to non-capturing groupings (?:) as I wanted to ensure the replacement was $1. First time I have used them though so...