I'm working on a regular expression in Talend, inside a tReplace component.
I'm moving data from Oracle to Redshift and I'm running into issues with DDL length, I guess because some characters are not supported.
I have product names like:
175/65 R14 Efficiency
XXX N° 5 H7DC
They have to stay like this. But sometimes I have an NBSP inside a label, or sometimes even worse.
I saw this list of punctuation online: [!"#$%&'()* ,-./:;<=>?@[\]^_{|}~°]
and I need to add it to my existing regex "[^A-Za-z0-9]".
TL;DR: Can someone help me write a regex that removes everything in a column except [A-Za-z0-9] and the punctuation list above? It must be usable in the following code (I'm using Talend, so the expression is interpreted as Java):
StringUtils.replaceAll(row1.label, "[^A-Za-z0-9]", "");
CodePudding user response:
I ended up finding the solution thanks to the help of the answers above.
I used:
[^\p{Alnum}\p{Punct}\s]
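
For anyone who wants to try this outside Talend, here is a minimal sketch in plain Java. The class name `LabelCleaner` and the method `clean` are made up for the example; they are not Talend routines. One thing to watch: without the `UNICODE_CHARACTER_CLASS` flag, `\p{Alnum}`, `\p{Punct}` and `\s` only cover ASCII in `java.util.regex`, so the NBSP is removed as wanted, but the `°` in `N°` would be removed too. The sketch therefore adds `\u00B0` to the class, which is my own addition to the pattern above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: strips every character that is not
// an ASCII letter/digit, ASCII punctuation, ASCII whitespace or the degree sign.
class LabelCleaner {
    // \u00B0 (the degree sign) is added explicitly because \p{Punct} is ASCII-only by default.
    private static final Pattern UNWANTED =
            Pattern.compile("[^\\p{Alnum}\\p{Punct}\\s\\u00B0]");

    static String clean(String label) {
        if (label == null) {
            return null; // pass nulls through untouched
        }
        return UNWANTED.matcher(label).replaceAll("");
    }

    public static void main(String[] args) {
        // \u00A0 is an NBSP; it is not matched by \s here, so it gets removed.
        System.out.println(clean("175/65\u00A0R14 Efficiency"));
        System.out.println(clean("XXX N\u00B0 5 H7DC"));
    }
}
```

Note that removing the NBSP glues the neighbouring words together. If you would rather keep a gap, replace `\u00A0` with a normal space before applying the pattern.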