This is my code:
%spark.pyspark
jdbc_write(spark, spark.sql("""
    SELECT
        Global_Order_Number__c
        , Infozeile__c
    FROM STAG.SF_CASE_TRANS
"""), JDBC_URLS['xyz_tera_utf8'], "DEV_STAG.SF_CASE", "abc", "1234")
I want to exclude every character in the Infozeile__c field that is not a-z, A-Z, or 0-9.
Is there any function which is able to do this?
CodePudding user response:
Apply regexp_replace() to the column in your query:
regexp_replace(Infozeile__c, '[^a-zA-Z0-9]', '') as Infozeile__c
The regex [^a-zA-Z0-9] is a negated character class, meaning any character not in the ranges given. The replacement is an empty string, effectively deleting each matched character.
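Slotted into your existing query (keeping your jdbc_write call and connection arguments as they are), it would look roughly like this:

%spark.pyspark
jdbc_write(spark, spark.sql("""
    SELECT
        Global_Order_Number__c
        , regexp_replace(Infozeile__c, '[^a-zA-Z0-9]', '') AS Infozeile__c
    FROM STAG.SF_CASE_TRANS
"""), JDBC_URLS['xyz_tera_utf8'], "DEV_STAG.SF_CASE", "abc", "1234")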
If you're expecting lots of characters to be replaced like this, it would be a bit more efficient to add a +, which means "one or more", so whole runs of undesirable characters are removed at a time:
regexp_replace(Infozeile__c, '[^a-zA-Z0-9]+', '') as Infozeile__c
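If you'd rather leave the SQL untouched, you can do the same cleanup on the DataFrame side with pyspark.sql.functions.regexp_replace. A minimal sketch, assuming your jdbc_write helper accepts any DataFrame in place of the spark.sql(...) result:

from pyspark.sql import functions as F

# Read the staging table, then strip everything outside a-z, A-Z, 0-9 from Infozeile__c
df = (spark.table("STAG.SF_CASE_TRANS")
      .select(
          "Global_Order_Number__c",
          F.regexp_replace(F.col("Infozeile__c"), "[^a-zA-Z0-9]+", "").alias("Infozeile__c"),
      ))

jdbc_write(spark, df, JDBC_URLS['xyz_tera_utf8'], "DEV_STAG.SF_CASE", "abc", "1234")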