I am working on pulling data from a dataset where two columns, old_value and new_value should NOT be equal. But I am getting false rows being fetched.
For Example:
old_value | new_value |
---|---|
ABC | ABC |
This row should not be fetched but it was there. And upon figuring out why, it showed below:
old_value | new_value | len(old_value) | len(new_value) |
---|---|---|---|
ABC | ABC | 4 | 3 |
I am not sure why it showed "4" for old_value. I did try replacing white space with blank, new line character, regexp_replace, but none worked.
Please suggest how to find and remove this invisible character. Thank you!
CodePudding user response:
you can use regular expresiions to remove non ascii charters
select regexp_replace(old_value, '[^[:ascii:]]', '')
the pattern you must adapt to your data