Home > Enterprise >  ways to check for invalid characters? oracle sql
ways to check for invalid characters? oracle sql

Time:09-01

Looking for ways to filter out special signs, letters etc. from studentID in oracle SQL and show those records as invalid.

What is the best way to filter out letters a-Z and other characters? (only leaving numbers)

SELECT replace(Translate(studentid,'a-Z<>!-\ =/&', '??'),'?','') as StudentID, 'Invalid Characters in Student ID' 
FROM students

CodePudding user response:

The simplest approach is to use regular expressions. For example

select studentid
from   student
where  regexp_like(studentid, '\D')
;

\D means non-digit character; if the studentid contains at least one such character, in any position, then it will appear in the output. Note that null will not be flagged out, assuming it may appear in the column; perhaps the column is primary key in which case it can't be null. But this would apply to other tables as well, where studentid may be null.

If you have a very large table, or if you must perform this check often, you may want a less simple, but better performing query. Then you would want to use standard string functions, like you were trying to. Something like this will work:

select studentid
from   student
where  translate(studentid, 'x0123456789', 'x') is not null
;

translate will translate x to itself, and all digits to null (that is, all digits will be removed). The x trick is needed because the last argument must not be null. If the translation doesn't remove all characters from the string, then the studentid will appear in the output, as required.

If you need to show exactly which characters are non-digits (although that should be obvious), you can add the result of translate to the select clause. Note though that if a student id has, for example, trailing spaces, that will not be evident either from looking at the student id or at the result of translate. You may want to add something like dump(studentid) to select; if you are not familiar with dump, you may want to read a bit about it - it is extremely useful in diagnosing such problems, and easy to learn.

Once you find and handle all the exceptions, you may want to add a constraint to the column, to require all student id's to consist entirely of digits. Then you won't have to put up with this kind of errors anymore.

CodePudding user response:

If you want to allow numbers only, column datatype should have been NUMBER, not VARCHAR2.

[EDIT] That's wrong, though - see @mathguy's comment about it, saying that there are situations where values do consist of digits only, but - due to leading zeros - you can't use the NUMBER datatype.


A simple option is to use regexp_like and return rows that contain anything but digits:

SQL> with students (studentid) as
  2    (select '12345' from dual union all
  3     select 'ABC12' from dual union all
  4     select '23x#2' from dual
  5    )
  6  select studentid
  7  from students
  8  where not regexp_like(studentid, '^\d $');

STUDE
-----
ABC12
23x#2

SQL>

CodePudding user response:

You could also use below solution taking advantage of translate function.

  select studentid
  from students
  WHERE translate(studentid, '`0123456789', '`') IS NOT NULL
  ;

demo

  • Related