Home > database >  Remove all Non-Numeric Chars from String
Remove all Non-Numeric Chars from String

Time:11-01

I would like to truncate all characters in a column, no matter where they are.

Example: " 49123/4567890(testnumber)"

Should be changed to "491234567890"

Is there a way without doing a replace for each char?

I have tried to replace it with several, but it is very time-consuming.

CodePudding user response:

As you mentioned, if you are expecting only [a-zA-z()/ ], you can use the translate function which is available from 2017

declare @table TABLE (str varchar(max))
insert into @table
select ' 49123/4567890(estnumber)'

select replace(translate(str, '/ ()abcdefghijklmnopqrstuvwxyz', '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'), '~', '') digits
from @table

For more complex scenarios where the characters are not known, you can try using recursive CTE on a string column to extract only digits like following query.

;with cte
as (
    select v.txt originalstring
        ,v.txt
        ,convert(varchar(max), '') as digits
        ,1 as lev
    from (
        values (' 49123/4567890(testnumber)')
            ,(' 234&*#$%!@#')
        ) v(txt)
    
    union all
    
    select originalstring
        ,stuff(txt, 1, 1, '')
        ,(
            case 
                when left(txt, 1) LIKE '[0-9]'
                    then digits   left(txt, 1)
                else digits
                end
            )
        ,lev   1
    from cte
    where txt > ''
    )
select originalstring
    ,digits
from (
    select c.originalstring
        ,c.digits
        ,row_number() over (partition by c.originalstring order by lev desc
            ) rn
    from cte c
    ) t
where rn = 1

Output

originalstring                    digits
---------------                   --------
 234&*#$%!@#                      234
 49123/4567890(testnumber)        491234567890

CodePudding user response:

A set-based option that exists in SQL Server 2017 is to utilise translate.

You can hopefully adapt the following to your specific use-case:

select col, Replace(Translate(col, r, Replicate('*', Len(r))), '*', '') Newcol
from t
cross apply(values(' ABCDEFGHIJKLMNOPQRSTUVWXYZ/\ ()'))r(r);

Example DB<>Fiddle

CodePudding user response:

Instead of hardcoding the list of "bad" characters you can use a double TRANSLATE to first get the unwanted characters and then plug that back into TRANSLATE.

DECLARE @table TABLE
  (
     str VARCHAR(max)
  )

INSERT INTO @table
SELECT ' 49123/4567890(testnumber)    '

DECLARE @CharactersToKeep VARCHAR(30) = '0123456789'

SELECT REPLACE(TRANSLATE(str, bad_chars, REPLICATE('X', LEN(bad_chars   'X') - 1)), 'X', '')
FROM   @table
       CROSS APPLY (SELECT REPLACE(TRANSLATE(str, @CharactersToKeep, REPLICATE(LEFT(@CharactersToKeep, 1), LEN(@CharactersToKeep))), LEFT(@CharactersToKeep, 1), '')) ca(bad_chars) 
  • Related