this is my first time posting so hopefully I get all the information needed across correctly.
I'm attempting to do an update replace statement on a row where there is a set of characters but a wildcard statement won't work, below is a select statement example
SELECT replace(column, '[aA][bB][cC]-', '')
FROM table
where column like '%[aA][bB][cC]-%'
an example row would be "abc-123 aBc-123 abC-abc Def-123", but I need it to return "123 123 abc"
this is a basic example, but I'm trying to get rid of a set of 3 characters and a "-" character anywhere in a string. the abc could change to def but the "-" character will always come after.
I've done some googling and can't find an appropriate solution as most examples will only remove one example of abc- where I need to get rid of all versions. I'm running version 12.0 of sql server (SQL Server 2014) so I think some functions I wouldn't be able to use.
I think the closest example I could find was Using Wildcard For Range of Characters In T-SQL but I can't use the translate function.
Edit: below is an example of a created table
CREATE TABLE String_Removal_Example(
someValue VARCHAR(100)
)
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123')
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123 ABC-123')
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123 ABc-123 123-ABC DEF-123')
select statement brings back
someValue
abc-123
abc-123 ABC-123
abc-123 ABc-123 123-ABC DEF-123
Edit2: If this isn't possible to do in this manner, a possible alternative would be to remove 3 characters and the - character. I've tried the below
select Substring (someValue, Charindex( '-', someValue ) 1, Len(someValue)) From String_Removal_Example
but this returns the below, which is only affecting the first instance of nnn-.
123
123 ABC-123
123 ABc-123 123-ABC DEF-123
Edit3: The string I need to replace is nnn- for clarification. I was trying for the [aA][bB][cC]- format in case I needed to change it. It will always be 3 characters followed by a "-" Character
CodePudding user response:
You can create a function like this:
(I named it ThreeLetters
, but what's in a name...)
CREATE FUNCTION [dbo].[ThreeLetters]
(
@p0 VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @p1 VARCHAR(20) = '%[A-Z][A-Z][A-Z]-%';
DECLARE @Result VARCHAR(MAX) = @p0;
DECLARE @pos INT = PATINDEX(@p1,@Result COLLATE SQL_Latin1_General_CP1_CS_AS);
DECLARE @pos2 INT ;
DECLARE @i INT =0;
WHILE @pos>0 and @i<10
BEGIN
SET @pos2 = CHARINDEX('-',@Result,@pos)-@pos;
SELECT @Result = SUBSTRING(@Result,1,@pos-1) SUBSTRING(@Result, @pos @pos2 1, len(@Result));
SET @pos = PATINDEX(@p1,@Result COLLATE SQL_Latin1_General_CP1_CS_AS);
SET @i = @i 1;
END
RETURN @Result
END
output with your sample data:
someValue | (No column name) |
---|---|
abc-123 | 123 |
abc-123 ABC-123 | 123 123 |
abc-123 ABc-123 123-ABC DEF-123 | 123 123 123-ABC 123 |
see DBFIDDLE
P.S. The check for @i
was introduced while debugging. It might not be needed, but was tooo lazy to remove it.
P.P.S I am removing ThreeLetters followed by a -
, that is why 123-ABC
remains unchanged.
CodePudding user response:
Here's an XML trick that'll also work in Sql Server 2014
(In Sql Server 2016 and beyond, the string_split function would be better.)
declare @String_Removal_Example table ( someValue varchar(100) not null ); insert into @String_Removal_Example (someValue) VALUES ('abc-123') , ('abc-123 ABC-123') , ('abc-123 ABC-123 123-Abc DEF-123'); select ca.value as someValue from @String_Removal_Example t cross apply ( select rtrim(( select node.n.value('.','varchar(30)') ' ' from ( select cast('<x><a>' replace(replace(t.someValue, ' ', '</n><a>'), '-', '</a><n>') '</n></x>' as xml) as x ) q cross apply x.nodes ('/x/n') AS node(n) for xml path('') )) as value ) ca where lower(t.someValue) like '%[a-z0-9]-[a-z0-9]%'; GO
| someValue | | :-------------- | | 123 | | 123 123 | | 123 123 Abc 123 |
db<>fiddle here
CodePudding user response:
Please try the following solution. It is a variation of @LukStorms solution.
It is based on XML and XQuery. Their data model is based on ordered sequences.
XML allows us to tokenize the input string.
It is looking for 'nnn-' strings and filters them out.
For example, 2nd row is converted to XML like this:
<root>
<r>abc-</r>
<r>123</r>
<r>ABC-</r>
<r>123</r>
</root>
SQL
-- DDL and sample data population, start
declare @String_Removal_Example table (
someValue varchar(100) not null
);
insert into @String_Removal_Example (someValue) VALUES
('abc-123')
, ('abc-123 ABC-123')
, ('abc-123 ABC-123 123-Abc DEF-123');
-- DDL and sample data population, end
SELECT someValue AS [before], [after]
FROM @String_Removal_Example
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA['
REPLACE(REPLACE(someValue, '-', '- '), SPACE(1), ']]></r><r><![CDATA[')
']]></r></root>' AS XML)
.query('data(/root/r[not(contains(text()[1],"-"))])')
.value('.','VARCHAR(255)')) AS t1(after);
Output
--------------------------------- -----------------
| before | after |
--------------------------------- -----------------
| abc-123 | 123 |
| abc-123 ABC-123 | 123 123 |
| abc-123 ABC-123 123-Abc DEF-123 | 123 123 Abc 123 |
--------------------------------- -----------------
CodePudding user response:
SELECT replaceAll(column, '[aA][bB][cC]-', '') FROM table where column like '%[aA][bB][cC]-%'
CodePudding user response:
This is basically dividing up the string where there are spaces, then looking for "words" that have a '-' in the 4th position, removing the prefix and then stitching it back together:
SELECT s.someValue,
STUFF((SELECT ' ' SUBSTRING(value, 5, LEN(value) - 4)
FROM STRING_SPLIT(someValue, ' ')
WHERE CHARINDEX('-', value) = 4
FOR XML PATH('')),1,1,'') a
FROM String_Removal_Example s
CodePudding user response:
As there is no regex replace in SQL Server, I think the only way you can reliably use a pattern replace is through an iterative process. One such way is with a recursive CTE:
WITH Data AS
( SELECT SomeValue
FROM (VALUES
('abc-123'),
('abc-123 ABC-123'),
('abc-123 aBc-123 abC-abc Def-123'),
('abc-123 ABCD-123'),
('abc-123 A$C-123')
) x (SomeValue)
), RegexReplace AS
( SELECT SomeValue, NewValue = CONVERT(VARCHAR(100), Data.SomeValue), Level = 1
FROM Data
UNION ALL
SELECT SomeValue, CONVERT(VARCHAR(100), STUFF(NewValue, PATINDEX('% [A-z][A-z][A-z]-%', CONCAT(' ', NewValue)), 4, '')), Level 1
FROM RegexReplace AS d
WHERE PATINDEX('% [A-z][A-z][A-z]-%', CONCAT(' ', NewValue)) > 0
), RankedData AS
( SELECT *, RowNumber = ROW_NUMBER() OVER(PARTITION BY rr.SomeValue ORDER BY rr.Level DESC)
FROM RegexReplace AS rr
)
SELECT rd.SomeValue, rd.NewValue
FROM RankedData AS rd
WHERE rd.RowNumber = 1;
Which gives:
SomeValue | New Value |
---|---|
abc-123 | 123 |
abc-123 ABC-123 | 123 123 |
abc-123 aBc-123 abC-abc Def-123 | 123 123 abc 123 |
abc-123 A$C-123 | 123 A$C-123 |
abc-123 ABCD-123 | 123 ABCD-123 |
This basically uses PATINDEX()
to find the first instance of the pattern, then STUFF()
to remove the 4 characters after the index found. This process then repeats until the PATINDEX()
fails to find a match.
Probably also of note is that when doing the PATINDEX()
I am adding a space to the start of the string, and including a space at the start of the pattern. This ensures that it is limited to only 3 characters followed by a hyphen, if it is more than 3 (e.g. ABCD-123) the entire string is left intact rather than leaving A123
as shown in the last row above. Adding the space to the start of the string being searched ensures that the pattern is still found even if it is at the start of the string.