I have a column that store user input text field from a frontend website. User can input any kind of text in it, but they will also put in a specific alphabets combination to represent a job type - for example 'dri'. As an example:
Row 1: P49384; Open vehicle bonnet-BO-dri 22/10
Row 2: P93818; Vehicle exhaust-BO 10/20
Row 3: P1933; battery dri-pu-103/2
Row 4: P3193; screwdriver-pu 423
Row 5: X939; seats bo
Row 6: P9381-vehicle-pu-bo dri
In this case, I will like to filter only rows that contain dri
. From the example, you can see the text can be in any order (user behaviour, they will key whatever they like without following any kind of format). But the constant is that for a particular job type, they will put in dri
.
I know that I can simply use LIKE
in SQL Server to get these rows. Unfortunately, row 4 is included inside when I use this operator. This is because screwdriver contains dri
.
Is there any way in SQL Server I can do to strictly only obtain rows that has dri
job type, while excluding words like screwdriver?
I tried to use PATINDEX
but it failed too - PATINDEX('%[d][r][i]%', column) > 0
Thanks in advance.
CodePudding user response:
Your data is the problem here. Unfortunately even for denormalised data it doesn't appear to have a reliable/defined format, making parsing your data in a language like T-SQL next to impossible. What problems are there? Based on the original sample data, at a glance the following problems exist:
- The first data value's delimiter isn't consistent. Rows 1-5 use a semicolon (
;
), but row 6 uses a hyphen (-
) - The last data value's delimiter isn't consistent. Row 1, 2 & 4 use a space (
-
). - Internal data doesn't use a consistent delimiter. For example:
- Row 1 has a the value
Open vehicle bonnet-BO-dri
, which appears to be the valuesOpen vehicle bonnet
,BO
anddri
; so the hyphen(-
) is the delimiter. - Row 5 has
seats bo
, which appears to be the valuesseats
andbo
, so uses a space ( - The fact that row 6 has
vehicle
as its own value (vehicle-pu-bo-dri
), however, implies thatOpen vehicle bonnet
andVehicle Exhaust
(on rows 1 and 2 respectively) could actually be the valuesOpen
,vehicle
, &bonnet
andVehicle
&Exhaust
respectively.
- Row 1 has a the value
Honestly, the solution is to fix your design. As such, your tables should likely look something like this:
CREATE TABLE dbo.Job (JobID varchar(6) CONSTRAINT PK_JobID PRIMARY KEY NONCLUSTERED, --NONCLUSTERED Because it's not always ascending
YourNumericalLikeValue varchar(5) NULL); --Obviously use a better name
CREATE TABLE dbo.JobTypeCompleted(JobTypeID int IDENTITY (1,1) CONSTRAINT PK_JobTypeID PRIMARY KEY CLUSTERED,
JobID varchar(6) NOT NULL CONSTRAINT FK_JobType_Job FOREIGN KEY REFERENCES dbo.Job (JobID),
JobType varchar(30) NOT NULL); --Must likely this'll actually be a foreign key to an actual job type table
GO
Then, for a couple of your rows, the data would be inserted like so:
INSERT INTO dbo.Job (JobID, YourNumericalLikeValue)
VALUES('P49384','22/10'),
('P9381',NULL);
GO
INSERT INTO dbo.JobTypeCompleted(JobID,JobType)
VALUES('P49384','Open vehicle bonnet'),
('P49384','BO'),
('P49384','dri'),
('P9381','vehicle'),
('P9381','pu'),
('P9381','bo'),
('P9381','dri');
Then you can easily get the jobs you want with a simple query:
SELECT J.JobID,
J.YourNumericalLikeValue
FROM dbo.Job J
WHERE EXISTS (SELECT 1
FROM dbo.JobTypeCompleted JTC
WHERE JTC.JobID = J.JobID
AND JTC.JobType = 'dri');
CodePudding user response:
You can apply like operator in your query as column_name like '%-dri'. It means find out records that end with "-dri"