My dataset that I am using to create my SQL database has datapoints that are formatted as YYYY-MM, but as far as I have found there isn't a format for SQL in order to store that specifically. I am using Python and pyodbc to create a script that builds this database and updates it.
Is there any solution for this datatype within SQL that I am missing, or perhaps some way to solve this problem using the tools available in Pandas or something in Python?
CodePudding user response:
If you can't fix the python source to output a date instead of part of a date, we can handle this a different way.
Let's say your real table looks like this:
CREATE TABLE dbo.PythonInput
(
SomeKey int CONSTRAINT PK_pi PRIMARY KEY,
EventDate date
);
Create a staging table that mimics it, except that the date column is actually a string:
CREATE TABLE dbo.Staging_PythonInput
(
SomeKey int CONSTRAINT PK_st_pi PRIMARY KEY,
EventDate char(7)
);
Now you can have a trigger that just moves the data to your real table after massaging the date column:
CREATE TRIGGER dbo.CorrectBadPythonInput
ON dbo.Staging_PythonInput
INSTEAD OF INSERT
AS
INSERT dbo.PythonInput
(
SomeKey,
EventDate
)
SELECT
i.SomeKey,
TRY_CONVERT(date, i.EventDate '-01', 120)
FROM inserted AS i;
Then tell Python to insert into your staging table instead, e.g.:
INSERT dbo.Staging_PythonInput(SomeKey, EventDate)
VALUES(1, '2022-07'), (2, '2022-03'),
(3, '2019-05'), (4, '0123-67');
Now a query against the source table:
SELECT SomeKey, EventDate FROM dbo.PythonInput;
Yields:
SomeKey | EventDate |
---|---|
1 | 2022-07-01 |
2 | 2022-03-01 |
3 | 2019-05-01 |
4 | null |
- Example db<>fiddle
If you want to prevent the row with the invalid date from getting in, you can just add this to the trigger:
WHERE TRY_CONVERT(date, i.EventDate '-01', 120) IS NOT NULL;
(And then you could run a different insert into a logging table if you want to track rows that were rejected.)
CodePudding user response:
MS SQL Server supports XML data type, XQuery, XPath, and XSD.
XML has 49 data types. One of them is xs:gYearMonth
.
It holds data in the following format: yyyy-MM. Exactly what we need.
First, we would need to create a UDF function. Otherwise, we will get the following error:
Xml data type methods are not supported in check constraints. Create a scalar user-defined function to wrap the method invocation. The error occurred at table "".
After that we will use that UDF in a regular check constraint.
To run a negative test, just uncomment a commented line in the INSERT
statement.
SQL
USE tempdb;
GO
/*
DROP FUNCTION IF EXISTS dbo.CheckYearMonthXML;
GO
CREATE FUNCTION dbo.CheckYearMonthXML(
@yearMonth AS CHAR(7)
)
RETURNS BIT
AS
BEGIN
DECLARE @IsValid AS BIT
IF TRY_CAST('<r><![CDATA[' @yearMonth ']]></r>' AS XML)
.value('(/r/text())[1] cast as xs:gYearMonth?','CHAR(7)') IS NOT NULL
SET @IsValid = 1
ELSE
SET @IsValid = 0
RETURN @IsValid
END
GO
-- test
DECLARE @var CHAR(7) = '2022-12';
SELECT dbo.CheckYearMonthXML(@var);
*/
DROP TABLE IF EXISTS dbo.tbl;
GO
CREATE TABLE dbo.tbl (
ID INT IDENTITY PRIMARY KEY,
yearMonth CHAR(7),
CONSTRAINT CK_tbl_yearMonth CHECK (dbo.CheckYearMonthXML(yearMonth) = 1)
);
INSERT INTO dbo.tbl (yearMonth) VALUES
('2022-12'),
--('2022-22'),
('2022-08');
SELECT * FROM dbo.tbl;