Home > Software design >  SQL Server or Python adding a column to keep new insert as 0 meanwhile make existing table auto incr
SQL Server or Python adding a column to keep new insert as 0 meanwhile make existing table auto incr

Time:03-30

I've been searching many similar posts here but they seem not to meet my requirement to use SQL Server IDENTITY.

I have a dataframe extracted from another data source which might include duplicate Name.

Dataframe

Date Time Name Result
3/28/2022 7:00:00 abc 0.23
3/28/2022 7:00:00 abc 0.19
3/28/2022 7:05:00 def 0.36
3/28/2022 7:10:00 ghi 0.29

When I load to SQL Server, I need to create a Retest_Count column in order to indicate the test order. If the Name is duplicated and same timestamp, always set the newer one to be 0 and the older one to be 1 based on the 'original data order'. Like this:

SQL Table 1

Date Time Name Result Retest_Count
3/28/2022 7:00:00 abc 0.23 1
3/28/2022 7:00:00 abc 0.19 0
3/28/2022 7:05:00 def 0.36 0
3/28/2022 7:10:00 ghi 0.29 0

When I insert new data with same Name but different timestamp, always give 0 to the new data and the existing table need to get auto increment like this:

SQL Table 2

Date Time Name Result Retest_Count
3/28/2022 7:00:00 abc 0.23 2
3/28/2022 7:00:00 abc 0.19 1
3/29/2022 13:00:00 abc 0.18 0
3/28/2022 7:05:00 def 0.36 0
3/28/2022 7:10:00 ghi 0.29 1
3/29/2022 21:05:00 ghi 0.25 0

Is this possible to achieve by SQL Server script? Python solution is considered also.

CodePudding user response:

As I mentioned in the comments, this is far better suited for a VIEW. You will need to ensure you have some kind of column in your table that always increases (such as an IDENTITY) to achieve this though (as I note you have 2 tests that occur at the same time for the same Name, so there isn't a way to denote which is "first").

Then you can create a VIEW like this:

CREATE VIEW dbo.YourView_RetestCounts AS
    SELECT [Date],
           [Time],
           [Name],
           Result,
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY IdentityColumn DESC) - 1 AS Retest_Count
    FROM dbo.YourTable;

If, however, your goal is to filter to the latest row, as your comments suggest, then use a VIEW with a Get top 1 row of each group solution:

CREATE VIEW dbo.YourView_LatestTest AS
    WITH CTE AS(
        SELECT [Date],
               [Time],
               [Name],
               Result,
               ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY IdentityColumn DESC) AS Retest_Count
        FROM dbo.YourTable)
    SELECT [Date],
           [Time],
           [Name],
           Result
    FROM CTE
    WHERE Retest_Count = 1;
  • Related