Home > database >  MSSQL aggregate (count) one column but show string based on different column from last matching row
MSSQL aggregate (count) one column but show string based on different column from last matching row

Time:11-10

I have a table such as below. The code never changes for each type of item, but the description may have been entered inconsistently. There is no way to go back and fix old data due to regulatory requirements in our industry, but we have fixed our input form so users cannot enter incorrect descriptions moving forward. In other words, the most recent descriptions are the correct descriptions that should be used.

I want to count how many times each code appears, but I want it to be shown as the most recent description.

id code description other stuff
1 09G jacket blah blah
2 270 pants blah blah
3 13B t-shirt blah blah
4 09G coat blah blah
5 09G sweater blah blah
6 13B shirt blah blah

Here is my current code:

SELECT
  TOP 10 code, count(*)
FROM
  table
GROUP BY
  code
ORDER BY
  count(*) DESC

Which gives me:

code value
09G 3
13B 2
270 1

This is correct, but instead of the code, I would like it to show the most recently entered description - such as:

code value
sweater 3
shirt 2
pants 1

The following code will get me close, but it returns the string that is "last" in alphabetical order, not necessarily the last entered (aka, the matching description with the largest id):

SELECT
  MAX(description) as description, count(*)
FROM
  table
GROUP BY
  code
ORDER BY
  count(*) DESC

CodePudding user response:

Assuming "most recent" means "highest id":

;WITH cte AS 
(
  SELECT description, 
         value = COUNT(*) OVER (PARTITION BY code), 
         rn    = ROW_NUMBER() OVER (PARTITION BY code ORDER BY id DESC)
  FROM dbo.[table]
)
SELECT description, value
  FROM cte
  WHERE rn = 1
  ORDER BY value DESC;

Results:

description value
sweater 3
shirt 2
pants 1

CodePudding user response:

Here is an approach using 2 subqueries joined together, but I like the common table expression approach better.

DECLARE @Data AS TABLE
(
    [id]          INT         NOT NULL,
    [code]        CHAR(3)     NOT NULL,
    [description] VARCHAR(10) NOT NULL
);

INSERT INTO @Data
(
    [id],
    [code],
    [description]
)
VALUES
(1, '09G', 'jacket'),
(2, '270', 'pants'),
(3, '13B', 't-shirt'),
(4, '09G', 'coat'),
(5, '09G', 'sweater'),
(6, '13B', 'shirt');


SELECT [descriptions].[description], [counts].[count]
FROM
       (
           SELECT   [code], COUNT(*) AS [count]
           FROM     @Data AS [d]
           GROUP BY [code]
       ) AS [counts]
       INNER JOIN
       (
           SELECT [ordered].[code], [ordered].[description]
           FROM
                  (
                      SELECT [code],
                             [description],
                             ROW_NUMBER() OVER (PARTITION BY [code]
                                                ORDER BY [id] DESC
                                               ) AS [rn]
                      FROM   @Data
                  ) AS [ordered]
           WHERE  [ordered].[rn] = 1

       ) AS [descriptions]
           ON [descriptions].[code] = [counts].[code];


  • Related