Home > Blockchain >  Group Consecutively and sort using date in SQL
Group Consecutively and sort using date in SQL

Time:08-12

I am trying to sort this data using SQL Query

  1. Sort the data using start_date descending first - this is to keep latest transactions first.
  2. Sort the data based on COL_A, COL_B, COL_C such that COL_A, COL_B and COL_C data are grouped consecutively for any given set of records.

I tried using ROW_NUMBER and DENSE_RANK functions - but unable to derive. Could any one please help me on this?

Table data:

COL_A COL_B COL_C TASK START_DATE STATUS
REFERENCE GOLD ETL Upload 2022-08-04 16:40:17.000 Completed
REFERENCE GOLD ETL Translate 2022-08-04 16:36:33.000 Completed
REFERENCE GOLD ETL Extract 2022-08-04 16:21:41.000 Completed
Hive BRONZE WORKFLOW Translate 2022-08-04 12:30:25.000 Failed
Hive DM ETL Extract 2022-08-04 12:23:55.000 Completed
Hive BRONZE WORKFLOW Extract 2022-08-04 12:15:44.000 Completed
Standalone CONS ETL Extract 2022-08-04 07:17:31.000 Failed
Moving Window AGG ETL Upload 2022-08-03 15:08:48.000 Completed
Moving Window AGG ETL Translate 2022-08-03 15:05:41.000 Completed
Moving Window AGG ETL Extract 2022-08-03 14:53:50.000 Completed
Moving Window ANLT ETL Upload 2022-08-03 14:31:17.000 Completed
Moving Window ANLT ETL Translate 2022-08-03 14:26:17.000 Completed
Moving Window ANLT ETL Extract 2022-08-03 14:17:50.000 Completed
Hive BRONZE BILL Translate 2022-08-03 13:46:19.000 Completed
Standalone CONS ETL Extract 2022-08-03 13:34:09.000 Failed

Expected Output:

COL_A COL_B COL_C TASK START_DATE STATUS
REFERENCE GOLD ETL Upload 2022-08-04 16:40:17.000 Completed
REFERENCE GOLD ETL Translate 2022-08-04 16:36:33.000 Completed
REFERENCE GOLD ETL Extract 2022-08-04 16:21:41.000 Completed
Hive BRONZE WORKFLOW Translate 2022-08-04 12:30:25.000 Failed
Hive BRONZE WORKFLOW Extract 2022-08-04 12:15:44.000 Completed
Hive DM ETL Extract 2022-08-04 12:23:55.000 Completed
Standalone CONS ETL Extract 2022-08-04 07:17:31.000 Failed
Moving Window AGG ETL Upload 2022-08-03 15:08:48.000 Completed
Moving Window AGG ETL Translate 2022-08-03 15:05:41.000 Completed
Moving Window AGG ETL Extract 2022-08-03 14:53:50.000 Completed
Moving Window ANLT ETL Upload 2022-08-03 14:31:17.000 Completed
Moving Window ANLT ETL Translate 2022-08-03 14:26:17.000 Completed
Moving Window ANLT ETL Extract 2022-08-03 14:17:50.000 Completed
Hive BRONZE BILL Translate 2022-08-03 13:46:19.000 Completed
Standalone CONS ETL Extract 2022-08-03 13:34:09.000 Failed

CodePudding user response:

have you tried something along the lines of this SQL query?

SELECT COL_A, COL_B, COL_C, TASK, START_DATE, STATUS
FROM table
ORDER BY START_DATE DESC, COL_A, COL_B, COL_C;

If you want to also summarize somehow the results you could do something like this to e.g. count the number of unique instances of a given column.

-- for col_a
SELECT COUNT(DISTINCT(COL_A)) AS number, COL_A
FROM table
GROUP BY COL_A, COL_B, COL_C
ORDER BY START_DATE DESC;

-- for col_b
SELECT COUNT(DISTINCT(COL_B)) AS number, COL_B
FROM table
GROUP BY COL_A, COL_B, COL_C
ORDER BY START_DATE DESC;

HTH,

See also these posts: Using GROUP BY on multiple columns ORDER BY on multiple columns

CodePudding user response:

With sample data as above and with first order by START_DATE DESC any other ordering does not do anything because all the start date data are different so there is no need for additional orderings.
But in case there are some rows with exactly the same START_DATE then it is another story.
If you want to do such additional orderings using VarChar2 columns data then, i'm affraid, you will need a little help from some cte-s that will help you make that sorting work. That includes explicit definition of the orderings for every possible VarChar2 value in the collumns you want to order by. I don't know how many of those values you have in your table but it would look like below (for the data provided):

    order_a AS
        (
            Select  0 "ORD", 'REFERENCE'     "COL_X" From Dual Union All
            Select 10 "ORD", 'Hive'          "COL_X" From Dual Union All
            Select 20 "ORD", 'Standalone'    "COL_X" From Dual Union All
            Select 30 "ORD", 'Moving Window' "COL_X" From Dual 
        ),
    order_b AS
        (
            Select  0 "ORD", 'GOLD'     "COL_X" From Dual Union All
            Select 10 "ORD", 'SILVER'   "COL_X" From Dual Union All
            Select 20 "ORD", 'BRONZE'   "COL_X" From Dual Union All
            Select 30 "ORD", 'DM'       "COL_X" From Dual Union All
            Select 40 "ORD", 'CONS'     "COL_X" From Dual Union All
            Select 50 "ORD", 'AGG'      "COL_X" From Dual Union All
            Select 60 "ORD", 'ANLT'     "COL_X" From Dual 
        ),
    order_c AS
        (
            Select  0 "ORD", 'ETL'      "COL_X" From Dual Union All
            Select 10 "ORD", 'WORKFLOW' "COL_X" From Dual Union All
            Select 20 "ORD", 'BILL'     "COL_X" From Dual 
        )
SELECT t.COL_A, t.COL_B, t.COL_C, t.TASK,
    To_Char(t.START_DATE, 'yyyy-mm-dd hh24:mi:ss') "START_DATE",
    t.STATUS
FROM 
    tbl t
LEFT JOIN
    order_a a ON(a.COL_X = t.COL_A)
LEFT JOIN
    order_b b ON(b.COL_X = t.COL_B)
LEFT JOIN
    order_c c ON(c.COL_X = t.COL_C)
ORDER BY
    t.START_DATE DESC, Nvl(a.ORD, 999), Nvl(b.ORD, 999), Nvl(c.ORD, 999)

Once more, there is no need to do this if all START_DATE values are different. Regards...

  • Related