I have the following data as a DATE and MIXED_NAME. The MIXED_NAME has records with unique values like "Unique-Value-123" and with typical values all of which are "Typical-Value". The raw data is below. I want them group by DATE, MIXED_NAME
, but in the case that in the same day I have "Typical-Value", "Unique-Value-123" and "Typical-Value", I need them to be split into groups by "Unique-Value-123" which plays like a checkpoint. So I want to get a group before and another group after.
I use Oracle DB. So far I came to the idea of adding rownum of the previous "Unique-Value-123" to all of the typicals to get a bunch of "Typical-Value-123" which conveniently will group by. But I can't find a function that in single action will perform case LAG(SUBSTR(MIXED_NAME, INSTR(MIXED_NAME, '_-_') 1)) over (order by DATE)
action for a current row based on a result of this case
for a previous row.
How to organize such grouping?
UPD: Replaced the image with tables
Raw data:
DATE | MIXED_NAME |
---|---|
December 14, 2013 at 6:34:00 PM | Typical-Value |
December 14, 2013 at 6:35:00 PM | Typical-Value |
December 15, 2013 at 9:00:00 PM | Typical-Value |
December 15, 2013 at 10:00:59 PM | Unique-Value-123 |
December 25, 2013 at 1:31:13 AM | Typical-Value |
December 28, 2013 at 1:40:42 AM | Typical-Value |
December 30, 2013 at 4:41:41 PM | Typical-Value |
December 31, 2013 at 1:28:05 PM | Typical-Value |
December 31, 2013 at 11:00:00 PM | Typical-Value |
December 31, 2013 at 11:59:59 PM | Unique-Value-122 |
January 1, 2014 at 1:00:47 AM | Typical-Value |
January 1, 2014 at 1:02:05 AM | Typical-Value |
January 1, 2014 at 1:02:05 AM | Typical-Value |
January 1, 2014 at 1:32:47 AM | Typical-Value |
Expected result ("Typical-Value"-s are grouped):
DATE | MIXED_NAME |
---|---|
December 14, 2013 at 6:34:00 PM | Typical-Value |
December 15, 2013 at 10:00:59 PM | Unique-Value-123 |
December 15, 2013 at 11:00:00 PM | Typical-Value |
December 31, 2013 at 11:59:59 PM | Unique-Value-122 |
January 1, 2014 at 1:00:47 AM | Typical-Value |
CodePudding user response:
Maybe there is a more clever way using analytical functions, but this solution is fairly simple as well.
It assigns a grouping number, by counting how many non-typical-value rows you have before the current row. And then it groups by that grouping number.
with
VW_GROUP_NUMBER as (
select
v.*,
( select count(*)
from YOURTABLE r
where r.MIXED_NAME <> 'Typical-Value'
and r."DATE" <= v."DATE"
) as GROUP_NUMBER
from
YOURTABLE v)
select
max(gn."DATE") as GROUPDATE,
gn.MIXED_NAME
from
VW_GROUP_NUMBER gn
group by
gn.GROUP_NUMBER,
gn.MIXED_NAME
order by
1;
CodePudding user response:
You can do this using the tabibitosan method, like so:
WITH your_table AS (SELECT to_date('14/12/2013 18:34:00', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('14/12/2013 18:35:00', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('15/12/2013 21:00:00', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('15/12/2013 22:00:59', 'dd/mm/yyyy hh24:mi:ss') dt, 'Unique-Value-123' mixed_name FROM dual UNION ALL
SELECT to_date('25/12/2013 01:31:13', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('28/12/2013 01:40:42', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('30/12/2013 16:41:41', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('31/12/2013 13:28:05', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('31/12/2013 23:00:00', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('31/12/2013 23:59:59', 'dd/mm/yyyy hh24:mi:ss') dt, 'Unique-Value-122' mixed_name FROM dual UNION ALL
SELECT to_date('01/01/2014 01:00:47', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('01/01/2014 01:02:05', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('01/01/2014 01:02:05', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('01/01/2014 01:32:47', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('02/01/2014 01:40:52', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual UNION ALL
SELECT to_date('02/01/2014 01:40:52', 'dd/mm/yyyy hh24:mi:ss') dt, 'Unique-Value-124' mixed_name FROM dual UNION ALL
SELECT to_date('02/01/2014 01:45:13', 'dd/mm/yyyy hh24:mi:ss') dt, 'Typical-Value' mixed_name FROM dual),
tabibitosan AS (SELECT dt,
mixed_name,
row_number() OVER (ORDER BY dt, CASE WHEN mixed_name = 'Typical-Value' THEN 1 ELSE 0 END) - row_number() OVER (PARTITION BY mixed_name ORDER BY dt, CASE WHEN mixed_name = 'Typical-Value' THEN 1 ELSE 0 END) grp
FROM your_table)
SELECT to_char(MIN(dt), 'fmMonth dd, yyyy "at" hh:mi:ss AM') dt,
mixed_name
FROM tabibitosan
GROUP BY mixed_name,
grp
ORDER BY dt;
DT MIXED_NAME
--------------------------------- ----------------
December 14, 2013 at 6:34:0 PM Typical-Value
December 15, 2013 at 10:0:59 PM Unique-Value-123
December 25, 2013 at 1:31:13 AM Typical-Value
December 31, 2013 at 11:59:59 PM Unique-Value-122
January 1, 2014 at 1:0:47 AM Typical-Value
January 2, 2014 at 1:40:52 AM Unique-Value-124
January 2, 2014 at 1:40:52 AM Typical-Value
Tabibitosan works by numbering the rows in the main ordered set and again in the ordered subsets, and then comparing the row numbers of each subset to the main set.
Rows in the subset that are consecutive will have the same difference, which is represented by the grp
column in the above query. When the difference changes, you know a new group has started, and then it's simply a matter of finding the minimum value to find the start value.
You'll note that I've taken into account the fact that rows can have the same date and time - if that happens, I've assumed that the Typical-Value
row starts a new group as opposed to being the end of the previous group. If that's not the case, then change the ELSE 0
to ELSE 2
in the ORDER BY
clause of the row_number()
analytic fuctions.