Home > OS >  How to check how many times some values are duplicated?
How to check how many times some values are duplicated?

Time:03-28

I have table like below:

city      | segment
------------------
London    | A
London    | B
New York  | A
Berlin    | B
Barcelona | C
Barcelona | H
Barcelona | E

Each city should have only one segment, but as you can see there is two cities (London and Barcelona) which have more than only one segment.

It is essential that in result table I need only these cities which have > 1 segmnet

As a result I need somethig like below:

  • city - city based on table above

  • no_segments - number of segments which have defined city based on table above

  • segments - segments of defined city based on table above

    city no_segments segments
    London 2 A
    B
    Barcelona 3 C
    H
    E

How can I do that in Oracle?

CodePudding user response:

;with cte as ( Select city, count(seg) as cntseg From table1 Group by city having count(seg) > 1 ) Select a.city, b.cntseg, a.seg From table1 as a join cte as b On a.city = b.city

CodePudding user response:

You can use COUNT(*) OVER ()(in order to get number of segments) and ROW_NUMBER()(in order to prepare the results those will be conditionally displayed) analytic functions such as

WITH t1 AS
(
 SELECT city, 
        segment,
        COUNT(*) OVER (PARTITION BY city) AS no_segments,
        ROW_NUMBER() OVER (PARTITION BY city ORDER BY segment) rn
   FROM t
)
SELECT DECODE(rn,1,city) AS city, 
       DECODE(rn,1,no_segments) AS no_segments, 
       segment
  FROM t1
 WHERE no_segments > 1      
 ORDER BY t1.city, segment      

Demo

CodePudding user response:

Another way to do this is:

SELECT NULLIF(CITY, PREV_CITY) AS CITY,
       SEGMENT
  FROM (SELECT CITY,
               LAG(CITY) OVER (ORDER BY CITY DESC) AS PREV_CITY,
               SEGMENT,
               COUNT(SEGMENT) OVER (PARTITION BY CITY) AS CITY_SEGMENT_COUNT
        FROM CITY_SEGMENTS)
  WHERE CITY_SEGMENT_COUNT > 1

Using LAG() to determine the "previous" CITY allows us to directly compare the CITY values, which in my mind is clearer that using ROW_NUMBER = 1.

db<>fiddle here

  • Related