Cloudera / Impala / SQL: finding all rows with unique value in specific column

Time:07-03

Hopefully a simple question for some of you: I have a table adsb_table as follows (apologies for the formatting of the table):

  callsign | time  | speed
  A        | 23421 | 431
  A        | 23422 | 426
  A        | 23423 | 459
  B        | 23424 | 521
  B        | 23425 | 601
  B        | 23426 | 401
  C        | 23427 | 454
  C        | 23428 | 499
  C        | 23429 | 621

I want the resulting output to be the first row for each unique value of callsign:

  A        | 23421 | 431
  B        | 23424 | 521
  C        | 23427 | 454

I have tried the following without success:

SELECT callsign, time, speed FROM adsb_table WHERE speed>400 ORDER BY callsign GROUP by callsign

I don't know whether using Impala makes a difference to the query. No output is generated. If I remove the GROUP BY clause, all the ordered records are listed, so I guess I am using GROUP BY incorrectly. Any help is appreciated.

CodePudding user response:

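If you always want the first row per callsign, you can use ROW_NUMBER(). Order the window by time so that "first" means the earliest row for each callsign; without an ORDER BY inside OVER(), which row gets number 1 is not well defined.

WITH cte AS (
  SELECT
    callsign,
    time,
    speed,
    -- Number the rows within each callsign, earliest time first
    ROW_NUMBER() OVER (PARTITION BY callsign ORDER BY time) AS row_no
  FROM adsb_table
  WHERE speed > 400
)
-- Keep only the first row per callsign and drop the helper column
SELECT callsign, time, speed
FROM cte
WHERE row_no = 1
ORDER BY callsign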
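
To check the ROW_NUMBER() approach against the sample data from the question, here is a minimal sketch that runs a query of the same shape through Python's built-in sqlite3 module. SQLite stands in for Impala purely for convenience; that is an assumption of this sketch, not something the original poster used, and it needs SQLite 3.25 or later for window functions. The window is ordered by time so that "first" means the earliest row, and the helper name first_row_per_callsign is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (callsign, time, speed).
ROWS = [
    ("A", 23421, 431), ("A", 23422, 426), ("A", 23423, 459),
    ("B", 23424, 521), ("B", 23425, 601), ("B", 23426, 401),
    ("C", 23427, 454), ("C", 23428, 499), ("C", 23429, 621),
]

# Same shape as the answer's query, with the window ordered by time.
QUERY = """
WITH cte AS (
  SELECT callsign, time, speed,
         ROW_NUMBER() OVER (PARTITION BY callsign ORDER BY time) AS row_no
  FROM adsb_table
  WHERE speed > 400
)
SELECT callsign, time, speed
FROM cte
WHERE row_no = 1
ORDER BY callsign
"""


def first_row_per_callsign(rows):
    """Load rows into an in-memory SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE adsb_table (callsign TEXT, time INTEGER, speed INTEGER)"
        )
        conn.executemany("INSERT INTO adsb_table VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = first_row_per_callsign(ROWS)
for row in result:
    print(*row)
# prints:
# A 23421 431
# B 23424 521
# C 23427 454
```

Note that the speed > 400 filter is applied before the rows are numbered, so a callsign whose earliest row is at or below 400 will return its earliest row that passes the filter, not its earliest row overall.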