Home > Enterprise >  Unable to get the SQL query working with DISTINCT and windows function
Unable to get the SQL query working with DISTINCT and windows function

Time:09-01

I am not able to get this working because of the DISTINCT keyword

SELECT 
    ROW_NUMBER() OVER() AS tweet_bucket,
    DISTINCT COUNT(user_id) AS users_num
FROM
    tweets
WHERE 
    tweet_date >= '2022-01-01 00:00:00' 
GROUP BY 
    user_id

I know if I remove the windows function and just put it as:

SELECT DISTINCT 
    COUNT(user_id) AS users_num
FROM 
    tweets
WHERE 
    tweet_date >= '2022-01-01 00:00:00' 
GROUP BY 
    user_id

it works

I cannot remove the windows function as I need to get a row_id using ROW_NUMBER()

Realy struggling, is there anyway of using the same logic and have the distinct?

Here is my data:

CREATE TABLE tweets 
(
    tweet_id INT,
    user_id INT,
    msg VARCHAR(500),
    tweet_date DATETIME
);
  
DROP TABLE tweets;
  
INSERT INTO tweets 
VALUES (214252, 111, "Am considering taking Tesla private at $420. Funding secured", '2021-12-30 00:00:00');

INSERT INTO tweets 
VALUES (739252, 111, "Despite the constant negative press covfefe", '2022-01-01 00:00:00');

INSERT INTO tweets 
VALUES (846402, 111, "Following @NickSinghTech on Twitter changed my life!", '2022-02-14 00:00:00');

INSERT INTO tweets 
VALUES (241425, 254, "If the salary is so competitive why won’t you tell me what it is?", '2022-03-01 00:00:00');

INSERT INTO tweets 
VALUES (231574, 148, "I no longer have a manager. I can't be managed", '2022-03-01 00:00:00');
  
SELECT * FROM tweets;

SELECT DISTINCT 
    COUNT(user_id) AS users_num,
    ROW_NUMBER() OVER() AS tweet_bucket
FROM 
    tweets
WHERE 
    tweet_date >= '2022-01-01 00:00:00' 
GROUP BY 
    user_id

CodePudding user response:

I get the right answer with a CTE

WITH HistogramTweets AS   (   SELECT   DISTINCT COUNT(user_id) AS
users_num   FROM tweets   WHERE tweet_date >= '2022-01-01 00:00:00'  
GROUP BY user_id ) SELECT ROW_NUMBER() OVER() AS tweet_bucket,
users_num FROM HistogramTweets

CodePudding user response:

    SELECT DISTINCT 
COUNT(user_id) AS users_num
,ROW_NUMBER() OVER(ORDER BY user_id) AS tweet_bucket
FROM 
tweets
WHERE 
tweet_date >= to_date('2022-01-01','YYYY-MM-DD')
GROUP BY 
user_id
order by tweet_bucket

"you can use this in oracle pl sql so me be you should try this OVER(ORDER BY user_id) "

  • Related