I have a table like this
userId story novel
1 a b
1 a b
1 a c
1 b c
1 b c
2 x x
2 x y
2 y y
3 m n
4 NULL NULL
How do I find the most story and novel count per user?
What I am looking for is the highest distinct count of story and novel for each user. So if a user has no story then story_count should be 0.
Desired output looks like this
userId story story_count novel novel_count
1 a 3 c 3
2 x 2 y 2
3 m 1 n 1
4 NULL 0 NULL 0
This is my faulty current attempt
SELECT userId, story, COUNT(story) as story_count, novel, COUNT(novel) as novel_count
FROM logs WHERE user = (SELECT DISTINCT(user)) GROUP BY story, novel;
CodePudding user response:
Based on Tim's answer,I provide an upgrade solution
create table `logs`(
userId int,
story varchar(10),
novel varchar(10)
);
insert into `logs`(userId,story,novel) values
(1,'a',' b'),
(1 ,'a','c'),
(1 ,'b','c'),
(2 ,'x','x'),
(2 ,'x','y'),
(2 ,'y','y');
SELECT t1.userId,t1.story,t1.story_count,t2.novel,t2.novel_count
FROM
(
SELECT userId, story, COUNT(*) AS story_count,
RANK() OVER (PARTITION BY userId ORDER BY COUNT(*) DESC) rn
FROM logs
GROUP BY userId, story
) as t1
join
(
SELECT userId, novel, COUNT(*) AS novel_count,
RANK() OVER (PARTITION BY userId ORDER BY COUNT(*) DESC) rn
FROM logs
GROUP BY userId, novel
)as t2
ON t1.userId = t2.userId and t1.rn = t2.rn
WHERE t1.rn =1
CodePudding user response:
You should really use a window function if this is possible because thus, you can keep your query much shorter and simpler. If this is not possible and you really need to do it without them, you can also create two subqueries for both the story data and the novel data according to your conditions and then join them. Something like this:
SELECT storydata.userId, storydata.story, storydata.counter AS story_count,
noveldata.novel, noveldata.counter AS novel_count
FROM
(SELECT DISTINCT l.userId, l.story, sub.counter
FROM logs l
JOIN
(SELECT userId, COALESCE(MAX(story_count),0) AS counter
FROM
(SELECT userId, story,
COUNT(story) as story_count
FROM logs
GROUP BY userId, story) sub
GROUP BY userId) sub
ON l.userId = sub.userId
GROUP BY l.userId, l.story
HAVING COUNT(l.userId) = sub.counter OR l.story IS NULL) AS storydata
JOIN
(SELECT noveldata.userId, noveldata.novel, noveldata.counter AS counter
FROM
(SELECT DISTINCT l.userId, l.novel, sub.counter
FROM logs l
JOIN
(SELECT userId, COALESCE(MAX(novel_count),0) AS counter
FROM
(SELECT userId, novel,
COUNT(novel) as novel_count
FROM logs
GROUP BY userId, novel) sub
GROUP BY userId) sub
ON l.userId = sub.userId
GROUP BY l.userId, l.novel
HAVING COUNT(l.userId) = sub.counter OR l.novel IS NULL) AS noveldata) AS noveldata
ON storydata.userId = noveldata.userId;
But as you can see, this will become very complicated although it will produce the correct outcome. See here: db<>fiddle
Therefore, I once again highly recommend to use a DB version that provides window functions.
CodePudding user response:
I hope this works:
select story.userid, story, story_count, novel, novel_count
from
(
select userid, story, count(*) story_count,
row_number() over (partition by userid order by count(*) desc) rown_num
from logs
where row_num=1
)story
left join
(
select userid, novel, count(*) novel_count,
row_number() over (partition by userid order by count(*) desc) rown_num
from logs
where row_num=1
)novel on novel.userid=story.userid