There is a similar question on here but for Mysql.
I need to create a new column in my table dbo.table
, that returns the most frequent fruit that a customer bought. P_id
and Fruit_bought
already exist, but I want the third column favourite_fruit
(as seen below):
p_id Fruit_bought Favourite_fruit
7740319 Apple Apple
976942 Pear Pear
976942 Pear Pear
7740319 Apple Apple
7740319 Pear Apple
976942 Banana Pear
5674352 Pear Pear
5674352 Pear Pear
5674352 Pear Pear
Both columns are VARCHAR. How do I go about this in sql? I'm a newbie but I'm presuming alter table?
CodePudding user response:
Assuming you just want to create a report about the favorite fruit per person you can use this query:
with cte as (
select p_id, fruit_bought, row_number() over (partition by p_id order by count(*) desc) as rn
from t
group by p_id, fruit_bought
)
select p_id, fruit_bought as favorite_fruit
from cte
where rn = 1
CodePudding user response:
Like I mentioned in the comments, I wouldn't do this as a computed column as you'll be forever chasing your tail. You say the data won't ever change, but that is never true; I've heard that excuse too many times, and weeks/months later I get the same thing "I know I said it wouldn't change, but it has, and now the data is wrong. How do you have these values update automatically?".
As such, use a VIEW
. Within the VIEW
, I use a windowed COUNT
to find out how many of each fruit was purchased by each person, and then ise FIRST_VALUE
to get the favourite fruit. In the event of a tie, the first fruit alphabetically is returned. Full repro below:
CREATE TABLE dbo.YourTable (p_id int,
Fruit_bought varchar(20));
GO
INSERT INTO dbo.YourTable (p_id, Fruit_bought)
VALUES (7740319,'Apple'),
(976942,'Pear'),
(976942,'Pear'),
(7740319,'Apple'),
(7740319,'Pear'),
(976942,'Banana'),
(5674352,'Pear'),
(5674352,'Pear'),
(5674352,'Pear');
GO
CREATE VIEW dbo.YourView AS
WITH Counts AS(
SELECT p_id,
Fruit_bought,
COUNT(Fruit_bought) OVER (PARTITION BY P_id, Fruit_bought) AS Times_bought
FROM dbo.YourTable)
SELECT p_id,
Fruit_bought,
FIRST_VALUE(Fruit_bought) OVER (PARTITION BY P_id ORDER BY Times_bought DESC, Fruit_Bought ASC) AS Favourite_fruit
FROM Counts;
GO
SELECT *
FROM dbo.YourView;
GO
DROP VIEW dbo.YourView;
DROP TABLE dbo.YourTable;