Home > Software engineering >  label rows in top 20% sql
label rows in top 20% sql

Time:03-19

Assuming this sample data in a sql table:

people   income
Mary     100
Joe      60
Sam      70
Bob      85
Sarah    85

I am trying to create a column which will identify each row as within the top 20% of income. So, the expected output is:

people   income   performers
Mary     100      Y
Joe      60       N
Sam      70       N
Bob      85       Y
Sarah    85       Y

I have tried TOP (20) PERCENT and a CASE statement, no luck. Just tried even extracting the top 20% and kept getting errors as well.

In Pandas, I got this, no problem. In SQL, it's like underwater brain surgery using oven mitts.

CodePudding user response:

If you're looking to compare each record to the max value of income, you can use a CASE statement to compare the current record:

SELECT people, income,
  CASE 
    WHEN income >= (SELECT MAX(income) FROM Foo) * 0.8 
    THEN 'Y' ELSE 'N'
  END AS performers
FROM Foo;
people  income  performer
Mary    100     Y
Joe     60      N
Sam     70      N
Bob     85      Y
Sarah   85      Y

CodePudding user response:

You can use a case statement

select
people,
income,
case when income >= (select max(income) from Table) * .8 then 'Yes' else 'No' end as performer
from Table
  •  Tags:  
  • sql
  • Related