How do I select records into a single row?-CodePudding

I've tried writing my sql query to select multiple records on to one row but it isn't working the way I expected it to Currently my table looks something like this

person id	fruit
1	apple
1	orange
1	banana
2	apple
2	orange
3	apple

I've tried using CASE and GROUP BY but it just gave extra records and didn't display the way I wanted it to and is displaying like this

SELECT DISTINCT
F.MEMBER
,F.GIVEN_NAMES
,F.SURNAME
--VALUES NEEDED
,CASE WHEN F.VALUE_NEEDED = 'Postal Address' THEN 'Yes' ELSE '' END POSTAL_ADDRESS
,CASE WHEN F.VALUE_NEEDED = 'Birthday' THEN 'Yes' ELSE '' END BIRTHDAY
,CASE WHEN F.VALUE_NEEDED = 'Email Address' THEN 'Yes' ELSE '' END EMAIL_ADDRESS
,CASE WHEN F.VALUE_NEEDED = 'First Name' THEN 'Yes' ELSE '' END FIRST_NAME
,CASE WHEN F.VALUE_NEEDED = 'Surname' THEN 'Yes' ELSE '' END SURNAME
,CASE WHEN F.VALUE_NEEDED = 'Title and Gender' THEN 'Yes' ELSE '' END 'TITLE|GENDER'
,CASE WHEN F.VALUE_NEEDED = 'Mobile' THEN 'Yes' ELSE '' END MOBILE
,CASE WHEN F.VALUE_NEEDED = 'Beneficiary' THEN 'Yes' ELSE '' END BENEFICIARY
FROM #FINAL F
GROUP BY F.MEMBER,F.GIVEN_NAMES
,F.SURNAME,VALUE_NEEDED
ORDER BY F.MEMBER

person id	apple	orange	banana
1	yes
1		yes
1			yes

How do I write the query so it looks more like this?

person id	apple	orange	banana
1	yes	yes	yes
2	yes	yes
3	yes

CodePudding user response：

You are almost there, just needed to add the max and group by to aggregate it . This used to be a typical interview question back then. Some thing like this if I understood correctly

with t as 
( 
  select 1 as person_id, 'apple' fruit
  union
  select 1 ,'orange' 
  union
  select 1 ,'banana' 
  union
  select 2 ,'apple' 
  union
  select 2 ,'orange' 
  union
  select 3 ,'apple'
) 
, b as 
(
  select 
    person_id, 
    case when fruit= 'apple' then 'yes' else null end 'apple',
    case when fruit= 'orange' then 'yes' else null end 'orange',
    case when fruit= 'banana' then 'yes' else null end 'banana' 
  from t
)
select
  person_id,
  max(apple) apple,
  max(orange) orange,
  max(banana) banana 
from b 
group by 1;

person_id	apple	orange	banana
1	yes	yes	yes
2	yes	yes	NULL
3	yes	NULL	NULL

CodePudding user response：

You have tagged the tool that you are using (SQL Server Management Studio) which can be used with different DBMS. As Microsoft's SQL Server is the most typical used in this tool, I assume your are using that.

First let's look at your table. It seems a bit weird. It seems to be a kind of key-value table (aka EAV). Each row tells us for an attribute whether it is needed for a person. Now how to identify a person in the table? Is the column member a unique person ID? Probably not, because then, what would be given_names and surname be for in that table, that can change with every entry. Why would the same person with the ID 1234 be called John Smith when the value_needed is 'Birthday', but be called 'Anne Miller' when value_needed is 'Mobile'? That wouldn't make much sense. So maybe member is just a flag, whether a person is a member or not, and a person is uniquely identified by their given_names and surname. But then again, why would the same person John Smith be a member when value_needed is 'Birthday', but not a member when value_needed is 'Mobile'? So something is amiss here. It seems your table is not normalized. Better have one person table and one table for the attributes.

That being said, GROUP BY ___ means "I want one result row per ___". You group by the person and their value_needed. But you don't want one result row per person and value_needed. You want one resut row per person. Hence, group by person.

Then you SELECT DISTINCT .... This means you want to remove duplicate rows. But look at the rows you are selecting. There are no duplicates. If you use GROUP BY, you can be 99.99% sure you don't need DISTINCT. (There do exist rare situations where you voluntarily group by columns, don't select all of them and then apply DISTINCT, but these are so rare that you probably won't ever use them at all.)

Now to the task: You want to get from rows to columns. This is called pivot and can be achieved with the PIVOT keyword, but it is more common to use conditional aggregation. "Conditional aggregation" means that you aggregate your data (per person) and then apply a condition. In standard SQL:

SELECT MIN('YES') FILTER (WHERE f.value_needed = 'Postal Address')

You can use MIN or MAX here, and it is only required for syntax reasons (the FILTER clause must refer to some aggregation function.

In SQL Server there is no FILTER clause, so you use a CASE expression instead:

SELECT MIN(CASE WHEN f.value_needed = 'Postal Address' THEN 'YES' END)

If you want the empty string '' instead of NULL, apply COALESCE:

SELECT COALESCE(MIN(CASE WHEN f.value_needed = 'Postal Address' THEN 'YES' END), '')

Columns aliases containing special characters like | require quoting. But not single quotes, as these denote string literals. In standard SQL use double quotes, in SQL Server use brackets. But better, just avoid them alltogether, by avoiding special characters in names.

The complete query:

SELECT
  person_id,
  MIN(CASE WHEN value_needed = 'Postal Address'   THEN 'yes' end) AS postal_address,
  MIN(CASE WHEN value_needed = 'Birthday'         THEN 'Yes' end) AS birthday,
  MIN(CASE WHEN value_needed = 'Email'            THEN 'Yes' END) AS email_address,
  MIN(CASE WHEN value_needed = 'First Name'       THEN 'Yes' END) AS first_name,
  MIN(CASE WHEN value_needed = 'Surname'          THEN 'Yes' END) AS surname,
  MIN(CASE WHEN value_needed = 'Title and Gender' THEN 'Yes' END) AS title_gender,
  MIN(CASE WHEN value_needed = 'Mobile'           THEN 'Yes' END) AS mobile,
  MIN(CASE WHEN value_needed = 'Beneficiary'      THEN 'Yes' END) AS beneficiary
FROM #FINAL
GROUP BY person_id
ORDER BY person_id;