I am exploring mlb baseball statcast data, which is at the pitch level. I want to summarize the pitches at the plate appearance level. The relevant columns:
game_pk | pitcher_id | batter_id | inning | pitch_number | pitch_type | plt_apprnc_pk (unique)
----------------------------------------------------------------------------------------------
492321 | 542881 | 518466 | 2 | 1 | FF | 4923215428815184662
492321 | 543475 | 571448 | 1 | 2 | SL | 4923215434755714481
492321 | 543475 | 571448 | 1 | 1 | FF | 4923215434755714481
492321 | 543475 | 571448 | 1 | 3 | FF | 4923215434755714481
I am working on a query only showing one iteration of each plate appearance (plt_apprnc_pk), followed by pitch_number columns (1st pitch, 2nd pitch, 3rd pitch, etc...) showing each pitch type for the respective pitch number. In essence I want to transpose the pitch_type by pitch_number.
plt_apprnc_pk (unique)| first_pitch | second_pitch | third_pitch
----------------------------------------------------------------
4923215434755714481 | SL | FF | FF
I tried doing this with Case statements (ie CASE WHEN pitch_number = 1 THEN pitch_type END). I keep getting multiple rows for each plt_apprnc_pk.
plt_apprnc_pk (unique)| first_pitch | second_pitch | third_pitch
------------------------------------------------------------------
4923215434755714481 | SL | NULL | NULL
4923215434755714481 | NULL | FF | NULL
4923215434755714481 | NULL | NULL | FF
What is the best way to transpose these strings based on pitch_number while keeping only one iteration of the unique key??
CodePudding user response:
I tried doing this with Case statements (ie CASE WHEN pitch_number = 1 THEN pitch_type END). I keep getting multiple rows for each plt_apprnc_pk.
The pattern here is called conditional aggregation:
SELECT plt_apprnc_pk
,MAX(CASE WHEN pitch_number = 1 THEN pitch_type END) AS first_pitch
,MAX(CASE WHEN pitch_number = 2 THEN pitch_type END) AS second_pitch
,MAX(CASE WHEN pitch_number = 3 THEN pitch_type END) AS thrid_pitch
-- ...
--,MAX(CASE WHEN pitch_number = n THEN pitch_type END) AS nth_pitch
FROM tab
GROUP BY plt_apprnc_pk;
Note: The maximum "pitch_number" has to be known in advance.
CodePudding user response:
You may use Like this in your query I think useful this:
ISNULL(max(case when pitch_number = 1 THEN pitch_type end),0) as Default_name