Home > Net >  How to get the summery of values in a Table? Check the distribution of age and gender
How to get the summery of values in a Table? Check the distribution of age and gender

Time:10-19

I want to count what is the number of each age for each gender, basically to see the distribution of age and gender.

A sample of the data is like

state      poi_name       gender age
aichi   starbucks shop E    2     3
aichi   starbucks shop G    0     2
aichi   starbucks shop G    1     2
chiba   starbucks shop A    0     1
chiba   starbucks shop D    1     1
chiba   starbucks shop A    0     2
tokyo   starbucks shop B    2     1
tokyo   starbucks shop B    1     0
tokyo   starbucks shop C    2     3
tokyo   starbucks shop F    1     2
aichi   starbucks shop E    1     2

I could get the gender distribution but not the age range, how can I do this.

SELECT
state, 
poi_name, 
count(gender)  all_cnt,
 count(  gender = '0' or null) as Unknown,
 count(  gender = '1' or null) as Total_Male,
 count(  gender = '2' or null) as Total_Female,
 count('gender') OVER(PARTITION BY state) AS cnt_for_state
FROM `geo_data_working.hw0160_infoDemographic` 
GROUP BY  state,poi_name
ORDER BY state,poi_name
age range:
-1:13-17
 1:18-24
 2:25-34
 3:35-44
 4:45-
  0:unknown

Thanks in advance

add-on: I was asked about my expected results, good point. Unfortunately, I am not sure, this is my first SQL try. Maybe something like bellow or if you have a better advise I will appreciate it

1)

state poi_name gender ageRange0 ageRangeage1
Tokyo ShopA F 200 100
Tokyo ShopA M 100 150

2)

state poi_name F_ageRange0 F_ageRangeage1 M_ageRange0 M_ageRangeage1
Tokyo ShopA 200 100 30 9000

CodePudding user response:

You want to get counts per age group. Use COUNTIF to count conditionally:

SELECT
  state, 
  poi_name, 
  COUNT(gender) AS all_cnt,
  COUNTIF(gender = 0) AS unknown,
  COUNTIF(gender = 1) AS total_male,
  COUNTIF(gender = 2) AS total_female,
  SUM(COUNT(*)) OVER (PARTITION BY state) AS cnt_for_state,
  COUNTIF(age = -1) AS age_13_to_17,
  COUNTIF(age = 1) AS age_18_to_24,
  COUNTIF(age = 2) AS age_25_to_34,
  COUNTIF(age = 3) AS age_35_to_44,
  COUNTIF(age = 4) AS age_above_45,
  COUNTIF(age = 0) AS age_unknown
FROM `geo_data_working.hw0160_infoDemographic` 
GROUP BY state, poi_name
ORDER BY state, poi_name;

CodePudding user response:

I think the following query get your target of query result:

    select res.state,
       res.POI_NAME,
       res.gender,
       res.range,
       count(*) as cnt
from (select e.STATE,
             e.POI_NAME,
             (case
                  when e.GENDER = 1 then 'male'
                  else
                      'female'
                 end) as gender,
             (case
                  when age = 1 then 'age_13_to_17'
                  when age = 2 then 'age_25_to_34'
                  when age = 3 then 'age_35_to_44'
                  when age = 4 then 'age_above_45'
                  else 'age_unknown'
                 end) as range
      from d-da-project.geo_data_working.hw0160_infoDemographic e) res
group by res.state,
         res.POI_NAME,
         res.gender,
         res.range;
  • Related