Home > Software design >  SQL dealing with multilevel time series data
SQL dealing with multilevel time series data

Time:08-05

Question

I'm interested in finding out how many people switched jobs with their current employer in a given year. I know advanced SQL concepts but lack experience. So I'm interested in the way I should go about this. The main challenge for me is the fact that the data is multilevel (employers > persons > jobs) combined with the time interval constraints.

The data looks something like this:

| person | employer | job_title | started | stopped |
----------------------------------------------------- 
| p1     | e1       | j1        | t1      | t2      |
| p1     | e1       | j2        | t3      | t4      |
| p1     | e1       | j3        | t5      | t6      |
| p2     | e2       | j1        | t5      | t6      |
| p2     | e2       | j4        | t7      | t8      |
| p3     | e3       | j5        | t7      | t8      |
| p3     | e4       | j6        | t9      | t10     |

I need a SQL query that extracts the count of job switches within the same company (employer) during each calendar year.

  • t values are year-month combinations stored as integers (202208)
  • The target variable is "switching jobs within the same employer in a given year". This is defined as having the end date of job_a and the start date of job_b in the same calendar year x, with job_a and job_b of person_a both being at employer_a

Example responses

For a given time interval X.

  1. Let's say that t1 -> t4 fall within X, but t5 -> t10 do not. Then the output should be 1, as only p1 switched jobs before t5.
  2. If we say that t1 -> t10 all fall within X, the output should be 2, as both p1 and p2 switched jobs within the same company. p3 is excluded, as she switched jobs, but not within the same company.

CodePudding user response:

You may try the following:

Select Sum(d.chk) as Result
From 
(
  Select distinct person,
   Case When
    Count(*) Over (Partition By person,employer ) <>
    Count(*) Over (Partition By person,employer,job_title) 
   Then 1 Else 0
  End as chk
 From Emps
 Where started>='2015-10-10' and stopped <='2022-08-16'
) d

Count(*) Over (Partition By person, employer ) will count how many times the employee stayed in the same company whatever his position was.

Count(*) Over (Partition By person, employer, job_title) will count how many times the employee position changed within the same company.

If the two counts are not equal, means that the employee position has been changed within the same company.

See a demo from db<>fiddle.

  • Related