I need to calculate the previous wage of director before he joins a new company.
I have created a simple dataset for one director (in practice I have many observations of director_id). This director with ID = 1 manages 5 firms which he joined in different years (the variable called enter). If director joined firm number 2 in 2011, I need the average of the variable wage for all years before 2011 which he was managing. For the same director = 1, I need a different mean(wage) for firm number 3 which he joined in 2012 (which will include mean(wage) from previous 2 companies that he managed before entering company 3 in 2012).
Below is the data. I would really appreciate your help in coding this problem.
clear
input enter year wage director_id firm_id
2006 2006 6.4790964 1 1
2006 2010 6.4783854 1 1
2006 2011 6.4067149 1 1
2006 2012 6.3716507 1 1
2006 2013 6.2248578 1 1
2006 2014 6.0631728 1 1
2011 2011 5.0127039 1 2
2011 2012 4.9616795 1 2
2011 2013 4.9483747 1 2
2011 2014 5.2612371 1 2
2012 2012 4.5389338 1 3
2012 2013 4.4322848 1 3
2012 2014 4.3223209 1 3
2013 2013 4.336947 1 4
2013 2014 4.27459 1 4
2015 2015 -.60586482 1 5
2015 2016 .085194588 1 5
end
I just need to exclude from mean(wage) all values that happen after he enters, so really need to regard only years before he enters a new company.
CodePudding user response:
A recipe for what I think you seek is that the mean previous wage in other firms =
(SUM of previous wages in all firms MINUS sum of previous wages in this firm) / (COUNT of previous years in all firms MINUS count of previous years in this firm).
Your example is helpful but the wage
variable is too irregular to allow easy eyeball checks.
Consider this sequence, where rangestat
is from SSC.
clear
input enter year wage director_id firm_id
2006 2006 6.4790964 1 1
2006 2010 6.4783854 1 1
2006 2011 6.4067149 1 1
2006 2012 6.3716507 1 1
2006 2013 6.2248578 1 1
2006 2014 6.0631728 1 1
2011 2011 5.0127039 1 2
2011 2012 4.9616795 1 2
2011 2013 4.9483747 1 2
2011 2014 5.2612371 1 2
2012 2012 4.5389338 1 3
2012 2013 4.4322848 1 3
2012 2014 4.3223209 1 3
2013 2013 4.336947 1 4
2013 2014 4.27459 1 4
2015 2015 -.60586482 1 5
2015 2016 .085194588 1 5
end
sort year firm_id
replace wage = _n
rangestat (sum) SUM=wage (count) COUNT=wage, int(year . -1) by(director_id)
rangestat (sum) sum=wage (count) count=wage, int(year . -1) by(director_id firm_id)
replace sum = 0 if sum == .
replace count = 0 if count == .
gen wanted = (SUM - sum) / (COUNT - count)
list, sepby(year)
---------------------------------------------------------------------------------
| enter year wage direct~d firm_id SUM COUNT sum count wanted |
|---------------------------------------------------------------------------------|
1. | 2006 2006 1 1 1 . . 0 0 . |
|---------------------------------------------------------------------------------|
2. | 2006 2010 2 1 1 1 1 1 1 . |
|---------------------------------------------------------------------------------|
3. | 2006 2011 3 1 1 3 2 3 2 . |
4. | 2011 2011 4 1 2 3 2 0 0 1.5 |
|---------------------------------------------------------------------------------|
5. | 2006 2012 5 1 1 10 4 6 3 4 |
6. | 2011 2012 6 1 2 10 4 4 1 2 |
7. | 2012 2012 7 1 3 10 4 0 0 2.5 |
|---------------------------------------------------------------------------------|
8. | 2006 2013 8 1 1 28 7 11 4 5.666667 |
9. | 2011 2013 9 1 2 28 7 10 2 3.6 |
10. | 2012 2013 10 1 3 28 7 7 1 3.5 |
11. | 2013 2013 11 1 4 28 7 0 0 4 |
|---------------------------------------------------------------------------------|
12. | 2006 2014 12 1 1 66 11 19 5 7.833333 |
13. | 2011 2014 13 1 2 66 11 19 3 5.875 |
14. | 2012 2014 14 1 3 66 11 17 2 5.444445 |
15. | 2013 2014 15 1 4 66 11 11 1 5.5 |
|---------------------------------------------------------------------------------|
16. | 2015 2015 16 1 5 120 15 0 0 8 |
|---------------------------------------------------------------------------------|
17. | 2015 2016 17 1 5 136 16 16 1 8 |
---------------------------------------------------------------------------------