My data looks like this:
hh_id | indl | ind_salary | hh_income |
---|---|---|---|
1 | 1 | 200 | |
1 | 2 | 450 | |
1 | 3 | 0 | |
2 | 4 | 1232 | |
2 | 5 | 423 | |
Individuals with the same hh_id live in the same household, so they have the same household income. The variable hh_income should therefore equal the sum of the salaries of all persons with the same hh_id,
so my data would look like this:
hh_id | indl | ind_salary | hh_income |
---|---|---|---|
1 | 1 | 200 | 650 |
1 | 2 | 450 | 650 |
1 | 3 | 0 | 650 |
2 | 4 | 1232 | 1655 |
2 | 5 | 423 | 1655 |
Any ideas, please?
CodePudding user response:
You can use the base R function ave to compute the sum of ind_salary grouped by hh_id. It returns a vector of the same length as ind_salary, so the result can be assigned directly as a new column:
> df$hh_income <- ave(df$ind_salary, df$hh_id, FUN=sum)
> df
hh_id indl ind_salary hh_income
1 1 1 200 650
2 1 2 450 650
3 1 3 0 650
4 2 4 1232 1655
5 2 5 423 1655
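For anyone who wants to reproduce this, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the tables in the question, and the extra hh_income_na_rm column is only an assumption about how you might want to treat missing salaries; it is not part of the original answer.

```r
# Rebuild the sample data from the question (values copied from the tables).
df <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# ave() applies FUN within each hh_id group and returns one value per row,
# so the household total is repeated for every member of the household.
df$hh_income <- ave(df$ind_salary, df$hh_id, FUN = sum)

# Assumption: if ind_salary could contain NA, skip those values in the sum.
# (Hypothetical column name, added for illustration only.)
df$hh_income_na_rm <- ave(df$ind_salary, df$hh_id,
                          FUN = function(x) sum(x, na.rm = TRUE))

print(df)
```

With the sample data both new columns are identical, since there are no missing salaries.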
CodePudding user response:
Using dplyr:
library(dplyr)
data %>% group_by(hh_id) %>% mutate(hh_income = sum(ind_salary))
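A slightly fuller sketch of the same idea, assuming the dplyr package is installed and that the data frame is called data as in the line above. The sample values are copied from the question. Saving the result and calling ungroup() are my additions, so that later operations do not silently stay grouped by hh_id.

```r
library(dplyr)

# Sample data rebuilt from the question's table.
data <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# mutate() on a grouped data frame computes sum() per hh_id and
# repeats the result for every row in that group.
data <- data %>%
  group_by(hh_id) %>%
  mutate(hh_income = sum(ind_salary)) %>%
  ungroup()  # drop the grouping so later steps operate on all rows

print(data)
```

Note that the pipeline does not modify data in place; without the assignment the new column is only printed.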