Creating a variable based on other factors using R

My data looks like this:

hh_id	indl	ind_salary
1	1	200
1	2	450
1	3	0
2	4	1232
2	5	423

Individuals with the same hh_id live in the same household, so they share the same household income. I want a variable hh_income that equals the sum of ind_salary over all persons with the same hh_id.

So my data would look like:

hh_id	indl	ind_salary	hh_income
1	1	200	650
1	2	450	650
1	3	0	650
2	4	1232	1655
2	5	423	1655

Any ideas, please?

CodePudding user response：

You can use the base R function ave to compute the sum of ind_salary within each hh_id group. It returns a vector of the same length as ind_salary, with each group's total repeated for every row of that group:

> df$hh_income <- ave(df$ind_salary, df$hh_id, FUN=sum)
> df
  hh_id indl ind_salary hh_income
1     1    1        200       650
2     1    2        450       650
3     1    3          0       650
4     2    4       1232      1655
5     2    5        423      1655
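
For a copy-and-paste check, here is a minimal sketch that rebuilds the sample data from the question and applies ave to it. The na.rm variant is an addition that was not part of the question: it assumes some salaries might be missing, in which case a plain sum would return NA for the whole household.

```r
# Sample data from the question; the name `df` follows the answer above.
df <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# ave() applies FUN within each hh_id group and repeats the result
# for every row of that group, so the output has nrow(df) elements.
df$hh_income <- ave(df$ind_salary, df$hh_id, FUN = sum)

# Assumption: if some salaries were NA, sum() would return NA for the
# whole household; na.rm = TRUE ignores the missing values instead.
df$hh_income_narm <- ave(df$ind_salary, df$hh_id,
                         FUN = function(x) sum(x, na.rm = TRUE))

print(df)
```

Note that FUN has to be passed by name, because every unnamed argument after the first is treated by ave as a grouping variable.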

CodePudding user response：

Using dplyr:

library(dplyr)
data <- data %>% group_by(hh_id) %>% mutate(hh_income = sum(ind_salary))
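
A self-contained sketch of the dplyr approach on the question's sample data, assuming the dplyr package is installed. The data frame name df and the trailing ungroup() are choices made here, not part of the answer above; ungroup() drops the grouping so that later operations act on the whole table again.

```r
library(dplyr)

# Sample data from the question (the name `df` is an assumption).
df <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

df <- df %>%
  group_by(hh_id) %>%                      # one group per household
  mutate(hh_income = sum(ind_salary)) %>%  # group total repeated on each row
  ungroup()                                # remove grouping for later steps

print(df)
```

Unlike summarise(), which would collapse the data to one row per household, mutate() keeps every individual row and attaches the household total to each.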