Creating a variable based on other factors using R

Time:10-14

My data looks like this:

hh_id indl ind_salary hh_income
    1    1        200
    1    2        450
    1    3          0
    2    4       1232
    2    5        423

Individuals with the same hh_id live in the same household, so they have the same household income. The variable hh_income should therefore equal the sum of the salaries of all persons with the same hh_id.

So my data should look like this:

hh_id indl ind_salary hh_income
    1    1        200       650
    1    2        450       650
    1    3          0       650
    2    4       1232      1655
    2    5        423      1655

Any ideas, please?

CodePudding user response:

You can use the base R function ave to compute the sum of ind_salary within each hh_id. It returns a vector of the same length as ind_salary, so the result can be assigned directly as a new column:

> df$hh_income <- ave(df$ind_salary, df$hh_id, FUN=sum)
> df
  hh_id indl ind_salary hh_income
1     1    1        200       650
2     1    2        450       650
3     1    3          0       650
4     2    4       1232      1655
5     2    5        423      1655
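
To try this from scratch, here is a minimal sketch that rebuilds the sample data from the question. The data frame name df matches this answer, but building it with data.frame() is an assumption, since the question does not show how the data was loaded.

```r
# Rebuild the sample data from the question (construction is assumed;
# hh_income is left out because it is the column being created).
df <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# ave() applies FUN within each hh_id group and returns a vector
# aligned row-by-row with the input, so it can be assigned as a column.
df$hh_income <- ave(df$ind_salary, df$hh_id, FUN = sum)

print(df)
```

If ind_salary can contain NA, sum() returns NA for the whole household; passing FUN = function(x) sum(x, na.rm = TRUE) instead is one way to ignore missing salaries.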

CodePudding user response:

Using dplyr:

library(dplyr)  # provides %>%, group_by() and mutate()
data %>% group_by(hh_id) %>% mutate(hh_income = sum(ind_salary))
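
Note that the pipeline only prints its result unless it is assigned. A minimal sketch of storing the new column follows, assuming the data frame is called data as in this answer and that the dplyr package is installed. The ungroup() call is an optional addition that is not part of the original answer; it keeps later operations from silently staying grouped by hh_id.

```r
library(dplyr)

# Sample data from the question (construction is assumed).
data <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# Sum ind_salary within each household and keep every original row.
data <- data %>%
  group_by(hh_id) %>%
  mutate(hh_income = sum(ind_salary)) %>%
  ungroup()  # drop the grouping (optional addition)

print(data)
```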