I am struggling to create a new variable named "edu_category" that indicates whether each person is in a couple with Female Hypergamy (wife's education level < husband's), Female Homogamy (wife's education level == husband's), or Female Hypogamy (wife's education level > husband's).
My data looks like this (Female == 1 indicates this person is female, 0 indicates male):
PersonID | Female | EducationLevel | SpouseID | SpouseEducation |
---|---|---|---|---|
101 | 1 | 3 | 102 | 4 |
102 | 0 | 4 | 101 | 3 |
103 | 1 | 2 | 104 | 2 |
104 | 0 | 2 | 103 | 2 |
105 | 0 | 5 | 106 | 6 |
106 | 1 | 6 | 105 | 5 |
I wish to create a new variable so that my data looks like this:
PersonID | Female | EducationLevel | SpouseID | SpouseEducation | edu_category |
---|---|---|---|---|---|
101 | 1 | 3 | 102 | 4 | FHypergamy |
102 | 0 | 4 | 101 | 3 | FHypergamy |
103 | 1 | 2 | 104 | 2 | FHomogamy |
104 | 0 | 2 | 103 | 2 | FHomogamy |
105 | 0 | 5 | 106 | 6 | FHypogamy |
106 | 1 | 6 | 105 | 5 | FHypogamy |
Here, let's look at the person with ID "105": his (because Female == 0) education level is 5 and his spouse's (person 106's) education level is 6, so this is Female Hypogamy, i.e. the wife's education > the husband's (we assume by default that everyone's spouse is of the opposite sex).
Now let's look at the person with ID "106": since she is person 105's spouse, we fill the variable "edu_category" with the same "FHypogamy". So essentially, we are treating each couple as one unit.
What I tried:
df2 <- df1 %>%
  mutate(edu_category = case_when(
    (SpouseEducation > EducationLevel) | (Female == 1) ~ 'FemaleHypergamy',
    (SpouseEducation == EducationLevel) | (Female == 1) ~ 'FemaleHomogamy',
    (SpouseEducation < EducationLevel) | (Female == 1) ~ 'FemaleHypogamy',
    (SpouseEducation > EducationLevel) | (Female == 0) ~ 'FemaleHypogamy',
    (SpouseEducation == EducationLevel) | (Female == 0) ~ 'FemaleHomogamy',
    (SpouseEducation < EducationLevel) | (Female == 0) ~ 'FemaleHypergamy'))
However, it's not giving me accurate results: the variable "edu_category" itself is created successfully, but the "FemaleHypergamy", "FemaleHomogamy", and "FemaleHypogamy" values do not reflect the actual situations.
What should I do? Thank you for the help!
CodePudding user response:
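One way is to classify only the husbands' rows (Female == 0) with case_when(), which leaves NA on the wives' rows, and then fill those NAs from the neighbouring row. Note that fill() works on row order, so this relies on each couple occupying adjacent rows, as in the sample data: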
library(dplyr)
library(tidyr)

df1 %>%
  # Classify from the husband's row only; wives' rows become NA for now
  mutate(edu_category = case_when(
    Female == 0 & EducationLevel <  SpouseEducation ~ "FHypogamy",
    Female == 0 & EducationLevel == SpouseEducation ~ "FHomogamy",
    Female == 0 & EducationLevel >  SpouseEducation ~ "FHypergamy",
    TRUE ~ NA_character_)) %>%
  # Copy the husband's category onto the adjacent wife's row
  fill(edu_category, .direction = "updown")

  PersonID Female EducationLevel SpouseID SpouseEducation edu_category
1      101      1              3      102               4   FHypergamy
2      102      0              4      101               3   FHypergamy
3      103      1              2      104               2    FHomogamy
4      104      0              2      103               2    FHomogamy
5      105      0              5      106               6    FHypogamy
6      106      1              6      105               5    FHypogamy
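Because fill() follows row order, the approach above can mislabel people if spouses are not stored next to each other. A minimal sketch of one way around that, assuming PersonID and SpouseID always point at each other: build a couple key and fill within each couple. The column couple_id is a hypothetical helper that is not part of the original data, and the rows are shuffled here on purpose to show that order no longer matters.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, deliberately shuffled so spouses are not adjacent
df1 <- data.frame(
  PersonID        = c(101, 103, 105, 102, 104, 106),
  Female          = c(1, 1, 0, 0, 0, 1),
  EducationLevel  = c(3, 2, 5, 4, 2, 6),
  SpouseID        = c(102, 104, 106, 101, 103, 105),
  SpouseEducation = c(4, 2, 6, 3, 2, 5)
)

result <- df1 %>%
  # couple_id (hypothetical helper): the smaller of the two IDs identifies the couple
  mutate(couple_id = pmin(PersonID, SpouseID)) %>%
  # Classify from the husband's row only; wives' rows are NA for now
  mutate(edu_category = case_when(
    Female == 0 & EducationLevel <  SpouseEducation ~ "FHypogamy",
    Female == 0 & EducationLevel == SpouseEducation ~ "FHomogamy",
    Female == 0 & EducationLevel >  SpouseEducation ~ "FHypergamy",
    TRUE ~ NA_character_
  )) %>%
  # Fill within each couple instead of across the whole table
  group_by(couple_id) %>%
  fill(edu_category, .direction = "updown") %>%
  ungroup() %>%
  select(-couple_id) %>%
  arrange(PersonID)

print(result)
```

Grouping before fill() keeps a category from leaking from one couple into the next, which is the failure mode of the ungrouped version when the data are sorted differently.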
CodePudding user response:
library(dplyr)

# Combine the education comparison and the sex test with & (AND), not | (OR)
df2 <- df1 %>%
  mutate(edu_category = case_when(
    (SpouseEducation > EducationLevel & Female == 1) ~ 'FemaleHypergamy',
    (SpouseEducation > EducationLevel & Female == 0) ~ 'FemaleHypogamy',
    (SpouseEducation < EducationLevel & Female == 1) ~ 'FemaleHypogamy',
    (SpouseEducation < EducationLevel & Female == 0) ~ 'FemaleHypergamy',
    # Equal education is homogamy regardless of sex
    SpouseEducation == EducationLevel ~ 'FemaleHomogamy'))
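For context on why the attempt in the question misfires: case_when() uses the first condition that is TRUE, and with | the first condition is TRUE for every row where Female == 1, so every woman is labelled 'FemaleHypergamy' regardless of education. A minimal sketch of a variant, using the sample data from the question, that avoids pairing each comparison with a sex test by first deriving the wife's and husband's education for every row. The columns wife_edu and husband_edu are hypothetical helpers, not part of the original data.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  PersonID        = c(101, 102, 103, 104, 105, 106),
  Female          = c(1, 0, 1, 0, 0, 1),
  EducationLevel  = c(3, 4, 2, 2, 5, 6),
  SpouseID        = c(102, 101, 104, 103, 106, 105),
  SpouseEducation = c(4, 3, 2, 2, 6, 5)
)

df2 <- df1 %>%
  mutate(
    # Hypothetical helper columns: the wife's and husband's education for this couple
    wife_edu    = if_else(Female == 1, EducationLevel, SpouseEducation),
    husband_edu = if_else(Female == 1, SpouseEducation, EducationLevel),
    # One comparison per category, identical for both spouses' rows
    edu_category = case_when(
      wife_edu <  husband_edu ~ "FemaleHypergamy",
      wife_edu == husband_edu ~ "FemaleHomogamy",
      wife_edu >  husband_edu ~ "FemaleHypogamy"
    )
  ) %>%
  select(-wife_edu, -husband_edu)

print(df2)
```

Both rows of a couple see the same wife_edu and husband_edu, so they receive the same label without any filling step and without depending on row order.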