I am trying to create a new variable that identifies 'single parents' in a household, based on a group identifier. If a group contains a 'Child' but does not contain both a 'Head' and a 'Spouse', I would like the variable to take the value 1 for every row in that group, and 0 otherwise. I have tried using dplyr but cannot arrive at a solution.
library(dplyr)
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3)
my_data <- as.data.frame(cbind(group, relation))
# my attempt
my_data %>%
  group_by(group) %>%
  mutate(single_parent = case_when(relation %in% "Child" & !(relation %in% "Head" & relation %in% "Spouse") ~ 1))
# desired output
my_data$single_parent<-c(0,0,0,0,0,1,1)
Thank you for your help.
CodePudding user response:
We could do
library(dplyr)
my_data <- my_data %>%
  group_by(group) %>%
  # 1 when the group has a Child but not both a Head and a Spouse
  mutate(single_parent = as.integer('Child' %in% relation &
           !all(c("Head", "Spouse") %in% relation))) %>%
  ungroup()
Output
my_data
# A tibble: 7 × 3
  group relation single_parent
  <dbl> <chr>            <int>
1     1 Head                 0
2     1 Spouse               0
3     1 Child                0
4     2 Head                 0
5     2 Spouse               0
6     3 Head                 1
7     3 Child                1
Data
my_data <- data.frame(group, relation)
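To see why the condition above gives the desired result, it may help to evaluate its two parts once per group. The following is only a minimal base R sketch on the question's data; the names has_child, has_both and flag are hypothetical and not part of the answer.

```r
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3)

# One logical per group: does the group contain a Child?
has_child <- tapply(relation, group, function(r) "Child" %in% r)

# One logical per group: does the group contain both a Head and a Spouse?
has_both <- tapply(relation, group, function(r) all(c("Head", "Spouse") %in% r))

# A group is flagged when it has a Child but not both parents
flag <- has_child & !has_both
print(flag)  # only group 3 is TRUE
```

Group 1 has a child and both parents, group 2 has no child, so only group 3 is flagged.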
CodePudding user response:
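Here is another tidyverse option:
library(tidyverse)
my_data %>%
  group_by(group) %>%
  # flag the Child row in groups of exactly two members (NA elsewhere);
  # this relies on a single-parent group having two rows, as in the example
  mutate(single_parent = ifelse(relation == "Child" & n() == 2, 1, NA)) %>%
  # spread the flag to the other rows of the same group
  fill(single_parent, .direction = "downup") %>%
  mutate(single_parent = replace_na(single_parent, 0)) %>%
  ungroup()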
Or another option, combining base R and tidyverse, using table:
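# table(my_data) counts relations per group; its columns are in alphabetical order (Child, Head, Spouse)
data.frame(group = unique(my_data$group),
           single_parent = (table(my_data)[, 1] == 1 & rowSums(table(my_data)[, -1]) == 1)) %>%
  left_join(my_data, ., by = "group")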
Output
  group relation single_parent
  <chr> <chr>            <dbl>
1 1     Head                 0
2 1     Spouse               0
3 1     Child                0
4 2     Head                 0
5 2     Spouse               0
6 3     Head                 1
7 3     Child                1
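Both options in this answer depend on counts that happen to match the example (a group of exactly two rows, or exactly one child). As a hedged sketch only, assuming a single parent may have several children, the membership test from the question can be applied per group in base R with ave(). The data frame my_data2 (with an extra child added to group 3) and the helper is_single_parent are hypothetical names introduced here for illustration.

```r
# Example data with a second child in group 3 (an assumption, not from the question)
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3, 3)
my_data2 <- data.frame(group, relation)

# Hypothetical helper: 1 for every member of a group that has a Child
# but not both a Head and a Spouse, otherwise 0
is_single_parent <- function(r) {
  flag <- "Child" %in% r && !all(c("Head", "Spouse") %in% r)
  rep(as.integer(flag), length(r))
}

# ave() applies the function to the row indices of each group;
# using integer indices keeps the result integer rather than character
my_data2$single_parent <- ave(
  seq_len(nrow(my_data2)), my_data2$group,
  FUN = function(i) is_single_parent(my_data2$relation[i])
)

print(my_data2)
```

Here all three rows of group 3 get 1, regardless of how many children the group contains.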