Fill in missing variables of family relationship matrix

Time:05-07

I have a dataframe of family relationships (parent, child, spouse, etc.) that is partially filled in, as in the example below. I am trying to use R to fill in the missing values (<NA>), but I'm not sure where to begin. I've tried using ifelse(), but the code becomes so unwieldy that I'm sure there must be a more efficient way.

Example dataframe

   family person  R01    R02          R03         R04         R05         R06
1       A      1    X Spouse        Child      Parent      Parent      Parent
2       A      2 <NA>      X Child-in-law      Parent      Parent      Parent
3       A      3 <NA>   <NA>            X GrandParent GrandParent GrandParent
4       A      4 <NA>   <NA>         <NA>           X     Sibling     Sibling
5       A      5 <NA>   <NA>         <NA>        <NA>           X     Sibling
6       A      6 <NA>   <NA>         <NA>        <NA>        <NA>           X
7       B      1    X Spouse       Parent      Parent        <NA>        <NA>
8       B      2 <NA>      X       Parent      Parent        <NA>        <NA>
9       B      3 <NA>   <NA>            X     Sibling        <NA>        <NA>
10      B      4 <NA>   <NA>         <NA>           X        <NA>        <NA>
11      C      1    X Parent         <NA>        <NA>        <NA>        <NA>
12      C      2 <NA>      X         <NA>        <NA>        <NA>        <NA>

where R01 is the relationship of person x to person 1. For example, in the second row of the dataframe above, R01 needs to be Spouse, since that is the counterpart of R02 in the first row. Relationships pair up as shown in the matrix below.

Relationship Matches

     [,1]            [,2]           
[1,] "Spouse"        "Spouse"       
[2,] "Parent"        "Child"        
[3,] "Child"         "Parent"       
[4,] "GrandParent"   "GrandChild"   
[5,] "GrandChild"    "GrandParent"  
[6,] "Parent-in-law" "Child-in-law" 
[7,] "Child-in-law"  "Parent-in-law"

Code to replicate Example

df1 <- data.frame(family = c(rep("A", 6), rep("B", 4), rep("C",2)),
                person = c(1:6, 1:4, 1:2),
                R01 = c("X", rep(NA,5),"X", rep(NA,3),"X",NA),
                R02 = c("Spouse", "X", rep(NA,4), "Spouse", "X", NA, NA, "Parent", "X"),
                R03 = c("Child", "Child-in-law", "X", NA, NA, NA, "Parent", "Parent", "X", rep(NA,3)),
                R04 = c(rep("Parent",2), "GrandParent", "X", NA, NA, rep("Parent",2), "Sibling", "X", NA, NA),
                R05 = c(rep("Parent",2), "GrandParent", "Sibling", "X", rep(NA,7)),
                R06 = c(rep("Parent",2), "GrandParent", rep("Sibling",2), "X", rep(NA,6)))

relationshipmatch <- matrix(c("Spouse", "Parent", "Child", "GrandParent", "GrandChild", "Parent-in-law", "Child-in-law", "Spouse", "Child", "Parent", "GrandChild", "GrandParent", "Child-in-law", "Parent-in-law"), ncol = 2)
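To make the matching rule concrete, here is a minimal sketch of looking up the counterpart of a relationship with a named vector. The pairs are the same as in relationshipmatch; the names pairs, inverse and invert are only illustrative, and treating Sibling as its own counterpart is an assumption, since it appears in the data but not in the match table.

```r
# Same pairs as relationshipmatch above, redefined so the sketch runs on its own
pairs <- matrix(c("Spouse", "Parent", "Child", "GrandParent", "GrandChild",
                  "Parent-in-law", "Child-in-law",
                  "Spouse", "Child", "Parent", "GrandChild", "GrandParent",
                  "Child-in-law", "Parent-in-law"), ncol = 2)

# Named vector: names are the recorded relationship, values are the counterpart
inverse <- setNames(pairs[, 2], pairs[, 1])

# Assumption: Sibling is its own counterpart (it is missing from the table)
inverse <- c(inverse, Sibling = "Sibling")

# Hypothetical helper: NA and unknown labels come back as NA
invert <- function(x) unname(inverse[x])

invert(c("Parent", "Child-in-law", "Sibling", NA))
```

A named vector keeps the lookup vectorised, so a whole column or matrix of labels can be translated in one call instead of a chain of ifelse().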

CodePudding user response:

This solution works with character data only. Since your real data is numeric (integer?), you may need to adapt the [-indexing in the function.

I'm assuming that the frame is always ordered row-wise by person, and column-wise in increasing order from R01 to R06.

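invert_relationships <- function(mat) {
  # Lookup table: names are the recorded relationship, values are its inverse
  rel <- c(Spouse = "Spouse", Child = "Parent", Parent = "Child", GrandChild = "GrandParent",
           GrandParent = "GrandChild", "Child-in-law" = "Parent-in-law",
           "Parent-in-law" = "Child-in-law", Sibling = "Sibling", X = "X")
  # Square block: the first nrow(mat) relationship columns, one per family member
  mat0 <- as.matrix(mat)[, seq_len(nrow(mat))]
  # Replace every known relationship with its inverse (NA stays NA)
  mat0[] <- rel[match(as.matrix(mat0), names(rel))]
  mat1 <- as.data.frame(mat)[, seq_len(nrow(mat0))]
  # Fill the lower triangle from the transposed, inverted upper triangle
  mat1[lower.tri(mat1)] <- t(mat0)[lower.tri(mat0)]
  # Re-attach the unused columns beyond the family size
  cbind(mat1, mat[, -seq_len(nrow(mat0))])
}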

library(dplyr)

df1 %>%
  group_by(family) %>%
  mutate(invert_relationships(select(cur_data(), -person))) %>%
  ungroup()
# # A tibble: 12 x 8
#    family person R01    R02           R03          R04         R05         R06        
#    <chr>   <int> <chr>  <chr>         <chr>        <chr>       <chr>       <chr>      
#  1 A           1 X      Spouse        Child        Parent      Parent      Parent     
#  2 A           2 Spouse X             Child-in-law Parent      Parent      Parent     
#  3 A           3 Parent Parent-in-law X            GrandParent GrandParent GrandParent
#  4 A           4 Child  Child         GrandChild   X           Sibling     Sibling    
#  5 A           5 Child  Child         GrandChild   Sibling     X           Sibling    
#  6 A           6 Child  Child         GrandChild   Sibling     Sibling     X          
#  7 B           1 X      Spouse        Parent       Parent      NA          NA         
#  8 B           2 Spouse X             Parent       Parent      NA          NA         
#  9 B           3 Child  Child         X            Sibling     NA          NA         
# 10 B           4 Child  Child         Sibling      X           NA          NA         
# 11 C           1 X      Parent        NA           NA          NA          NA         
# 12 C           2 Child  X             NA           NA          NA          NA         
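The same transpose-and-invert idea can be tried on a single family in base R, without dplyr. This is only a sketch: the matrix m reproduces the R01 to R04 columns of family B from the example, and rel is cut down to the labels that occur in that family.

```r
# Family B from the example, relationship columns only (upper triangle filled)
m <- matrix(c("X", "Spouse", "Parent", "Parent",
              NA,  "X",      "Parent", "Parent",
              NA,  NA,       "X",      "Sibling",
              NA,  NA,       NA,       "X"),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, sprintf("R%02d", 1:4)))

# Inverse lookup, restricted to the labels used in this family
rel <- c(Spouse = "Spouse", Parent = "Child", Child = "Parent",
         Sibling = "Sibling", X = "X")

# Invert every known cell; NA cells stay NA
inv <- m
inv[] <- rel[match(m, names(rel))]

# Cell (i, j) below the diagonal receives the inverse of cell (j, i)
m[lower.tri(m)] <- t(inv)[lower.tri(m)]
m
```

Row 3 ends up as Child, Child, X, Sibling, which agrees with row 9 of the output above.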

CodePudding user response:

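You can make the relationship matrix symmetric within each family and, at the same time, swap Child and Parent in every relationship that contains them. Here stringr::str_replace_all does the swapping, using Temp as a placeholder so that the two labels can be exchanged. Compound labels such as GrandParent and Child-in-law are handled too, because the replacement acts on substrings.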

library(dplyr)

df1 %>%
  group_by(family) %>%
  group_modify(~ {
    mat <- as.matrix(select(.x, starts_with("R") & !where(~all(is.na(.x)))))
    mat[lower.tri(mat)] <- stringr::str_replace_all(
      t(mat)[lower.tri(mat)],
      c("Parent" = "Temp", "Child" = "Parent", "Temp" = "Child")
    )
    cbind(select(.x, !starts_with("R")), mat)
  }) %>%
  ungroup()

# A tibble: 12 × 8
   family person R01    R02           R03          R04         R05         R06        
   <chr>   <int> <chr>  <chr>         <chr>        <chr>       <chr>       <chr>      
 1 A           1 X      Spouse        Child        Parent      Parent      Parent     
 2 A           2 Spouse X             Child-in-law Parent      Parent      Parent     
 3 A           3 Parent Parent-in-law X            GrandParent GrandParent GrandParent
 4 A           4 Child  Child         GrandChild   X           Sibling     Sibling    
 5 A           5 Child  Child         GrandChild   Sibling     X           Sibling    
 6 A           6 Child  Child         GrandChild   Sibling     Sibling     X          
 7 B           1 X      Spouse        Parent       Parent      NA          NA         
 8 B           2 Spouse X             Parent       Parent      NA          NA         
 9 B           3 Child  Child         X            Sibling     NA          NA         
10 B           4 Child  Child         Sibling      X           NA          NA         
11 C           1 X      Parent        NA           NA          NA          NA         
12 C           2 Child  X             NA           NA          NA          NA             
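The placeholder swap can also be written in base R, which makes the order of the three replacements explicit. This is a sketch that substitutes gsub for stringr::str_replace_all, and swap_parent_child is a hypothetical helper name. It assumes that no label already contains the text "Temp".

```r
# Swap "Parent" and "Child" inside each label via a temporary placeholder
swap_parent_child <- function(x) {
  x <- gsub("Parent", "Temp", x, fixed = TRUE)   # park Parent out of the way
  x <- gsub("Child", "Parent", x, fixed = TRUE)  # Child becomes Parent
  gsub("Temp", "Child", x, fixed = TRUE)         # parked Parent becomes Child
}

swap_parent_child(c("Parent", "GrandChild", "Child-in-law", "Sibling", "Spouse", NA))
```

Labels without Parent or Child in them (Sibling, Spouse, X) and NA values pass through unchanged, which is why no separate entries are needed for them.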