R: How to swap values in a data frame with condition?

Time:08-19

My dataset has two IDs per row, one from a parent and one from a child, but I don't know which is which. I do have their ages, though. This is the table I am working with:

  ID1 ID2 sex1 sex2 age1 age2
1   8   9    1    2   44   11
2  17   7    1    1   56   76
3   1  44   NA   NA   16   55
4   3  13   NA   NA   NA   NA
5  55   6    2   NA   56   10
6   4  33    2   NA   45    9
7   2  66    1   NA   12   45
8  72  99   NA   NA   NA   NA
9  12  11    2    2   30   12

By using an if statement, I want to identify who's who according to their age. Here is the code I made but it is not working:

install.packages('seqinr')
library(seqinr)

for (i in 1:nrow(data)){
  if (data$age2[i]> data$age1[i]){
    swap(data$age1[i], data$age2[i])
  }
}

The error message:

Error in if (data$age2[i] > data$age1[i]) { : 
  missing value where TRUE/FALSE needed

I want to put the parent's age in age1 and the child's age in age2. Does anyone have a better idea of how to do this?

CodePudding user response:

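Welcome to SO!
You can do this without a for loop if you only need the higher value in age1 and the lower value in age2. Your loop fails because rows 4 and 8 have NA ages, so the comparison inside if() returns NA instead of TRUE or FALSE. pmax() and pmin() compare the two columns row by row and simply return NA for those rows: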

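# The results go in new columns (age_1, age_2) so they can be compared with the originals.
# To overwrite age1 and age2 instead, compute both results before assigning either one;
# otherwise the pmin() line would read the already-updated age1.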
df$age_1 <- pmax(df$age1, df$age2)
df$age_2 <- pmin(df$age1, df$age2)

With result:

  ID1 ID2 sex1 sex2 age1 age2 age_1 age_2
1   8   9    1    2   44   11    44    11
2  17   7    1    1   56   76    76    56
3   1  44   NA   NA   16   55    55    16
4   3  13   NA   NA   NA   NA    NA    NA
5  55   6    2   NA   56   10    56    10
6   4  33    2   NA   45    9    45     9
7   2  66    1   NA   12   45    45    12
8  72  99   NA   NA   NA   NA    NA    NA
9  12  11    2    2   30   12    30    12

With data:

df <- read.table(text = 'ID1 ID2 sex1 sex2 age1 age2
1   8   9    1    2   44   11
2  17   7    1    1   56   76
3   1  44   NA   NA   16   55
4   3  13   NA   NA   NA   NA
5  55   6    2   NA   56   10
6   4  33    2   NA   45   9
7   2  66    1   NA   12   45
8  72  99   NA   NA   NA   NA
9  12  11    2    2   30   12', header = TRUE)
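
If the goal is to identify who is who, it may be useful to move the ID and sex columns together with the ages. The question only asks about the ages, so treating the other columns this way is an assumption. A minimal base R sketch: which() keeps only the rows where the comparison is TRUE, so rows with NA ages are left untouched, and both sets of columns are assigned in a single step so no temporary copy is needed.

```r
# Rebuild the example data so the sketch runs on its own.
df <- read.table(text = 'ID1 ID2 sex1 sex2 age1 age2
1   8   9    1    2   44   11
2  17   7    1    1   56   76
3   1  44   NA   NA   16   55
4   3  13   NA   NA   NA   NA
5  55   6    2   NA   56   10
6   4  33    2   NA   45   9
7   2  66    1   NA   12   45
8  72  99   NA   NA   NA   NA
9  12  11    2    2   30   12', header = TRUE)

# Rows where person 2 is older than person 1.
# which() drops the NA comparisons (rows 4 and 8).
swap_rows <- which(df$age2 > df$age1)

# Column groups for each person (assumption: ID and sex should follow the age).
cols_1 <- c("ID1", "sex1", "age1")
cols_2 <- c("ID2", "sex2", "age2")

# The right-hand side is read in full before anything is written,
# and columns are matched by position, so this swaps the two groups.
df[swap_rows, c(cols_1, cols_2)] <- df[swap_rows, c(cols_2, cols_1)]

df
```

To swap only the ages, use "age1" and "age2" in place of the two column groups; the age columns then match the pmax()/pmin() result above.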

CodePudding user response:

library(tidyverse)

df <- read_table(
  "ID1 ID2 sex1 sex2 age1 age2
   8   9    1    2   44   11
  17   7    1    1   56   76
   1  44   NA   NA   16   55
   3  13   NA   NA   NA   NA
  55   6    2   NA   56   10
   4  33    2   NA   45   9
   2  66    1   NA   12   45
  72  99   NA   NA   NA   NA
  12  11    2    2   30   12"
)

Method 1:

df %>%
  transform(age1 = case_when(age1 > age2 ~ age1,   # keep age1 when it is the larger value
                             TRUE ~ age2),
            age2 = case_when(age1 > age2 ~ age2,   # keep age2 when it is the smaller value
                             TRUE ~ age1))

Method 2:

df %>%  
  transform(age1 = pmax(age1, age2), 
            age2 = pmin(age1, age2))

Result:

  ID1 ID2 sex1 sex2 age1 age2
1   8   9    1    2   44   11
2  17   7    1    1   76   56
3   1  44   NA   NA   55   16
4   3  13   NA   NA   NA   NA
5  55   6    2   NA   56   10
6   4  33    2   NA   45    9
7   2  66    1   NA   45   12
8  72  99   NA   NA   NA   NA
9  12  11    2    2   30   12
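
One detail worth checking before adapting the pmax()/pmin() approach: transform() evaluates all of its expressions against the original columns, which is why age1 and age2 can be overwritten in one call. Assigning the columns one after the other does not behave the same way, and dplyr's mutate() also lets later expressions see columns changed earlier in the same call. The sketch below uses a small hypothetical two-row data frame (not the question's data) to show the difference in base R.

```r
# Hypothetical example: one pair in the wrong order, one already in order.
x <- data.frame(age1 = c(56, 44), age2 = c(76, 11))

# transform(): both expressions see the original age1 and age2.
ok <- transform(x, age1 = pmax(age1, age2), age2 = pmin(age1, age2))

# Sequential assignment: the second line reads the already-updated age1,
# so age2 never changes and the first row ends up with 76 in both columns.
bad <- x
bad$age1 <- pmax(bad$age1, bad$age2)
bad$age2 <- pmin(bad$age1, bad$age2)

ok$age2   # 56 11
bad$age2  # 76 11
```

When the columns have to be updated step by step, storing one of the results in a temporary variable first avoids the problem.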