Add values to a new column based on math calculations on three columns R

Time:08-08

I have a data frame as the structure below:

head(test)

   geneA  geneB start end position
1  Ypc1 Malat1    34  59       36
2  Ypc1 Malat1    35  60       26
3  Ypc1 Malat1    34  59       60

I want to add a new column called distance, computed from conditional math operations on three columns: start, end and position. I used the if statements below, but I constantly get 0 for the whole distance column. Here are the code and the output I get after running it:

if (test$position < test$start) {
  test$distance <- test$start - test$position
} else if (test$position >= test$start & test$position <= test$end) {
  test$distance <- 0
} else if (test$position > test$end) {
  test$distance <- test$end - test$position
}

head(test)
   geneA  geneB start end position distance
1  Ypc1 Malat1    34  59       36        0
2  Ypc1 Malat1    35  60       26        0
3  Ypc1 Malat1    34  59       60        0

The desired output should be:

   geneA  geneB start end position distance
1  Ypc1 Malat1    34  59       36        0
2  Ypc1 Malat1    35  60       26        9
3  Ypc1 Malat1    34  59       60       -1

How can I do this?

Thank you in advance.

CodePudding user response:

When testing a condition along a vector, you should use ifelse, which is vectorized. I corrected your code below:

test <- data.frame(geneA = c("Ypc1"), geneB = c("Malat1"),
                   start = c(34, 35, 34),
                   end = c(59, 60, 59),
                   position = c(36, 26, 60))

test$distance <- ifelse(
    test$position < test$start,
    test$start - test$position, 
    ifelse(
        test$position > test$end,
        test$end - test$position,
        0
    ))
test
#   geneA  geneB start end position distance
# 1  Ypc1 Malat1    34  59       36        0
# 2  Ypc1 Malat1    35  60       26        9
# 3  Ypc1 Malat1    34  59       60       -1

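Your code doesn't work because if is not vectorized: it only looks at the first element of the condition. For the first row the second branch is true, so the whole distance column is set to 0 in one assignment. (Since R 4.2.0, a condition of length greater than one is an error rather than a warning.)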
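
To see this on the example data, it can help to print each condition on its own. This is only a small sketch, and the names below_start, in_range and above_end are made up for illustration:

```r
test <- data.frame(geneA = "Ypc1", geneB = "Malat1",
                   start = c(34, 35, 34),
                   end = c(59, 60, 59),
                   position = c(36, 26, 60))

# Each comparison returns one logical value per row, not a single TRUE/FALSE.
below_start <- test$position < test$start
in_range    <- test$position >= test$start & test$position <= test$end
above_end   <- test$position > test$end

below_start  # FALSE  TRUE FALSE
in_range     #  TRUE FALSE FALSE
above_end    # FALSE FALSE  TRUE
```

Only the first element of each vector (FALSE, TRUE, FALSE) would decide which branch of the if chain runs, which is why the `distance <- 0` assignment ends up applied to every row.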

However, the nested ifelse is not very readable. I'll look for a shorter way to compute this!
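
One possible shorter form, offered as a sketch and not necessarily what the answerer had in mind, uses the base R functions pmax and pmin. It assumes start <= end in every row; if that does not hold, the two terms can both be non-zero and the result would differ from the nested ifelse.

```r
test <- data.frame(geneA = "Ypc1", geneB = "Malat1",
                   start = c(34, 35, 34),
                   end = c(59, 60, 59),
                   position = c(36, 26, 60))

# Assumption: start <= end for every row.
# pmax(start - position, 0) is the positive gap when position lies before start, else 0.
# pmin(end - position, 0) is the negative gap when position lies after end, else 0.
# At most one of the two terms is non-zero, so their sum is the distance.
test$distance <- pmax(test$start - test$position, 0) +
                 pmin(test$end - test$position, 0)

test$distance  # 0, 9, -1
```

Both pmax and pmin work element-wise, so no explicit condition is needed, at the cost of the logic being a little less obvious than the ifelse version.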
