Home > Blockchain >  Conditionally subtract cells in data frame
Conditionally subtract cells in data frame

Time:05-26

Suppose I have the following data frame

df1 <- data.frame(cbind("Method" = c("A", "A", "A", "A", 
                                    "B", "B", "B", "B", 
                                    "C", "C", "C", "C"),
                       "Sub" = c(rep(1:4, 2), c(1, 2, 4, 3)),
                       "Value1" = c(1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4),
                       "Value2" = c(-1, -2, -3, -4, 0, 0, 0, 0, -1, -2, -3, -4),
                       "Value3" = 1:12))
    Method Sub Value1 Value2 Value3
1       A   1      1     -1      1
2       A   2      2     -2      2
3       A   3      3     -3      3
4       A   4      4     -4      4
5       B   1      0      0      5
6       B   2      0      0      6
7       B   3      0      0      7
8       B   4      0      0      8
9       C   1      1     -1      9
10      C   2      2     -2     10
11      C   4      3     -3     11
12      C   3      4     -4     12

I want to mutate Value1and Value2 by subtracting the values that are observed for Method == A. In this case, the desired output would be

  Method Sub Value1 Value2 Value3
1       A   1      0      0      1
2       A   2      0      0      2
3       A   3      0      0      3
4       A   4      0      0      4
5       B   1     -1      1      5
6       B   2     -2      2      6
7       B   3     -3      3      7
8       B   4     -4      4      8
9       C   1      0      0      9
10      C   2      0      0     10
11      C   4     -1      1     11
12      C   3      1     -1     12

Basically, it looks like subtracting df1[1:4, 3:4] from df1[5:8, 3:4] and df1[9:12, 3:4], with the exception that the rows have to be matched for Sub (see order of Sub in Method == C). Any help on how to efficiently achieve this?

CodePudding user response:

Using dplyr you can do

library(dplyr)
df1 %>% 
  group_by(Sub) %>% 
  mutate(across(Value1:Value2, ~.x-.x[Method=="A"]))

#    Method   Sub Value1 Value2 Value3
#    <chr>  <dbl>  <dbl>  <dbl>  <int>
#  1 A          1      0      0      1
#  2 A          2      0      0      2
#  3 A          3      0      0      3
#  4 A          4      0      0      4
#  5 B          1     -1      1      5
#  6 B          2     -2      2      6
#  7 B          3     -3      3      7
#  8 B          4     -4      4      8
#  9 C          1      0      0      9
# 10 C          2      0      0     10
# 11 C          4     -1      1     11
# 12 C          3      1     -1     12

This creates a group for each Sub, and then you can extract the value for the Method=="A" value in each Sub group.

CodePudding user response:

You have to use ifelse function. It works just like if function in Excel

library(dplyr)
df2 <- df1 %>%
  mutate(Value4 = ifelse(Method == "A", Value1 - Value2 , NA))

as a side note, you could build your DataFrame easier:

df1 <- data.frame(
  Method = c("A", "A", "A", "A", 
              "B", "B", "B", "B", 
              "C", "C", "C", "C"),
  Sub = c(rep(1:4, 2), c(1, 2, 4, 3)),
  Value1 = c(1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4),
  Value2 = c(-1, -2, -3, -4, 0, 0, 0, 0, -1, -2, -3, -4),
  Value3 = c(1:12)
             )
  • Related