R Dplyr mutate new column by calculating from other columns with conditionally replaced values

Time:07-08

Given the following data:

df <- data.frame(x1 = c(1,2,3,4,5),
                 x2 = c(5,4,3,2,1))

I would like to mutate a new column (x3) that is the sum of the x1 and x2 columns. Before summing, however, I would like to conditionally replace the values as follows:

1 = 0    
2 = 1   
3 = 2   
4 = 3   
5 = 3  

This would result in the following mutated column:

x3 
3   
4   
4   
4   
3  
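To make the arithmetic behind that column explicit, here is a minimal sketch in base R. The names lookup, r1, r2 and expected_x3 are illustrative only and do not come from the question.

```r
# Data from the question
df <- data.frame(x1 = c(1, 2, 3, 4, 5),
                 x2 = c(5, 4, 3, 2, 1))

# Replacement table: names are the original values, elements the replacements
lookup <- c("1" = 0, "2" = 1, "3" = 2, "4" = 3, "5" = 3)

# Look up each value by name, then add the two replaced vectors
r1 <- unname(lookup[as.character(df$x1)])   # 0 1 2 3 3
r2 <- unname(lookup[as.character(df$x2)])   # 3 3 2 1 0
expected_x3 <- r1 + r2                      # 3 4 4 4 3
```

For example, the first row has x1 = 1 (replaced by 0) and x2 = 5 (replaced by 3), which gives 3.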

My solution so far has been to mutate new x1 and x2 columns containing the replacement values, sum those, and then drop the temporary columns, but this seems very inefficient.

CodePudding user response:

A possible solution, in base R:

z <- c(0, 1:3, 3)

df$x3 <- z[df$x1] + z[df$x2]
df

#>   x1 x2 x3
#> 1  1  5  3
#> 2  2  4  4
#> 3  3  3  4
#> 4  4  2  4
#> 5  5  1  3

Yet another base R solution:

z <- c(0, 1:3, 3)

df$x3 <- rowSums(sapply(df, \(x) z[x]))
df

#>   x1 x2 x3
#> 1  1  5  3
#> 2  2  4  4
#> 3  3  3  4
#> 4  4  2  4
#> 5  5  1  3

Or with tidyverse:

library(tidyverse)

z <- c(0, 1:3, 3)

df %>%
  mutate(x3 = rowSums(map_dfc(., ~ z[.x])))

#>   x1 x2 x3
#> 1  1  5  3
#> 2  2  4  4
#> 3  3  3  4
#> 4  4  2  4
#> 5  5  1  3

CodePudding user response:

In base R, we can subtract 1 from the data, use pmin to cap any value greater than 3 at 3, and then take the rowSums:

df$x3 <- rowSums(pmin(as.matrix(df-1), 3))

-output

> df
  x1 x2 x3
1  1  5  3
2  2  4  4
3  3  3  4
4  4  2  4
5  5  1  3
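The one-liner above can be unpacked into its individual steps. This is only a sketch; the intermediate names shifted and capped are assumptions introduced here for illustration.

```r
# Data from the question
df <- data.frame(x1 = c(1, 2, 3, 4, 5),
                 x2 = c(5, 4, 3, 2, 1))

shifted <- as.matrix(df - 1)   # values 1..5 become 0..4
capped  <- pmin(shifted, 3)    # 4 is capped at 3; the matrix shape is kept
df$x3   <- rowSums(capped)     # add the recoded x1 and x2 row by row
df$x3
```

This works here because the requested mapping happens to be "subtract 1, with a ceiling of 3"; an arbitrary mapping would need a lookup instead.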

CodePudding user response:

Here is a dplyr solution:

library(dplyr)

df %>% 
  mutate(across(everything(), ~ ifelse(. > 3, 3, . - 1), .names = "new_{.col}")) %>% 
  mutate(x3 = rowSums(.[3:4]))

  x1 x2 new_x1 new_x2 x3
1  1  5      0      3  3
2  2  4      1      3  4
3  3  3      2      2  4
4  4  2      3      1  4
5  5  1      3      0  3
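
If the helper columns new_x1 and new_x2 are not wanted in the result, one possible variation is to sum the output of across() directly. This is a sketch that assumes dplyr 1.0.0 or later (where across() is available); the name result is introduced here for illustration.

```r
library(dplyr)

# Data from the question
df <- data.frame(x1 = c(1, 2, 3, 4, 5),
                 x2 = c(5, 4, 3, 2, 1))

# across() returns the recoded columns as a data frame; rowSums() adds them
# row by row, so no temporary columns are left behind in the result
result <- df %>%
  mutate(x3 = rowSums(across(c(x1, x2), ~ ifelse(.x > 3, 3, .x - 1))))

result$x3
```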