I need to truncate many columns to the range -3.0 to 3.0. That is, any value greater than 3.0 should be recoded as 3.0 in a new variable, and any value less than -3.0 should be recoded as -3.0 in that same new variable.
Here is an example dataset:
library(tidyverse)
MyData <- tibble(a = c(2.3, 3.0, -1.5, 3.7, -4.7, 5.2),
                 b = c(3.6, 1.52, -5.4, 4.6, 1.5, 2.2),
                 c = c(1.0, -2.6, -1.2, 2.5, -4.0, 3.0))
I worked out how to do this by creating a new variable for each old variable, using mutate() and case_when(). However, I have too many variables to do it manually, and I was wondering how I could do it in a shorter and more elegant way. I would like to get the same output as this manual code produces:
MyData %>%
  mutate(Ta = case_when(a >= 3.0 ~ 3.0,
                        a <= -3.0 ~ -3.0,
                        TRUE ~ a),
         Tb = case_when(b >= 3.0 ~ 3.0,
                        b <= -3.0 ~ -3.0,
                        TRUE ~ b),
         Tc = case_when(c >= 3.0 ~ 3.0,
                        c <= -3.0 ~ -3.0,
                        TRUE ~ c))
# A tibble: 6 x 6
      a     b     c    Ta    Tb    Tc
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1   2.3  3.6    1     2.3  3      1
2   3    1.52  -2.6   3    1.52  -2.6
3  -1.5 -5.4   -1.2  -1.5 -3     -1.2
4   3.7  4.6    2.5   3    3      2.5
5  -4.7  1.5   -4    -3    1.5   -3
6   5.2  2.2    3     3    2.2    3
CodePudding user response:
You can define a function and then apply it to many columns using across().
pmin(3, pmax(x, -3)) is one way to constrain a vector (i.e. a column of a data frame) to the range -3 to 3. It takes the elementwise maximum of x and -3, and then the elementwise minimum of that result and 3.
The .names parameter of across() lets us specify that the results should be added as new columns named T followed by the original column name.
cap3 <- function(x) { pmin(3, pmax(x, -3)) }
MyData %>%
  mutate(across(a:c, cap3, .names = "T{.col}"))
  # mutate(across(1:3, cap3, .names = "T{.col}")) # Equiv. alternative
  # mutate(across(everything(), cap3, .names = "T{.col}")) # Equiv. alternative
Result
# A tibble: 6 x 6
      a     b     c    Ta    Tb    Tc
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1   2.3  3.6    1     2.3  3      1
2   3    1.52  -2.6   3    1.52  -2.6
3  -1.5 -5.4   -1.2  -1.5 -3     -1.2
4   3.7  4.6    2.5   3    3      2.5
5  -4.7  1.5   -4    -3    1.5   -3
6   5.2  2.2    3     3    2.2    3
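To see why the nested pmin()/pmax() call works, it can help to run the two steps separately on a single vector. This is only a minimal base R sketch: the helper name clamp and its lower/upper arguments are made up for illustration and do not come from any package.

```r
# Hypothetical helper: clamp x to [lower, upper] (names are illustrative only)
clamp <- function(x, lower = -3, upper = 3) {
  pmin(upper, pmax(x, lower))
}

x <- c(2.3, 3.0, -1.5, 3.7, -4.7, 5.2)  # column a from the question

step1 <- pmax(x, -3)     # raise anything below -3 up to -3
step2 <- pmin(3, step1)  # lower anything above 3 down to 3

print(step1)
print(step2)
print(clamp(x))               # same values as step2
print(clamp(c(NA, 10, -10)))  # NA is passed through, the rest is clamped
```

Because the bounds are arguments, a helper like this could presumably be reused with other limits inside across(), for example across(a:c, ~ clamp(.x, -2, 2), .names = "T{.col}").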
CodePudding user response:
Convert to a matrix, apply pmin and pmax, and append the result to MyData:
MyData %>%
  as.matrix %>%
  pmin(3) %>%
  pmax(-3) %>%
  cbind(MyData, T = .)
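giving (note that the new columns are named T.a, T.b, T.c rather than Ta, Tb, Tc, and that the result prints as a plain data frame):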
     a     b    c  T.a   T.b  T.c
1  2.3  3.60  1.0  2.3  3.00  1.0
2  3.0  1.52 -2.6  3.0  1.52 -2.6
3 -1.5 -5.40 -1.2 -1.5 -3.00 -1.2
4  3.7  4.60  2.5  3.0  3.00  2.5
5 -4.7  1.50 -4.0 -3.0  1.50 -3.0
6  5.2  2.20  3.0  3.0  2.20  3.0
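If the T.a-style names are not wanted, one option is to rename the matrix columns before binding. This is a sketch in base R only, not part of the answer above; it uses a plain data.frame in place of the tibble so that it runs without any packages, and the names capped and out are just illustrative.

```r
# Base R sketch; a plain data.frame stands in for the tibble from the question
MyData <- data.frame(a = c(2.3, 3.0, -1.5, 3.7, -4.7, 5.2),
                     b = c(3.6, 1.52, -5.4, 4.6, 1.5, 2.2),
                     c = c(1.0, -2.6, -1.2, 2.5, -4.0, 3.0))

# pmin/pmax keep the dim and dimnames of the matrix they are given
capped <- pmax(pmin(as.matrix(MyData), 3), -3)

# Rename a, b, c to Ta, Tb, Tc before binding
colnames(capped) <- paste0("T", colnames(capped))

out <- cbind(MyData, capped)
print(names(out))
print(out)
```

Binding an unnamed matrix this way keeps its own column names, which is why the T. prefix from the T = . argument no longer appears.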
CodePudding user response:
Write the code that you want to apply to each column as a function, and apply it with across().
library(dplyr)
func <- function(a) {
  case_when(a >= 3.0 ~ 3.0,
            a <= -3.0 ~ -3.0,
            TRUE ~ a)
}
# Select the columns explicitly; {.col} is the documented placeholder in .names
MyData %>%
  mutate(across(everything(), func, .names = 'T{.col}'))
#      a     b     c    Ta    Tb    Tc
#  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1   2.3  3.6    1     2.3  3      1
#2   3    1.52  -2.6   3    1.52  -2.6
#3  -1.5 -5.4   -1.2  -1.5 -3     -1.2
#4   3.7  4.6    2.5   3    3      2.5
#5  -4.7  1.5   -4    -3    1.5   -3
#6   5.2  2.2    3     3    2.2    3