Function containing ifelse not working in dplyr, works fine outside

Time:02-10

I'd like to write a function with two inputs (x and y) to create some mutated variables in a very large dataframe. Specifically, if x=y then return x, and if x!=y then draw 1 sample from a sequence of x to y.

The function works fine when I test it outside of my data frame, but throws an error when I try to use it within mutate. I've tried both the ifelse and if_else versions.

library(dplyr)

smx <- function(x, y) { # Function to allow sampling if length > 1
  if_else(x == y, x, sample(seq(x, y, 1), 1))
  # ifelse(x == y, x, sample(seq(x, y, 1), 1)) # Have also tried this with ifelse, doesn't work
}


smx(0,0) #This works
smx(0,5) #This works

#Create dummy data frame
df <- as.data.frame(cbind(c(rep(0,5)),c(seq(0,4,1))))
colnames(df) <- c("varA","varB")

df

#This doesn't work
df1 <- df %>% mutate(
  VarC = smx(varA,varB)
)

Ideally, my output should include a third column (VarC) in which the first row is equal to 0 (because varA=varB) and the remaining rows are a random sample between a sequence from varA to varB.

I have set up my data frame so that varA should never be larger than varB, but I'm not certain that holds for every row. I'd appreciate any help with a clean solution to this problem!

CodePudding user response:

The function is not working because it is not vectorized: seq() and sample() inside it expect single values, but mutate passes whole columns. You'll need to vectorize your function to make it work inside mutate.

You can do that as follows:

vectorized_fun <- Vectorize(your_fun)

Your code will look like this:

smx_v <- Vectorize(smx)

#This works
df1 <- df %>%
  mutate(VarC = smx_v(varA, varB))
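
For reference, Vectorize() is essentially a wrapper around mapply(), so the same element-by-element behaviour can be sketched in base R as below. The name smx_scalar and the plain if/else body are illustrative choices, not part of the original answer, and the sketch assumes x <= y as in the question.

```r
# Scalar helper (hypothetical name): expects x and y of length 1.
smx_scalar <- function(x, y) {
  if (x == y) {
    return(x)
  }
  candidates <- seq(x, y, 1)
  # Index explicitly so that a length-1 candidate vector is never
  # reinterpreted by sample() as 1:n.
  candidates[sample.int(length(candidates), 1)]
}

df <- data.frame(varA = rep(0, 5), varB = seq(0, 4, 1))

# mapply() calls smx_scalar once per pair of elements, which is
# roughly what Vectorize(smx) arranges behind the scenes.
df$VarC <- mapply(smx_scalar, df$varA, df$varB)
df
```

Because the function is still called once per row, this is convenient rather than fast on very large data frames.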

CodePudding user response:

The issue here comes from seq: it requires its from and to arguments to have length 1, but inside a dplyr verb such as mutate the function receives the whole varA and varB columns, so that requirement is not met here.
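
A quick way to see this outside of dplyr is to pass vectors to seq() directly, which mirrors what happens when mutate hands whole columns to smx. This is only a sketch: the tryCatch() wrapper is there so the script keeps running, and the variable names are illustrative.

```r
varA <- rep(0, 5)
varB <- seq(0, 4, 1)

# Scalar inputs are fine: seq() gets one start and one end value.
seq(0, 4, 1)

# Whole columns are not: seq() needs 'from' and 'to' of length 1,
# so this call raises an error, which tryCatch() captures as text.
result <- tryCatch(
  seq(varA, varB, 1),
  error = function(e) conditionMessage(e)
)
result
```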

Using rowwise() solves the problem:

smx <- function(x,y){
  ifelse(x==y,x,sample(seq(x,y,1),1))
}

df <- as.data.frame(cbind(c(rep(0,5)),c(seq(0,4,1))))
colnames(df) <- c("varA","varB")

df %>% 
  rowwise() %>% 
  mutate(VarC = smx(varA, varB))

Output:

# A tibble: 5 x 3
# Rowwise: 
   varA  varB  VarC
  <dbl> <dbl> <dbl>
1     0     0     0
2     0     1     1
3     0     2     1
4     0     3     2
5     0     4     0
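
Since the question mentions a very large data frame, it may be worth noting that rowwise() still evaluates the function one row at a time. If varA and varB are whole numbers with varA <= varB, as in the example data, a uniform draw from varA, varA + 1, ..., varB can be sketched without any per-row call. This is an alternative formulation under those assumptions, not the method from the answers above, and smx_vec is a hypothetical name.

```r
# Assumes x and y are integer-valued and x <= y element-wise.
smx_vec <- function(x, y) {
  n_values <- y - x + 1  # number of candidate values per row
  # runif() returns values strictly between 0 and 1, so the floor
  # term lies in 0, 1, ..., n_values - 1; when x == y it is always 0.
  x + floor(runif(length(x)) * n_values)
}

df <- data.frame(varA = rep(0, 5), varB = seq(0, 4, 1))
df$VarC <- smx_vec(df$varA, df$varB)
df
```

Inside a pipeline the same function could be called as mutate(VarC = smx_vec(varA, varB)), with no rowwise() step.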