Apply function to each row, with condition

Time:12-16

I don't understand why this piece of code does not work. I feel like I'm missing a very simple concept.

# if (!require(tidyverse)) install.packages("tidyverse")
library(tidyverse)

df <- tibble(rule = c("add", "add", "add", "sub"), var1 = c(1, 2, 3, 4), var2 = c(2, 3, 4, 5), result = 0)

simple_as_hello_world <- function(rule, var1, var2) {
  if (rule == "add") {result = var1 + var2}
  else {result = var1 - var2}
}

map(.x = df, .f = simple_as_hello_world)

CodePudding user response:

This works with dplyr, provided var1 and var2 are numeric rather than character: drop the quotation marks around the values, or convert them with as.numeric(). Because ifelse() is vectorized, no row-by-row mapping is needed.

library(dplyr)

df <- tibble(rule = c("add", "add", "add", "sub"), 
             var1 = c(1, 2, 3, 4), 
             var2 = c(2, 3, 4, 5))

df %>% 
  mutate(result = ifelse(rule == "add", var1 + var2, var1 - var2))
# A tibble: 4 x 4
  rule   var1  var2 result
  <chr> <dbl> <dbl>  <dbl>
1 add       1     2      3
2 add       2     3      5
3 add       3     4      7
4 sub       4     5     -1

CodePudding user response:

As noted, map() works over a single list or vector, one element at a time; for a data frame, that means one column at a time. Your function takes three arguments, so pmap() is the better fit for passing them. You can either turn df into a list to map over (passing a tibble works the same way; in both cases the output is a list containing only the results):

library(tidyverse)

df <-
  list(
    rule = c("add", "add", "add", "sub"),
    var1 = c("1", "2", "3", "4"),
    var2 = c("2", "3", "4", "5")
  )

simple_as_hello_world <- function(rule, var1, var2) {
  if (rule == "add") {
    result = as.integer(var1) + as.integer(var2)
  }
  else {
    result = as.integer(var1) - as.integer(var2)
  }
  result
}

pmap(df, .f = simple_as_hello_world)
#> [[1]]
#> [1] 3
#> 
#> [[2]]
#> [1] 5
#> 
#> [[3]]
#> [1] 7
#> 
#> [[4]]
#> [1] -1

Or call it inside mutate() and pass the relevant columns to your function:

df <-
  tibble(
    rule = c("add", "add", "add", "sub"),
    var1 = c("1", "2", "3", "4"),
    var2 = c("2", "3", "4", "5")
  )

df |>
  mutate(result = pmap_int(list(rule, var1, var2), simple_as_hello_world))
#> # A tibble: 4 × 4
#>   rule  var1  var2  result
#>   <chr> <chr> <chr>  <int>
#> 1 add   1     2          3
#> 2 add   2     3          5
#> 3 add   3     4          7
#> 4 sub   4     5         -1
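
In the mutate() call above the list passed to pmap_int() is unnamed, so its elements are matched to the function's arguments by position. A minimal sketch of the difference between positional and named matching follows; apply_rule is a hypothetical stand-in with the same logic as simple_as_hello_world, and the two-row vectors are made up for illustration.

```r
library(purrr)

# Illustrative inputs (not from the question): two rows, values stored as strings.
rule <- c("add", "sub")
var1 <- c("1", "4")
var2 <- c("2", "5")

# Hypothetical helper, same logic as simple_as_hello_world.
apply_rule <- function(rule, var1, var2) {
  a <- as.integer(var1)
  b <- as.integer(var2)
  if (rule == "add") a + b else a - b
}

# Unnamed list: elements are matched to the arguments by position.
by_position <- pmap_int(list(rule, var1, var2), apply_rule)

# Named list: elements are matched by name, so their order does not matter.
by_name <- pmap_int(list(var2 = var2, rule = rule, var1 = var1), apply_rule)
```

Naming the list elements is probably the safer habit when the function has several arguments of the same type, since a swapped order would otherwise go unnoticed.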

CodePudding user response:

pmap(df[,-4], simple_as_hello_world)

map() works on a single vector, whereas pmap() (parallel map) iterates over several vectors at once (here, the columns of the data frame), calling the function on x[1], y[1], z[1], then x[2], y[2], z[2], and so on.
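
To make the column-versus-row distinction concrete, here is a small sketch. It assumes numeric var1 and var2 and uses a plain data.frame; apply_rule is a hypothetical name for a function with the same logic as the one in the question.

```r
library(purrr)

# Example data mirroring the question, with numeric var1 and var2.
df <- data.frame(
  rule = c("add", "add", "add", "sub"),
  var1 = c(1, 2, 3, 4),
  var2 = c(2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# map() treats a data frame as a list of columns: .f is called once per column.
col_classes <- map_chr(df, class)

# pmap() takes one element from each column per call, i.e. one row at a time.
# Column names are matched to the function's argument names.
apply_rule <- function(rule, var1, var2) {
  if (rule == "add") var1 + var2 else var1 - var2
}
row_results <- pmap_dbl(df, apply_rule)
```

Here col_classes holds one entry per column, while row_results holds one entry per row, which is what the original map() call was presumably meant to produce.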

Also, your var1 and var2 are not numbers; they are strings. Convert them to numeric before trying to add them.
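
One possible way to do that conversion up front, sketched with dplyr's across(); the names df_num and df_out are introduced here for illustration only, and the data assumes the string-valued columns described above.

```r
library(dplyr)

# Data with var1 and var2 stored as character strings.
df <- tibble(
  rule = c("add", "add", "add", "sub"),
  var1 = c("1", "2", "3", "4"),
  var2 = c("2", "3", "4", "5")
)

# Convert both columns to numeric in one step.
df_num <- df |>
  mutate(across(c(var1, var2), as.numeric))

# With numeric columns, the arithmetic can be done in a vectorized way.
df_out <- df_num |>
  mutate(result = if_else(rule == "add", var1 + var2, var1 - var2))
```

Converting once at the start keeps as.integer() or as.numeric() calls out of the function itself.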
