I don't understand why this piece of code does not work. I feel like I'm missing a very simple concept.
#if(!require(tidyverse)) install.packages("tidyverse")
df <- tibble(rule = c("add", "add", "add", "sub"), var1 = c(1, 2, 3, 4), var2 = c(2, 3, 4, 5), result = 0)
simple_as_hello_world <- function(rule, var1, var2) {
if (rule == "add") {result = var1 + var2}
else {result = var1 - var2}
}
map(.x = df, .f = simple_as_hello_world)
CodePudding user response:
Here is a solution using the dplyr package, with the columns var1 and var2 stored as numeric rather than character. Just remove the quotation marks, or convert the columns with the as.numeric() function.
library(dplyr)
df <- tibble(rule = c("add", "add", "add", "sub"),
var1 = c(1, 2, 3, 4),
var2 = c(2, 3, 4, 5))
df %>%
  mutate(result = ifelse(rule == "add", var1 + var2, var1 - var2))
# A tibble: 4 x 4
  rule   var1  var2 result
  <chr> <dbl> <dbl>  <dbl>
1 add       1     2      3
2 add       2     3      5
3 add       3     4      7
4 sub       4     5     -1
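If the columns do arrive as character, the as.numeric() route mentioned above could look like the following minimal sketch. The names df_chr and df_fixed are hypothetical, and the character input is an assumption made only for illustration.

```r
library(dplyr)

# Hypothetical input: var1 and var2 stored as character (assumed for illustration)
df_chr <- tibble(rule = c("add", "add", "add", "sub"),
                 var1 = c("1", "2", "3", "4"),
                 var2 = c("2", "3", "4", "5"))

df_fixed <- df_chr %>%
  # Convert both columns to numeric before doing any arithmetic
  mutate(across(c(var1, var2), as.numeric)) %>%
  # ifelse() is vectorised, so it evaluates the rule for every row at once
  mutate(result = ifelse(rule == "add", var1 + var2, var1 - var2))

df_fixed$result
```

Converting once up front keeps the arithmetic itself free of type handling.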
CodePudding user response:
As noted, map works over a single list or vector; given a data frame, it iterates over the columns one at a time. Also, your function takes three arguments, so pmap is a better fit for passing them. You can either make your df a list to map over (passing a tibble works the same way; in both cases the output is a list containing only the results):
library(tidyverse)
df <-
list(
rule = c("add", "add", "add", "sub"),
var1 = c("1", "2", "3", "4"),
var2 = c("2", "3", "4", "5")
)
simple_as_hello_world <- function(rule, var1, var2) {
  if (rule == "add") {
    result = as.integer(var1) + as.integer(var2)
  }
  else {
    result = as.integer(var1) - as.integer(var2)
  }
  result
}
pmap(df, .f = simple_as_hello_world)
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 5
#>
#> [[3]]
#> [1] 7
#>
#> [[4]]
#> [1] -1
Or use pmap_int within mutate to pass the relevant columns to your function:
df <-
tibble(
rule = c("add", "add", "add", "sub"),
var1 = c("1", "2", "3", "4"),
var2 = c("2", "3", "4", "5")
)
df |>
mutate(result = pmap_int(list(rule, var1, var2), simple_as_hello_world))
#> # A tibble: 4 × 4
#>   rule  var1  var2  result
#>   <chr> <chr> <chr>  <int>
#> 1 add   1     2          3
#> 2 add   2     3          5
#> 3 add   3     4          7
#> 4 sub   4     5         -1
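One detail worth spelling out is how pmap pairs list elements with function arguments: named elements are matched by name, unnamed ones by position. The sketch below illustrates this with a deliberately shuffled list. The names shuffled, by_name and by_position are hypothetical, and the function is a compact restatement of the one above so the example runs on its own.

```r
library(purrr)

# Same logic as the function above, repeated so the sketch is self-contained
simple_as_hello_world <- function(rule, var1, var2) {
  if (rule == "add") {
    as.integer(var1) + as.integer(var2)
  } else {
    as.integer(var1) - as.integer(var2)
  }
}

# Hypothetical list with the elements deliberately out of order
shuffled <- list(
  var2 = c("2", "3", "4", "5"),
  rule = c("add", "add", "add", "sub"),
  var1 = c("1", "2", "3", "4")
)

# Named elements are matched to the arguments rule, var1, var2 by name
by_name <- pmap_int(shuffled, simple_as_hello_world)

# Unnamed elements are matched by position, so the order must follow the signature
by_position <- pmap_int(
  unname(shuffled[c("rule", "var1", "var2")]),
  simple_as_hello_world
)
```

This is why the unnamed list(rule, var1, var2) inside mutate has to follow the argument order of the function.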
CodePudding user response:
pmap(df[,-4], simple_as_hello_world)
map works on a single vector, whereas pmap (parallel map) iterates over several vectors at once (here, the columns of the data frame) and runs the function on x[1], y[1], z[1]; then x[2], y[2], z[2]; and so on.
Also, your "var1" and "var2" are not numbers, they are strings. Convert them to numeric before trying to add them.
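The difference between the two can be seen in a small sketch. The list cols and the result names are hypothetical and used only for illustration.

```r
library(purrr)

# Hypothetical two-column list used only for illustration
cols <- list(x = c(1, 2, 3), y = c(10, 20, 30))

# map_*() calls the function once per element of the list, i.e. once per column
per_column <- map_dbl(cols, sum)

# pmap_*() walks the columns in parallel: f(x[1], y[1]), then f(x[2], y[2]), ...
per_row <- pmap_dbl(cols, function(x, y) x + y)
```

Here per_column has one value per column, while per_row has one value per position across the columns.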