Weird behavior with `mapply` [R]

Time:03-16

I think I'm misusing mapply somehow, but I cannot see how. I'm trying to categorize multiple groups according to a specific cutoff for each one...

> dt <- source("https://pastebin.com/raw/pX0XVBSB")$value
> dt$aux <- mapply(x = unique(dt$group), y = c(rep(0.02, 2), rep(0.2, 4)), 
                 function(x,y){
                   ifelse(dt$var[dt$group == x] < x, 0, 1)
                   }) %>% unlist
> head(dt[is.na(dt$var),])
#     group var aux
# 52     g3  NA   0
# 66     g4  NA   0
# 287    g3  NA   0
# 336    g3  NA   0
# 337    g3  NA   0
# 363    g6  NA   0

... but something is going wrong with the NAs: I expected rows where var is NA to get aux = NA as well (the rest of the values are correctly categorized).

Any idea about what I'm doing wrong, please?

EDIT

I'd expect var to be categorized as follows: 0 if it is below the group's cutoff and 1 if it is equal to or higher than it.

#   group   var aux
# 1    g1 0.010   0 #below cutoff for g1, 0.02
# 2    g1 0.210   1 #above cutoff for g1, 0.02
# 3    g1 0.021   1
# 4    g1 0.021   1
# 5    g3 0.001   0 #below cutoff for g3, 0.2
# 6    g3 3.100   1 #above cutoff for g3, 0.2

CodePudding user response:

Is this what you are looking for? It would be easier to just set up two conditional statements:

library(tidyverse)

dt <- source("https://pastebin.com/raw/pX0XVBSB")$value |>
  as_tibble()


dt |>
  mutate(aux = case_when(
    group %in% c("g1", "g2") ~ ifelse(var < 0.02, 0, 1),
    TRUE ~ ifelse(var < 0.2, 0, 1)
  ))
#> # A tibble: 512 x 3
#>    group    var   aux
#>    <chr>  <dbl> <dbl>
#>  1 g1    0.01       0
#>  2 g1    0.01       0
#>  3 g1    0          0
#>  4 g1    0          0
#>  5 g1    0.021      1
#>  6 g1    0.021      1
#>  7 g1    0.0008     0
#>  8 g1    0.0008     0
#>  9 g1    0.0014     0
#> 10 g1    0.0014     0
#> # ... with 502 more rows

EDIT

Here is a base R way:

dt$aux <- ifelse(dt$group %in% c("g1", "g2"), 
                 ifelse(dt$var < 0.02, 0, 1),
                 ifelse(dt$var < 0.2, 0, 1))

head(dt)
#> # A tibble: 6 x 3
#>   group   var   aux
#>   <chr> <dbl> <dbl>
#> 1 g1    0.01      0
#> 2 g1    0.01      0
#> 3 g1    0         0
#> 4 g1    0         0
#> 5 g1    0.021     1
#> 6 g1    0.021     1
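
The nested `ifelse` above hard-codes which groups share a cutoff. One possible generalization, sketched here on made-up toy data rather than the pastebin dataset, is a named lookup vector with one cutoff per group. The name `cutoffs` and the group names g1 to g6 are assumptions for illustration. Because the lookup is done row by row, the result stays in the data's own row order and `NA` in `var` stays `NA`.

```r
# Toy data standing in for the real dataset (values are made up).
dt <- data.frame(
  group = c("g1", "g3", "g1", "g3", "g2", "g4"),
  var   = c(0.010, 0.001, 0.210, NA, 0.020, 3.100)
)

# Hypothetical lookup vector: one cutoff per group, keyed by group name.
cutoffs <- c(g1 = 0.02, g2 = 0.02, g3 = 0.2, g4 = 0.2, g5 = 0.2, g6 = 0.2)

# Indexing by name gives one cutoff per row, in the rows' existing order.
# as.character() guards against group being stored as a factor.
row_cutoff <- unname(cutoffs[as.character(dt$group)])

# 0 below the cutoff, 1 at or above it; an NA comparison yields NA.
dt$aux <- ifelse(dt$var < row_cutoff, 0, 1)
dt
```

No sorting or re-sorting is needed with this approach, since nothing is ever split by group.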

EDIT 2

library(tidyverse)

vals <- map2(unique(dt$group), 
             c(rep(0.02, 2), rep(0.2, 4)),
             \(x, y) ifelse(dt$var[dt$group == x] < y, 0, 1)) |>
  unlist()

dt|>
  arrange(group) |>
  mutate(aux = vals)
#> # A tibble: 512 x 3
#>    group    var   aux
#>    <chr>  <dbl> <dbl>
#>  1 g1    0.01       0
#>  2 g1    0.01       0
#>  3 g1    0          0
#>  4 g1    0          0
#>  5 g1    0.021      1
#>  6 g1    0.021      1
#>  7 g1    0.0008     0
#>  8 g1    0.0008     0
#>  9 g1    0.0014     0
#> 10 g1    0.0014     0
#> # ... with 502 more rows

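The problem with this method is that the values come back grouped by `group`, not in the original row order of your dataset, so you need to sort the data by group before you add the new variable to it. The same mismatch is probably why the `NA` rows in your `mapply` attempt ended up with 0: each value was assigned to a different row than the one it was computed from.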
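
The ordering problem can be reproduced on a small example. This is a minimal sketch on made-up data with two groups. The names `cutoffs`, `by_group` and `aux_misaligned` are hypothetical, and base R `Map` stands in for `mapply`/`map2`. It also shows one way to avoid sorting: write each group's result back into that group's own rows.

```r
# Toy data, deliberately not sorted by group (values are made up).
dt <- data.frame(
  group = c("g1", "g3", "g1", "g3"),
  var   = c(0.010, NA, 0.210, 3.100)
)
cutoffs <- c(g1 = 0.02, g3 = 0.2)  # hypothetical per-group cutoffs

# Group-by-group results, concatenated in group order: g1, g1, g3, g3.
by_group <- unlist(
  Map(function(g, cut) ifelse(dt$var[dt$group == g] < cut, 0, 1),
      names(cutoffs), cutoffs),
  use.names = FALSE
)

# Assigning that vector straight to the rows puts values on the wrong rows.
dt$aux_misaligned <- by_group

# Writing each group's result into that group's rows keeps the row order.
dt$aux <- NA_real_
for (g in names(cutoffs)) {
  rows <- dt$group == g
  dt$aux[rows] <- ifelse(dt$var[rows] < cutoffs[[g]], 0, 1)
}
dt
```

In `aux_misaligned` the row with `var = NA` receives a value computed for a different row, while `aux` keeps `NA` where `var` is `NA`.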
