"map" function repeats on first match in tibble, not returning new value for each row

Time:10-09

Earlier, I asked how to create global environment objects from a tibble, which I have been able to do.

Now, I want to take the objects generated from the tibble to use in a function similar to this:

positives = function(pos_tibble){
  data_tibble %>%
  filter(str_detect(Animal_ID, "[:digit:]{2,3}")) %>%
  mutate(Positive = if_else(U_mL> cutoff(x), TRUE, FALSE))
}

where cutoff(x) is as follows:

cutoff <- function(x){
  if_else("Species_ab" == "Mouse" & "Protein" == "GP", return(Mouse_IgG_GP_cutoff), 
          (if_else("Species_ab" == "Mouse" & "Protein" == "NP", return(Mouse_IgG_NP_cutoff),
                   (if_else("Species_ab" == "Rat" & "Protein" == "GP", return(Mouse_IgG_GP_cutoff),
                            return(Rat_IgG_NP_cutoff))))))
}

so that pos_tibble would look like this:

# Groups:   Species_ab, Antibody, Protein, Positive [6]
  Animal_ID Species_ab Antibody Protein   U_mL Positive
  <chr>     <chr>      <chr>    <chr>    <dbl> <lgl>   
1 35        Mouse      IgG      GP      128.   TRUE    
2 39        Mouse      IgG      GP        4.75 FALSE   
3 43        Mouse      IgG      GP       NA    FALSE   
4 186       Mouse      IgG      NP        5.57 FALSE   
5 187       Mouse      IgG      NP        8.25 FALSE   
6 44        Rat        IgG      GP       NA    FALSE   
7 45        Rat        IgG      GP        9.49 TRUE    
8 322       Rat        IgG      NP       NA    FALSE   
9 323       Rat        IgG      NP       NA    FALSE 

I have tried mapping cutoff() several different ways, but cannot get it to select a value based on the conditions. It only repeats the first value matched.

by(neg_tibble, seq_len(nrow(neg_tibble)), cutoff)   

seq_len(nrow(neg_tibble)): 1
[1] 28.20336
----------------------------------------------------------- 
seq_len(nrow(neg_tibble)): 2
[1] 28.20336
----------------------------------------------------------- 
seq_len(nrow(neg_tibble)): 3
[1] 28.20336
----------------------------------------------------------- 
seq_len(nrow(neg_tibble)): 4
[1] 28.20336

map(seq_len(nrow(neg_tibble)), cutoff)  

[[1]]
[1] 28.20336

[[2]]
[1] 28.20336

[[3]]
[1] 28.20336

[[4]]
[1] 28.20336

map_dbl(seq_len(nrow(neg_tibble)), cutoff)

[1] 28.20336 28.20336 28.20336 28.20336

cutoff(neg_tibble)

[1] 28.20336 

apply(neg_tibble,1,cutoff)

[1] 28.20336 28.20336 28.20336 28.20336

The above were taken from various other questions (a, b, c, d). At the end of the day, I think map is what I want, but I need it to go through all my options, not repeat the first one.

Additionally, I would love any recommendations for online resources (so I don't have to post as many questions!). I have been using r4ds and Software Carpentry, which have been helping a lot, though I'd like something a bit more comprehensive. I'm in the last two-ish months of my PhD program and am only now getting into R and a little bit of Python to compile and clean all of my data into a single tidy file, which I can then quickly run basic stats on (but I need it totally cleaned to get to that point).

Thank you!

EDIT: dput of a small sample of pos_tibble

    > dput(pos_tibble[c(4:6, 300:301, 500:501, 700:701),])
structure(list(Animal_ID = c("35", "39", "43", "Blank_Use", "2x Blank", 
"525", "526", "44", "45"), Species_ab = c("Mouse", "Mouse", "Mouse", 
"Mouse", "Mouse", "Mouse", "Mouse", "Rat", "Rat"), Antibody = c("IgG", 
"IgG", "IgG", "IgG", "IgG", "IgG", "IgG", "IgG", "IgG"), Protein = c("GP", 
"GP", "GP", "NP", "NP", "NP", "NP", "NP", "NP"), U_mL = c(128.424549479298, 
4.75300605155077, NA, 11.2504537822992, 78.8606956731523, 39.7199412048613, 
NA, NA, NA)), row.names = c(NA, -9L), class = c("tbl_df", "tbl", 
"data.frame"))

Created on 2021-10-08 by the reprex package (v2.0.1)

CodePudding user response:

We can get those objects from the global environment with mget and stack them into a two-column data.frame:

keydat <- stack(mget(ls(pattern = "_cutoff$")))[2:1]
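To make the shape of keydat concrete, here is a minimal base R sketch. It assumes four cutoff objects whose values are placeholders rounded from the output table in this answer, and it uses a hypothetical environment `env` in place of the global environment so it does not touch your workspace.

```r
# Hypothetical stand-in for the global environment (an assumption for this sketch).
env <- new.env()

# Placeholder cutoffs, rounded from the output table shown in this answer.
assign("Mouse_IgG_GP_cutoff", 28.2, envir = env)
assign("Mouse_IgG_NP_cutoff", 45.9, envir = env)
assign("Rat_IgG_GP_cutoff", 5.24, envir = env)
assign("Rat_IgG_NP_cutoff", 1.41, envir = env)

# mget() returns a named list of the matching objects;
# stack() turns that list into two columns, `values` and `ind`;
# [2:1] reorders the columns so that `ind` comes first.
keydat <- stack(mget(ls(env, pattern = "_cutoff$"), envir = env))[2:1]
print(keydat)
```

Each row pairs an object name (`ind`) with its value (`values`), which is what lets the next step look up cutoffs by name instead of branching on conditions.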

Then create a key column ('ind') that matches the object names by pasting 'Species_ab', 'Antibody', and 'Protein' together with a "cutoff" suffix, left_join it to keydat, and create 'Positive' by comparing the 'ave_neg_U_mL' column with the 'values' column:

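library(dplyr)
library(stringr)

neg_tibble %>%
  ungroup() %>%
  # build the name of the matching cutoff object, e.g. "Mouse_IgG_GP_cutoff"
  mutate(ind = str_c(Species_ab, Antibody, Protein, "cutoff", sep = "_")) %>%
  # attach each row's cutoff value from keydat
  left_join(keydat, by = "ind") %>%
  mutate(Positive = ave_neg_U_mL > values)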

-output

# A tibble: 4 × 8
  Species_ab Antibody Protein ave_neg_U_mL     n ind                 values Positive
  <chr>      <chr>    <chr>          <dbl> <int> <chr>                <dbl> <lgl>   
1 Mouse      IgG      GP             28.2      6 Mouse_IgG_GP_cutoff  28.2  TRUE    
2 Mouse      IgG      NP             45.9      6 Mouse_IgG_NP_cutoff  45.9  FALSE   
3 Rat        IgG      GP              5.24     4 Rat_IgG_GP_cutoff     5.24 FALSE   
4 Rat        IgG      NP              1.41     1 Rat_IgG_NP_cutoff     1.41 TRUE    
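The same lookup idea can be applied to the per-animal data from the question without a join. This is only a sketch in base R: the data are the dput sample from the question (U_mL rounded), the cutoff values are placeholders rounded from the output above, and `cutoffs` and `key` are names introduced here for illustration.

```r
# Sample rows from the question's dput output (U_mL rounded for readability).
pos_tibble <- data.frame(
  Animal_ID  = c("35", "39", "43", "Blank_Use", "2x Blank", "525", "526", "44", "45"),
  Species_ab = c(rep("Mouse", 7), "Rat", "Rat"),
  Antibody   = "IgG",
  Protein    = c("GP", "GP", "GP", rep("NP", 6)),
  U_mL       = c(128.42, 4.75, NA, 11.25, 78.86, 39.72, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Placeholder cutoffs as a named vector (values rounded from the output above).
cutoffs <- c(Mouse_IgG_GP_cutoff = 28.2, Mouse_IgG_NP_cutoff = 45.9,
             Rat_IgG_GP_cutoff = 5.24, Rat_IgG_NP_cutoff = 1.41)

# Keep only animal rows (IDs containing at least two consecutive digits).
result <- pos_tibble[grepl("[[:digit:]]{2,3}", pos_tibble$Animal_ID), ]

# Build one key per row and look up all cutoffs at once (vectorised).
key <- paste(result$Species_ab, result$Antibody, result$Protein, "cutoff", sep = "_")
result$cutoff <- unname(cutoffs[key])

# Treat a missing U_mL as not positive, as in the desired output of the question.
result$Positive <- !is.na(result$U_mL) & result$U_mL > result$cutoff
print(result)
```

As far as I can tell, the original cutoff() repeats one value for two reasons: "Species_ab" == "Mouse" compares two string literals (always FALSE) rather than a column, and the return() calls exit the function as soon as if_else() evaluates its arguments, regardless of the condition. Looking the cutoff up by name avoids both problems.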