Assigning labels to different group assignment scenarios

Time:12-24

I have a data.frame that assigns ids to groups. In the simplest scenario each id is assigned to a different group:

df1 <- data.frame(group = c("a1","a2"),
                  id = c("i1","i2"),
                  stringsAsFactors = FALSE)

In a second scenario all ids are assigned to one group:

df2 <- data.frame(group = c("a1","a1"),
                  id = c("i1","i2"),
                  stringsAsFactors = FALSE)

And in the third scenario the id-to-group assignment is ambiguous, because at least one id is assigned to more than one group:

df3 <- data.frame(group = c("a1","a2","a2"),
                  id = c("i1","i1","i2"),
                  stringsAsFactors = FALSE)

I'm looking for a function that takes such a data.frame (with the id and group columns) and returns the label "scenario1", "scenario2", or "scenario3" according to the scenarios above.

In other words, this function would return "scenario1" for df1, "scenario2" for df2, and "scenario3" for df3.

Obviously this can be done with if statements, but I'm hoping for something faster using dplyr/tidyverse or data.table.

CodePudding user response:

Here's a function that drops duplicate group/id rows and then checks the conditions in order.

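library(dplyr)

return_scenario <- function(df) {
  # Keep each (group, id) pair once so repeated rows don't affect the checks
  tmp <- df %>% distinct(group, id)
  case_when(
    # Only one group overall: every id is assigned to it
    n_distinct(tmp$group) == 1 ~ 'scenario2',
    # Every id occurs in exactly one pair: no id is assigned to several groups
    n_distinct(tmp$id) == nrow(tmp) ~ 'scenario1',
    # Otherwise at least one id is assigned to more than one group
    TRUE ~ 'scenario3')
}

return_scenario(df1)
#[1] "scenario1"
return_scenario(df2)
#[1] "scenario2"
return_scenario(df3)
#[1] "scenario3"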

If needed, this can also be translated to base R or data.table using their equivalent functions.
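
As a rough illustration of the base R route, here is a minimal sketch that mirrors the same three checks with unique() and anyDuplicated(). The name return_scenario_base is made up for this example, and the labels are spelled as in the question ("scenario1", "scenario2", "scenario3"); treat it as one possible translation, not the only one.

```r
# Example data from the question, repeated so the sketch runs on its own
df1 <- data.frame(group = c("a1", "a2"), id = c("i1", "i2"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(group = c("a1", "a1"), id = c("i1", "i2"),
                  stringsAsFactors = FALSE)
df3 <- data.frame(group = c("a1", "a2", "a2"), id = c("i1", "i1", "i2"),
                  stringsAsFactors = FALSE)

# Hypothetical base R counterpart of the dplyr function (no packages needed)
return_scenario_base <- function(df) {
  # Keep each (group, id) pair once
  tmp <- unique(df[, c("group", "id")])
  if (length(unique(tmp$group)) == 1) {
    "scenario2"   # a single group holds every id
  } else if (anyDuplicated(tmp$id) == 0) {
    "scenario1"   # no id shows up in more than one group
  } else {
    "scenario3"   # at least one id is assigned to several groups
  }
}

return_scenario_base(df1)
return_scenario_base(df2)
return_scenario_base(df3)
```

Because the function returns a single label per data.frame, plain if/else is enough here; the vectorised case_when() is not strictly required for this task.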
