Match Each Winner with a Unique Prize-CodePudding

In a contest, each winner and prize is assigned a random integer [1, 9] called a "ticket" number and a unique "ID" number [1111, 9999]. Each winner receives a unique prize from a limited stock of prizes based on the winner's ticket number ±1.

Question 1: Duplicate Prizes

How can I prevent the script (below) from returning duplicate prizes? I've used the duplicate() function before, but I'm unsure how to implement it in this case.

Question 2: Cannot Match a Winner with a Prize

How would I implement this rule in my script: If a non-duplicated prize cannot be found, then return a prize from the unclaimed stock that is the next closest match.

Here's what I have thus far:

# Function to generate a data frame with random parameters
generate <- function(n) {
  ID <- as.factor(sample(1111:9999, n))
  ticket <- sample(1:9, n, replace = TRUE)
  lower.bound <- ticket - 1
  upper.bound <- ticket   1
  winners.df <- cbind.data.frame(ID, ticket, lower.bound, upper.bound)
  return(winners.df)
}

# Generate a master data frame
master <- generate(20)

# Split master data frame into "prizes" and "winners"
prizes  <- master[1:16, ]
winners   <- master[17:20, ]

# Eliminate upper/lower bound columns in prizes as they are not needed
prizes <- prizes[, -c(3, 4)]

# Set an empty variable to serve as a container
picks <- list(NULL)

for (x in 1:length(winners$ID)) {
  pool <- subset(prizes, ticket >= winners$lower.bound[x] & ticket <= winners$upper.bound[x])
  picks[[x]] <- pool[sample(nrow(pool), 1), ]
}

picks <- do.call(rbind.data.frame, picks)

# Generate a summary of winners and their prizes
winners.prizes <- data.frame(winnerID = winners$ID,
                             winnerTicket = winners$ticket,
                             prizeID = picks$ID,
                             prizeTicket = picks$ticket)

CodePudding user response：

Original Answer

For question 1.

You need to remove the prize chosen from the prizes data.frame in order for them not to be picked again.

# Assign a unique prize to each winner
for (x in 1:length(winners$ID)) {
  pool <- subset(prizes, ticket >= winners$lower.bound[x] & ticket <= winners$upper.bound[x])
  
  # Assign a prize to prize var and remove it from prizes
  prize = pool[sample(nrow(pool), 1), ]
  prizes = prizes[!(prizes$ID %in% prize$ID),]
  
  picks[[x]] <- prize
}

New Answer

I've put a little more thought into this as I looked more into your code.

I would avoid using subset as it can have unintended consequences. Also it's not necessary to save your picks into a list if you're just going to transform it into a data.frame. You're better off starting with a data.frame and then updating it. Lastly, I think it may be better to include a new column that highlights whether or not the prize was chosen versus removing the chosen prize from your initial set.

One final note - I would recommend not using periods in variable names. They can be misinterpreted as S3 methods.

I set up a function to generate the winners table and a prizes table to show which were/weren't chosen. Too many variables were being created in the global env. So it makes more sense to keep this contained.

set.seed(100) # for reproducibility

# Generate a data frame with random parameters
generate <- function(n) {
  ticket <- sample(1:9, n, replace = TRUE)
  ID <- as.factor(sample(1111:9999, n))
  lower.bound <- ticket - 1
  upper.bound <- ticket   1
  winners.df <- cbind.data.frame(ID, ticket, lower.bound, upper.bound)
  return(winners.df)
}

# Generate a master data frame
master <- generate(20)

# Split master data frame into "prizes" and "winners"
hold <- sample(c(TRUE, FALSE), nrow(master), replace = TRUE, prob = c(0.75, 0.25))
prizes <- master[hold, ]
winners <- master[!hold, ]

# Eliminate upper/lower bound columns in prizes as they are not needed
prizes <- prizes[, -c(3, 4)]

winners_output = function(w, p) {
  winners_dt = data.frame()
  p$chosen = FALSE
  
  # Assign a unique prize to each winner
  for (i in 1:length(w$ID)) {
    upper_b = w$upper.bound[i] 
    lower_b = w$lower.bound[i]
    
    # Subset to only prizes not chosen and create sample pool
    avail_prizes = p[!(p$chosen),]
    pool <- avail_prizes[avail_prizes$ticket >= lower_b &avail_prizes$ticket <= upper_b,]
    
    # Assign a prize and remove it from prizes
    assigned_prize = pool[sample(nrow(pool), 1), ]
    
    # Update chosen prize to TRUE
    p$chosen[which(p$ID == assigned_prize$ID)] = TRUE
    
    # Set up data frame for each winner
    w_dt = data.frame(
      winnerID = w$ID[i],
      winnerTicket = w$ticket[i],
      prizeID = assigned_prize$ID,
      prizeTicket = assigned_prize$ticket
    )
    
    # add to full winners dt
    winners_dt = rbind(winners_dt, w_dt)
  }
  
  # return all winners plus chosen prizes
  return(list(
    winners_dt = winners_dt,
    prizes = p))
}

w = winners_output(w = winners, p = prizes)

# > w$winners_dt
#   winnerID winnerTicket prizeID prizeTicket
# 1     7578            6    2927           7
# 2     6397            7    2741           7
# 3     8655            6    3100           6
# 4     2918            6    6388           7
# 5     9907            2    1333           2
# 6     4852            7    7882           7
# 7     8174            8    7095           8
# > w$prizes
#      ID ticket chosen
# 1  2927      7   TRUE
# 3  9590      3  FALSE
# 4  9649      9  FALSE
# 8  6177      4  FALSE
# 9  7882      7   TRUE
# 10 3100      6   TRUE
# 13 6388      7   TRUE
# 14 2741      7   TRUE
# 15 7095      8   TRUE
# 16 1333      2   TRUE
# 17 9203      3  FALSE
# 18 7505      3  FALSE
# 20 6204      2  FALSE