Dynamically update a variable between rows in R/dplyr

Time:11-07

I have data from multiple games of the game Battleships. It looks something like this:

df <- data.frame(game=c(1,1,1,1,2,2,2,2),
  position=c(2,4,3,1,1,2,3,4),
  hit=c(0,0,1,0,1,0,0,0))
df

  game position hit
1    1        2   0
2    1        4   0
3    1        3   1
4    1        1   0
5    2        1   1
6    2        2   0
7    2        3   0
8    2        4   0

Each row is one move by a player: position is the square number and hit indicates whether they hit a ship (1) or not (0). I want to create an additional column that indicates the state of the board before each move. It should look something like this:

  game position hit board_state
1    1        2   0 [NA,NA,NA,NA]
2    1        4   0 [NA,0,NA,NA]
3    1        3   1 [NA,0,NA,0]
4    1        1   0 [NA,0,1,0]
5    2        1   1 [NA,NA,NA,NA]
6    2        2   0 [1,NA,NA,NA]
7    2        3   0 [1,0,NA,NA]
8    2        4   0 [1,0,0,NA]

So the board state is updated based on the position and outcome of the last step.

What I find challenging here is that the definition of board_state on row r depends on its state on row r-1, and lag is not useful here because it's within the same column. I hope this is clear.

Any ideas for how to implement this? Thanks!!!

CodePudding user response:

Here's an approach using dplyr and tidyr. First, I add a column to track which turn we're on within each game. (This helps in the next step, when we expand the table.) Then I "complete" the table so that it has a row for every position and every turn within each game. Then we can "fill" down and shift back one step with lag() so that each position reflects only its past history. Finally, we group by game and turn and generate a "board_state" summary that concatenates the "hit" values for each position.

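library(dplyr)
library(tidyr)

df %>%
  group_by(game) %>%
  mutate(turn = row_number()) %>%
  # game is already a grouping column, so only position and turn are completed
  # within each game (tidyr 1.2.0+ rejects completing on a grouping column);
  # a square never played in a game will not appear in that game's board
  complete(position, turn) %>%
  group_by(game, position) %>%
  fill(hit) %>%               # carry each square's outcome forward
  mutate(hit = lag(hit)) %>%  # shift back so a turn sees only earlier moves
  group_by(game, turn) %>%
  summarize(board_state = paste(hit, collapse = ", "), .groups = "drop")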

Result

# A tibble: 8 × 3
   game  turn board_state   
  <dbl> <int> <chr>         
1     1     1 NA, NA, NA, NA
2     1     2 NA, 0, NA, NA 
3     1     3 NA, 0, NA, 0  
4     1     4 NA, 0, 1, 0   
5     2     1 NA, NA, NA, NA
6     2     2 1, NA, NA, NA 
7     2     3 1, 0, NA, NA  
8     2     4 1, 0, 0, NA  
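
To see why the "fill" and "lag" steps are both needed, here is a minimal sketch that follows a single square on its own. It assumes square 3 of game 1 from the example data, which is hit on turn 3; the names square and square_history are only for illustration.

```r
library(dplyr)
library(tidyr)

# Square 3 in game 1 after complete(): a value on turn 3, NA on all other turns.
square <- tibble(turn = 1:4, hit = c(NA, NA, 1, NA))

square_history <- square %>%
  fill(hit) %>%            # carry the outcome forward: NA, NA, 1, 1
  mutate(hit = lag(hit))   # shift back one turn:       NA, NA, NA, 1

square_history$hit
```

The shifted column is what the player knew about that square before each turn, which matches the third entry of board_state in the game 1 rows above.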

CodePudding user response:

I ended up with the following solution:

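# Return the board as it looked before each move, one string per move.
create_board_states <- function(positions, hits) {
  board_state <- rep(NA, 4)                        # four squares, all unknown
  board_states <- character(length(positions))
  for (p in seq_along(positions)) {
    board_states[p] <- paste(board_state, collapse = ",")  # state before move p
    board_state[positions[p]] <- hits[p]                   # record the outcome
  }
  board_states
}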

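library(dplyr)
df %>%
  group_by(game) %>%
  # mutate() keeps one row per move (multi-row summarise() is deprecated in dplyr 1.1.0+)
  mutate(board_state = create_board_states(position, hit)) %>%
  ungroup() %>%
  mutate(board_state = strsplit(board_state, ","))  # strsplit() already returns a list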
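
The loop above is essentially a running fold: each board is the previous board with one square updated. As a rough sketch of that idea in base R, the same thing can be written with Reduce() and accumulate = TRUE. The helper name board_states_before and the n_squares argument are assumptions made for this illustration, not part of the answers above. Unlike the string version, this one keeps real NA values rather than the text "NA".

```r
# Hypothetical helper: fold over the moves, keeping every intermediate board.
board_states_before <- function(positions, hits, n_squares = 4) {
  # Apply move i to the board and return the updated board.
  step <- function(board, i) {
    board[positions[i]] <- hits[i]
    board
  }
  states <- Reduce(step, seq_along(positions),
                   init = rep(NA_real_, n_squares),
                   accumulate = TRUE)
  # Reduce() also returns the board after the last move; drop it so that
  # element i is the board *before* move i.
  head(states, -1)
}

# Game 1 from the example data.
game1 <- board_states_before(c(2, 4, 3, 1), c(0, 0, 1, 0))
```

Since the helper returns a list with one element per move, it should also work as a list column inside a grouped mutate(), for example mutate(board_state = board_states_before(position, hit)) after group_by(game).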