Home > front end >  Continue row numbers from one data frame on separate data frame
Continue row numbers from one data frame on separate data frame

Time:01-21

I have two data frames, df1 and df2:

df1<-structure(list(protocol_no = c("study5", "study5", 
"study5", "study5", "study5", "study5","study6"), 
    sequence_number = c("1", "15", "73", "42", "2", "9","5021")), row.names = c(NA, 
-7L), class = c("tbl_df", "tbl", "data.frame"))

df2<-structure(list(record_id = c(11, 12, 13, 14, 15, 16), protocol_no = c("study5", 
"study5", "study5", "study5", "study5", "study5"
), sequence_number = c("1", "15", "73", "42", "2", "9"), form_1_complete = c(0, 
0, 0, 0, 0, 0)), row.names = c(NA, 6L), class = "data.frame")

You can kinda ignore whats in these, i just made up some names and numbers, the key points here are that df2 has more columns than df1, and the real data sets will have 27,000 rows.

df1 will always have slightly more rows than df2 because it has newer data.

What I'm trying to do is find which rows exist in df1 that don't exist in df2 and isolate them. I know I could do this with anti-join(), the problem is that I also want to include the "record_id" column from df2, and I want it to start numbering from wherever df2 left off.

So in this case, the row of df1 that is "study 6, 5021" would be the 'new' row, and it would be numbered record_id = 17 (because thats where df2 left off), and my output would look like this:

enter image description here

CodePudding user response:

We could bind the data, get the distinct rows and update the record_id

library(dplyr)
library(tidyr)
library(data.table)
bind_rows(df2, df1) %>% 
  distinct(protocol_no, sequence_number, .keep_all = TRUE) %>% 
  fill(record_id, form_1_complete) %>% 
  mutate(record_id = record_id    (rowid(record_id) - 1))

-output

  record_id protocol_no sequence_number form_1_complete
1        11      study5               1               0
2        12      study5              15               0
3        13      study5              73               0
4        14      study5              42               0
5        15      study5               2               0
6        16      study5               9               0
7        17      study6            5021               0
  • Related