I am working with lab results and have sequential results for 4 patients:
set.seed(100)
df <- data.frame(patient = c(rep("aaa", 5), rep("bbb", 3), rep("ccc", 1), rep("ddd",2)),
seq_visit= c(1,2,3,4,5,1,2,3,1,1,2),
result = c(20,15,17,2,15,13,12,5,10,4,12))
Ultimately, I want a data frame that keeps only the last 2 observations for each patient, with an added column labelling each result as "pre" or "post" so I can run a paired t-test. Patients with only one observation (such as "ccc") should be dropped:
set.seed(100)
df2 <- data.frame(patient = c(rep("aaa", 2), rep("bbb", 2), rep("ddd",2)),
seq_visit= c(4,5,2,3,1,2),
result = c(2,15,12,5,4,12),
tx = c("pre","post","pre","post","pre","post"))
I would really appreciate a tidyverse answer.
CodePudding user response:
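library(dplyr)

df %>%
  group_by(patient) %>%
  slice_tail(n = 2) %>%              # last two rows per patient (assumes rows are ordered by seq_visit)
  filter(n() == 2) %>%               # drop patients with fewer than two visits
  mutate(tx = c("pre", "post")) %>%  # earlier visit is "pre", later visit is "post"
  ungroup()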
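If it helps, here is a rough sketch of how the paired t-test step from the question could follow on from this. It rebuilds the example `df`, labels the last two visits, and reshapes to one row per patient so the pre and post values line up. The name `paired`, the use of `tidyr::pivot_wider()`, and the extra `arrange()` (in case the rows are not already sorted by `seq_visit`) are my own assumptions, not something the question specified.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  patient   = c(rep("aaa", 5), rep("bbb", 3), rep("ccc", 1), rep("ddd", 2)),
  seq_visit = c(1, 2, 3, 4, 5, 1, 2, 3, 1, 1, 2),
  result    = c(20, 15, 17, 2, 15, 13, 12, 5, 10, 4, 12)
)

# `paired` is a hypothetical name: one row per patient, with pre and post columns
paired <- df %>%
  group_by(patient) %>%
  arrange(seq_visit, .by_group = TRUE) %>%  # make sure visits are in order within each patient
  slice_tail(n = 2) %>%                     # keep the last two visits
  filter(n() == 2) %>%                      # drop patients with a single visit
  mutate(tx = c("pre", "post")) %>%
  ungroup() %>%
  select(patient, tx, result) %>%
  pivot_wider(names_from = tx, values_from = result)

# Paired t-test on the matched pre/post columns
t.test(paired$post, paired$pre, paired = TRUE)
```

Reshaping to wide format first keeps each patient's two values on the same row, so the pairing does not depend on row order in the long data.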
CodePudding user response:
A base R solution:
by(df, df$patient, tail, 2) |>
Filter(f = \(x) nrow(x) > 1) |>
Reduce(f = rbind) |>
transform(tx = c("pre", "post"))
# patient seq_visit result tx
# 4 aaa 4 2 pre
# 5 aaa 5 15 post
# 7 bbb 2 12 pre
# 8 bbb 3 5 post
# 10 ddd 1 4 pre
# 11 ddd 2 12 post
CodePudding user response:
An option with data.table
library(data.table)
setDT(df)[, if(.N > 1) tail(.SD, 2), patient][,
tx := rep(c('pre', 'post'), length.out = .N)][]
# patient seq_visit result tx
# <char> <num> <num> <char>
# 1: aaa 4 2 pre
# 2: aaa 5 15 post
# 3: bbb 2 12 pre
# 4: bbb 3 5 post
# 5: ddd 1 4 pre
# 6: ddd 2 12 post
CodePudding user response:
Another data.table option:
library(data.table)
setDT(df)
df[, if (.N > 1) c(last(.SD, 2), .(tx = c('pre', 'post'))), patient]
# patient seq_visit result tx
# <char> <num> <num> <char>
# 1: aaa 4 2 pre
# 2: aaa 5 15 post
# 3: bbb 2 12 pre
# 4: bbb 3 5 post
# 5: ddd 1 4 pre
# 6: ddd 2 12 post