I'm looking to track the order in which an item falls in a sequence. For example, the final product should look something like this.
df <- data.frame(paitent=c('Sally', 'Josh', 'Josh', 'Abram','Sally', 'Josh'),
visit=mdy(c('2/10/2022', '2/11/2022', '2/12/2022', '2/13/2022', '2/14/2022', '2/15/2022')),
visit_count=c(1,1,2,1,2,3))
paitent visit
1 Sally 2022-02-10
2 Josh 2022-02-11
3 Josh 2022-02-12
4 Abram 2022-02-13
5 Sally 2022-02-14
6 Josh 2022-02-15
The 'visit_count' column would be automatically populated based on the Patient name and where it falls in sequence based on the date.
I'm not exactly sure where to go. I've looked into using the mutate and nrow() functions to count the rows but I'm having trouble finding a way to filter for the specific name and then only count the dates that are less than the current records date.
CodePudding user response:
dplyr
library(dplyr)
df %>%
group_by(paitent) %>%
mutate(visit_count2 = rank(visit, ties.method = "first")) %>%
ungroup()
# # A tibble: 6 x 4
# paitent visit visit_count visit_count2
# <chr> <date> <dbl> <int>
# 1 Sally 2022-02-10 1 1
# 2 Josh 2022-02-11 1 1
# 3 Josh 2022-02-12 2 2
# 4 Abram 2022-02-13 1 1
# 5 Sally 2022-02-14 2 2
# 6 Josh 2022-02-15 3 3
base R
df$visit_count2 <- ave(as.numeric(df$visit), df$paitent, FUN = function(z) rank(z, ties.method = "first"))
df
# paitent visit visit_count visit_count2
# 1 Sally 2022-02-10 1 1
# 2 Josh 2022-02-11 1 1
# 3 Josh 2022-02-12 2 2
# 4 Abram 2022-02-13 1 1
# 5 Sally 2022-02-14 2 2
# 6 Josh 2022-02-15 3 3
data.table
library(data.table)
as.data.table(df)[, visit_count2 := rank(visit, ties.method = "first"), by = .(paitent)]
Data
df <- structure(list(paitent = c("Sally", "Josh", "Josh", "Abram", "Sally", "Josh"), visit = structure(c(19033, 19034, 19035, 19036, 19037, 19038), class = "Date"), visit_count = c(1, 1, 2, 1, 2, 3), visit_count2 = c(1, 1, 2, 1, 2, 3)), row.names = c(NA, -6L), class = "data.frame")
CodePudding user response:
We can group by patient
, sort visit
in ascending order and then create visit_count
df%>%
group_by(paitent)%>%
arrange(visit)%>%
mutate(visit_count=row_number())
# A tibble: 6 x 3
# Groups: paitent [3]
paitent visit visit_count
<fct> <date> <int>
1 Sally 2022-02-10 1
2 Josh 2022-02-11 1
3 Josh 2022-02-12 2
4 Abram 2022-02-13 1
5 Sally 2022-02-14 2
6 Josh 2022-02-15 3
CodePudding user response:
library(dplyr)
df %>%
group_by(patient) %>%
mutate(visit_count =1:n())
patient visit visit_count
<chr> <date> <int>
1 Sally 2022-02-10 1
2 Josh 2022-02-11 1
3 Josh 2022-02-12 2
4 Abram 2022-02-13 1
5 Sally 2022-02-14 2
6 Josh 2022-02-15 3