I have a dataframe like so:
subjectid <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5)
response <- c("PD", "PD", "SD", "PD", "SD", "PD", "SD", "SD", "SD", "PD", "PR")
df <- data.frame(subjectid, response)
I want to count the number of times PD, SD, and PR occur per subjectid. For subject 1, the first time PD occurs I want the value 1; the second time it occurs for subjectid = 1, I want the value 2. The catch is that the count should restart at 1 for each subject, so the first time PD occurs for subjectid = 2 I want the value 1 again. I also want the new value variable to have the response pasted in before the number. My desired output is as follows:
Any help would be much appreciated!
CodePudding user response:
dplyr
library(dplyr)
df %>%
  group_by(subjectid, response) %>%
  mutate(value = paste(response, row_number())) %>%
  ungroup()
# # A tibble: 11 x 3
#    subjectid response value
#        <dbl> <chr>    <chr>
#  1         1 PD       PD 1
#  2         1 PD       PD 2
#  3         1 SD       SD 1
#  4         2 PD       PD 1
#  5         2 SD       SD 1
#  6         3 PD       PD 1
#  7         3 SD       SD 1
#  8         3 SD       SD 2
#  9         4 SD       SD 1
# 10         4 PD       PD 1
# 11         5 PR       PR 1
data.table
library(data.table)
DT <- as.data.table(df) # makes a copy; setDT(df) would convert df by reference instead
DT[, value := paste(response, rowid(subjectid, response))]
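For anyone who would rather avoid the copy, here is a minimal sketch of the by-reference route. It assumes the data.table package is installed and rebuilds the question's data so it runs on its own; `setDT()` turns the existing data frame into a data.table in place, and `rowid()` supplies the running count within each subjectid/response pair.

```r
library(data.table)

# Data from the question.
df <- data.frame(
  subjectid = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5),
  response  = c("PD", "PD", "SD", "PD", "SD", "PD", "SD", "SD", "SD", "PD", "PR")
)

# Convert in place (no copy); df is now a data.table.
setDT(df)

# rowid() numbers the rows within each (subjectid, response) combination.
df[, value := paste(response, rowid(subjectid, response))]

# Print the result.
df[]
```

Swapping `paste()` for `paste0()` drops the space between the response and the number, if that is closer to what is wanted.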
base R
df$value <- ave(df$response, df[c("subjectid", "response")],
                FUN = function(z) paste(z, seq_along(z)))
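To see what `ave()` is doing here, a two-step sketch in plain base R may help. Splitting the work into a counter and a paste step is just one way to lay it out, not the answer's own method, and the helper name `n` is made up for illustration. The data is rebuilt from the question so the snippet runs by itself.

```r
# Data from the question.
subjectid <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5)
response <- c("PD", "PD", "SD", "PD", "SD", "PD", "SD", "SD", "SD", "PD", "PR")
df <- data.frame(subjectid, response)

# Step 1: running count within each (subjectid, response) pair.
# The first argument only needs to be a numeric vector of the right length;
# ave() replaces each group's values with seq_along() of that group.
n <- ave(seq_along(df$response), df$subjectid, df$response, FUN = seq_along)

# Step 2: paste the response in front of the counter, separated by a space.
df$value <- paste(df$response, n)

print(df)
```

Because the counter is computed per subject and per response, it restarts at 1 whenever either one changes, which is the behaviour the question asks for.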
CodePudding user response:
Try this
library(dplyr)
df |>
  group_by(subjectid, response) |>
  mutate(value = paste0(response, " ", 1:n())) |>
  ungroup()
- output
# A tibble: 11 × 3
   subjectid response value
       <dbl> <chr>    <chr>
 1         1 PD       PD 1
 2         1 PD       PD 2
 3         1 SD       SD 1
 4         2 PD       PD 1
 5         2 SD       SD 1
 6         3 PD       PD 1
 7         3 SD       SD 1
 8         3 SD       SD 2
 9         4 SD       SD 1
10         4 PD       PD 1
11         5 PR       PR 1