R- How to edit repeated records?

Time:10-13

I'm reading data from a CSV file; a sample is shown below. Some records have the same label/question for the same user on different days. I want to append a running number to those repeated questions.

UserID Full Name  DOB     EncounterID Date   Type  label                          responses    
1      John Smith 1-1-90  13          1-1-21   Intro Check Were you given any info? (null)
1      John Smith 1-1-90  13          1-2-21   Intro Check Were you given any info?  no
1      John Smith 1-1-90  13          1-3-21   Intro Check Were you given any info?  yes
2      Jane Doe   2-2-80  14          1-6-21   Intro Check Were you given any info?  no
2      Jane Doe   2-2-80  14          1-6-21   Care  Check By using this service..   no
2      Jane Doe   2-2-80  14          1-6-21   Out   Check How satisfied are you?    unsat

Desired output (repeated questions get a number appended, as shown below):

UserID Full Name  DOB     EncounterID Date   Type  label                          responses    
1      John Smith 1-1-90  13          1-1-21   Intro Check Were you given any info?1 (null)
1      John Smith 1-1-90  13          1-2-21   Intro Check Were you given any info?2 no
1      John Smith 1-1-90  13          1-3-21   Intro Check Were you given any info?3 yes
2      Jane Doe   2-2-80  14          1-6-21   Intro Check Were you given any info?  no
2      Jane Doe   2-2-80  14          1-6-21   Care  Check By using this service..   no
2      Jane Doe   2-2-80  14          1-6-21   Out   Check How satisfied are you?    unsat

CodePudding user response:

Here is a dplyr solution:

library(dplyr)
df %>% 
  group_by(UserID, label) %>% 
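  mutate(newcol = row_number(),   # running index within each UserID/label group
         # sum(newcol) exceeds 1 only when the group has more than one row
         label = if (sum(newcol) > 1) paste0(label, newcol) else label) %>%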
  ungroup() %>% 
  select(-newcol)

Or, more directly, as suggested by r2evans (many thanks!):

library(dplyr)
df %>% 
  group_by(UserID, label) %>% 
  mutate(label = if (n() > 1) paste0(label, row_number()) else label) %>%
  ungroup()

Output:

  UserID Full.Name  DOB    EncounterID Date   Type  label                           responses
   <int> <chr>      <chr>        <int> <chr>  <chr> <chr>                           <chr>    
1      1 John Smith 1-1-90          13 1-1-21 Intro Check Were you given any info?1 (null)   
2      1 John Smith 1-1-90          13 1-2-21 Intro Check Were you given any info?2 no       
3      1 John Smith 1-1-90          13 1-3-21 Intro Check Were you given any info?3 yes      
4      2 Jane Doe   2-2-80          14 1-6-21 Intro Check Were you given any info?  no       
5      2 Jane Doe   2-2-80          14 1-6-21 Care  Check By using this service..   no       
6      2 Jane Doe   2-2-80          14 1-6-21 Out   Check How satisfied are you?    unsat   

data:

df <- structure(list(UserID = c(1L, 1L, 1L, 2L, 2L, 2L), Full.Name = c("John Smith", 
"John Smith", "John Smith", "Jane Doe", "Jane Doe", "Jane Doe"
), DOB = c("1-1-90", "1-1-90", "1-1-90", "2-2-80", "2-2-80", 
"2-2-80"), EncounterID = c(13L, 13L, 13L, 14L, 14L, 14L), Date = c("1-1-21", 
"1-2-21", "1-3-21", "1-6-21", "1-6-21", "1-6-21"), Type = c("Intro", 
"Intro", "Intro", "Intro", "Care", "Out"), label = c("Check Were you given any info?", 
"Check Were you given any info?", "Check Were you given any info?", 
"Check Were you given any info?", "Check By using this service..", 
"Check How satisfied are you?"), responses = c("(null)", "no", 
"yes", "no", "no", "unsat")), class = "data.frame", row.names = c(NA, 
-6L))
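
One caveat with the approach above: `row_number()` counts rows in their current order, so the suffixes follow the dates only if the rows are already sorted. The following is a minimal sketch of sorting first. It assumes the `Date` strings are month-day-year (the question does not state the format), and the data frame `visits` and the helper columns `parsed_date`, `n_rep` and `idx` are made up for illustration. It also computes the counters in helper columns and edits `label` only after ungrouping, which is a variation on the answer rather than the answer's own code.

```r
library(dplyr)

# Hypothetical sample, deliberately out of date order.
visits <- data.frame(
  UserID = c(1L, 1L, 1L, 2L),
  Date   = c("1-3-21", "1-1-21", "1-2-21", "1-6-21"),
  label  = rep("Check Were you given any info?", 4),
  stringsAsFactors = FALSE
)

numbered <- visits %>%
  # Assumed format: month-day-two-digit-year.
  mutate(parsed_date = as.Date(Date, format = "%m-%d-%y")) %>%
  arrange(UserID, parsed_date) %>%
  group_by(UserID, label) %>%
  # Group size and running index within each UserID/label group.
  mutate(n_rep = n(), idx = row_number()) %>%
  ungroup() %>%
  # Suffix only labels that occur more than once for the same user.
  mutate(label = ifelse(n_rep > 1, paste0(label, idx), label)) %>%
  select(-parsed_date, -n_rep, -idx)

print(numbered)
```

With this sample, the three rows for user 1 come out in date order with suffixes 1 to 3, and the single row for user 2 keeps its label unchanged.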

CodePudding user response:

A base R alternative using ave(), which returns the suffix for each row:

ave(dat$label, dat[c("UserID", "label")],
    FUN = function(z) if (length(z) > 1) seq_along(z) else "")
# [1] "1" "2" "3" ""  ""  "" 

The suffixes can then be pasted onto the original labels:

dat$label <- paste0(dat$label,
 ave(dat$label, dat[c("UserID", "label")],
     FUN = function(z) if (length(z) > 1) seq_along(z) else "")
)
dat
#   UserID  Full.Name    DOB EncounterID   Date  Type                           label responses
# 1      1 John Smith 1-1-90          13 1-1-21 Intro Check Were you given any info?1    (null)
# 2      1 John Smith 1-1-90          13 1-2-21 Intro Check Were you given any info?2        no
# 3      1 John Smith 1-1-90          13 1-3-21 Intro Check Were you given any info?3       yes
# 4      2   Jane Doe 2-2-80          14 1-6-21 Intro  Check Were you given any info?        no
# 5      2   Jane Doe 2-2-80          14 1-6-21  Care   Check By using this service..        no
# 6      2   Jane Doe 2-2-80          14 1-6-21   Out    Check How satisfied are you?     unsat
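
The same idea can be wrapped in a small helper so that it is reusable on other columns. This is only a sketch: the function name `number_repeats`, its arguments, and the sample data frame `example` are made up for illustration, and it relies on nothing beyond base R's `ave()` and `paste0()`.

```r
# Hypothetical helper: append a running counter to values of `col`
# that occur more than once within the groups defined by `by`.
number_repeats <- function(data, col, by) {
  groups <- c(by, col)
  suffix <- ave(
    as.character(data[[col]]),
    data[groups],
    # Count 1, 2, ... for repeated values; empty suffix for unique ones.
    FUN = function(z) if (length(z) > 1) seq_along(z) else ""
  )
  data[[col]] <- paste0(data[[col]], suffix)
  data
}

# Made-up sample data: user 1 has a repeated label, user 2 does not.
example <- data.frame(
  UserID = c(1, 1, 2),
  label  = c("Q", "Q", "Q"),
  stringsAsFactors = FALSE
)

result <- number_repeats(example, col = "label", by = "UserID")
print(result$label)
# [1] "Q1" "Q2" "Q"
```

As with the answer's code, the counter follows the current row order, so sort the data first if the numbering should follow the dates.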

Data

dat <- structure(list(UserID = c(1, 1, 1, 2, 2, 2), Full.Name = c("John Smith", "John Smith", "John Smith", "Jane Doe", "Jane Doe", "Jane Doe"), DOB = c("1-1-90", "1-1-90", "1-1-90", "2-2-80", "2-2-80", "2-2-80"), EncounterID = c(13, 13, 13, 14, 14, 14), Date = c("1-1-21", "1-2-21", "1-3-21", "1-6-21", "1-6-21", "1-6-21"), Type = c("Intro", "Intro", "Intro", "Intro", "Care", "Out"), label = c("Check Were you given any info?", "Check Were you given any info?", "Check Were you given any info?", "Check Were you given any info?", "Check By using this service..", "Check How satisfied are you?"), responses = c("(null)", "no", "yes", "no", "no", "unsat")), row.names = c(NA, -6L), class = "data.frame")