How to number each repetition in a table in R?

Time:10-07

In my data set, there is a column of full names (example below), and I want to add another column next to it showing whether each name has appeared one, two, three, four, ... times so far, using R. My output should look like the "Number of repetition" column below.

Eg: Data set name: People

Full name   Number of repetition
Peter       1
Peter       2
Alison
Warren
Jack        1
Jack        2
Jack        3
Jack        4
Susan       1
Susan       2
Henry       1
Walison
Tinder      1
Peter       3
Henry       2
Tinder      2

Thanks
Teena

CodePudding user response:

Here is one way. Group by 'Fullname' and create the sequence with row_number() only when the group has more than one row. case_when() returns NA for rows that match no condition, so names that appear only once get NA.

library(dplyr)
# number the rows within each name; names that occur only once stay NA
df1 <- df1 %>%
   group_by(Fullname) %>%
   mutate(Number_of_repetition = case_when(n() > 1 ~ row_number())) %>%
   ungroup()

-output

df1
# A tibble: 16 × 2
   Fullname Number_of_repetition
   <chr>                   <int>
 1 Peter                       1
 2 Peter                       2
 3 Alison                     NA
 4 Warren                     NA
 5 Jack                        1
 6 Jack                        2
 7 Jack                        3
 8 Jack                        4
 9 Susan                       1
10 Susan                       2
11 Henry                       1
12 Walison                    NA
13 Tinder                      1
14 Peter                       3
15 Henry                       2
16 Tinder                      2
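
For comparison, the same numbering can be sketched in base R without dplyr. This is not part of the answer above, just a minimal sketch that swaps in ave() for the group_by()/mutate() step; the helper names idx and n_per_name are made up for illustration.

```r
# Sample data copied from the "data" section of this answer
df1 <- data.frame(Fullname = c("Peter", "Peter", "Alison", "Warren",
  "Jack", "Jack", "Jack", "Jack", "Susan", "Susan", "Henry", "Walison",
  "Tinder", "Peter", "Henry", "Tinder"))

# Running count of each name, in row order (1, 2, 3, ... within a name)
idx <- ave(seq_along(df1$Fullname), df1$Fullname, FUN = seq_along)

# Total number of rows for each name, repeated on every row of that name
n_per_name <- ave(seq_along(df1$Fullname), df1$Fullname, FUN = length)

# Keep the running count only for names that occur more than once
df1$Number_of_repetition <- ifelse(n_per_name > 1, idx, NA_integer_)

df1
```

On the sample data this should give the same Number_of_repetition column as the dplyr version, with NA for Alison, Warren and Walison.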

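If we need a third column that combines the name and the count, use unite() on the updated data from the previous step. na.rm = TRUE drops the NA values instead of pasting them in, and remove = FALSE keeps the two input columns.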

library(tidyr)
df1 %>%
   unite(FullNameRep, Fullname, Number_of_repetition, sep="", na.rm = TRUE, remove = FALSE)

-output

# A tibble: 16 × 3
   FullNameRep Fullname Number_of_repetition
   <chr>       <chr>                   <int>
 1 Peter1      Peter                       1
 2 Peter2      Peter                       2
 3 Alison      Alison                     NA
 4 Warren      Warren                     NA
 5 Jack1       Jack                        1
 6 Jack2       Jack                        2
 7 Jack3       Jack                        3
 8 Jack4       Jack                        4
 9 Susan1      Susan                       1
10 Susan2      Susan                       2
11 Henry1      Henry                       1
12 Walison     Walison                    NA
13 Tinder1     Tinder                      1
14 Peter3      Peter                       3
15 Henry2      Henry                       2
16 Tinder2     Tinder                      2
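
To make the na.rm = TRUE behaviour concrete, here is a rough base R equivalent of the unite() step on three rows taken from the output above. It is only a sketch: it uses paste0() rather than tidyr, the helper name suffix is made up, and the new column is appended at the end instead of being placed first.

```r
# Three rows taken from the output above: two repeated names and one single
df1 <- data.frame(
  Fullname = c("Peter", "Alison", "Peter"),
  Number_of_repetition = c(1L, NA, 2L)
)

# paste0() would turn NA into the string "NA", so replace it with "" first
suffix <- ifelse(is.na(df1$Number_of_repetition), "", df1$Number_of_repetition)

# Combine name and count with no separator, like sep = "" in unite()
df1$FullNameRep <- paste0(df1$Fullname, suffix)

df1$FullNameRep
# "Peter1" "Alison" "Peter2"
```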

data

df1 <- structure(list(Fullname = c("Peter", "Peter", "Alison", "Warren", 
"Jack", "Jack", "Jack", "Jack", "Susan", "Susan", "Henry", "Walison", 
"Tinder", "Peter", "Henry", "Tinder")), row.names = c(NA, -16L
), class = "data.frame")

CodePudding user response:

Here is an alternative, worked out with help from akrun (see: sum() condition in ifelse statement).

library(dplyr)
df1 %>%
  group_by(Fullname) %>%
  # number rows within each name, then blank out names that occur only once
  mutate(newcol = row_number(),
         newcol = if (sum(newcol) > 1) newcol else NA) %>%
  ungroup()

   Fullname newcol
   <chr>     <int>
 1 Peter         1
 2 Peter         2
 3 Alison       NA
 4 Warren       NA
 5 Jack          1
 6 Jack          2
 7 Jack          3
 8 Jack          4
 9 Susan         1
10 Susan         2
11 Henry         1
12 Walison      NA
13 Tinder        1
14 Peter         3
15 Henry         2
16 Tinder        2
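
Why sum(newcol) > 1 works as the test: within a group of n rows, row_number() gives 1..n, whose sum is n * (n + 1) / 2, so the sum exceeds 1 exactly when the group has more than one row. A quick check in base R (the names n and sums are only for illustration):

```r
# Candidate group sizes
n <- 1:4

# Sum of the row numbers 1..k for each group size k
sums <- sapply(n, function(k) sum(seq_len(k)))

sums
# 1 3 6 10

# The condition is FALSE only for a group of size 1
sums > 1
# FALSE TRUE TRUE TRUE
```

So the condition is equivalent to checking the group size directly, which is what the n() > 1 test in the first answer does.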