In my data set there is a column of full names (example below). Using R, I want to add another column next to it showing whether each name is appearing for the first, second, third, fourth, ... time. My output should look like the "Number of repetition" column below, where names that appear only once are left blank.
Eg: Data set name: People
Full name Number of repetition
Peter 1
Peter 2
Alison
Warren
Jack 1
Jack 2
Jack 3
Jack 4
Susan 1
Susan 2
Henry 1
Walison
Tinder 1
Peter 3
Henry 2
Tinder 2
Thanks
Teena
CodePudding user response:
Here is one way. Group by 'Fullname' and, if the group has more than one row, create the sequence with row_number(). By default, case_when returns NA for any case that matches no condition, so names that occur only once get NA.
library(dplyr)
df1 <- df1 %>%
  group_by(Fullname) %>%
  # number the rows within each name, but only when the name occurs more than once
  mutate(Number_of_repetition = case_when(n() > 1 ~ row_number())) %>%
  ungroup()
-output
df1
# A tibble: 16 × 2
Fullname Number_of_repetition
<chr> <int>
1 Peter 1
2 Peter 2
3 Alison NA
4 Warren NA
5 Jack 1
6 Jack 2
7 Jack 3
8 Jack 4
9 Susan 1
10 Susan 2
11 Henry 1
12 Walison NA
13 Tinder 1
14 Peter 3
15 Henry 2
16 Tinder 2
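For comparison, the same counting can be sketched in base R without dplyr. This is not the approach used in the answer above: it swaps group_by/row_number for ave(), and the names `nm`, `idx`, `size` and `people` are illustrative assumptions only.

```r
# Example names, copied from the data used in this answer
nm <- c("Peter", "Peter", "Alison", "Warren", "Jack", "Jack", "Jack", "Jack",
        "Susan", "Susan", "Henry", "Walison", "Tinder", "Peter", "Henry", "Tinder")
people <- data.frame(Fullname = nm)

# Running count of each name, kept in the original row order
idx <- ave(seq_along(nm), nm, FUN = seq_along)
# Total number of rows for each name
size <- ave(seq_along(nm), nm, FUN = length)

# Keep the running count only for names that occur more than once
people$Number_of_repetition <- ifelse(size > 1, idx, NA_integer_)
people
```

ave() applies the function within each group and returns the results in the original order, so no sorting or re-joining is needed.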
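If we need a third column that combines the name and its count, use unite on the updated data from the previous step. With na.rm = TRUE the NA counts are dropped, and remove = FALSE keeps the original columns.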
library(tidyr)
df1 %>%
  unite(FullNameRep, Fullname, Number_of_repetition, sep = "", na.rm = TRUE, remove = FALSE)
-output
# A tibble: 16 × 3
FullNameRep Fullname Number_of_repetition
<chr> <chr> <int>
1 Peter1 Peter 1
2 Peter2 Peter 2
3 Alison Alison NA
4 Warren Warren NA
5 Jack1 Jack 1
6 Jack2 Jack 2
7 Jack3 Jack 3
8 Jack4 Jack 4
9 Susan1 Susan 1
10 Susan2 Susan 2
11 Henry1 Henry 1
12 Walison Walison NA
13 Tinder1 Tinder 1
14 Peter3 Peter 3
15 Henry2 Henry 2
16 Tinder2 Tinder 2
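If tidyr is not available, the combined column can probably be approximated in base R with paste0. The sketch below is an assumption, not part of the answer above; the vectors `fullname` and `rep_no` are hypothetical stand-ins for the two columns.

```r
# Hypothetical small input: names plus a count that is NA for names seen once
fullname <- c("Peter", "Peter", "Alison", "Jack", "Jack")
rep_no   <- c(1L, 2L, NA, 1L, 2L)

# Append the count where it exists; keep the bare name otherwise
full_name_rep <- ifelse(is.na(rep_no), fullname, paste0(fullname, rep_no))
full_name_rep
```

The is.na() check plays the role of na.rm = TRUE: without it, paste0 would produce the literal string "AlisonNA".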
data
df1 <- structure(list(Fullname = c("Peter", "Peter", "Alison", "Warren",
"Jack", "Jack", "Jack", "Jack", "Susan", "Susan", "Henry", "Walison",
"Tinder", "Peter", "Henry", "Tinder")), row.names = c(NA, -16L
), class = "data.frame")
CodePudding user response:
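Here is an alternative, solved with help from akrun (sum() condition in ifelse statement). Because row_number() starts at 1, sum(newcol) is greater than 1 only when the name occurs more than once.
library(dplyr)
df1 %>%
  group_by(Fullname) %>%
  mutate(newcol = row_number(),
         # keep the running count only for names that occur more than once
         newcol = if (sum(newcol) > 1) newcol else NA) %>%
  ungroup()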
Fullname newcol
<chr> <int>
1 Peter 1
2 Peter 2
3 Alison NA
4 Warren NA
5 Jack 1
6 Jack 2
7 Jack 3
8 Jack 4
9 Susan 1
10 Susan 2
11 Henry 1
12 Walison NA
13 Tinder 1
14 Peter 3
15 Henry 2
16 Tinder 2
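As a quick sanity check on the sum() condition, the base R sketch below compares it with the n() > 1 test from the first answer for a few group sizes. The variable names are illustrative, and seq_len(n) stands in for what row_number() yields within a group of n rows.

```r
# For a group of n rows, row_number() yields 1, 2, ..., n, i.e. seq_len(n)
group_sizes <- 1:5

# Condition used in this answer: sum of the row numbers exceeds 1
by_sum <- sapply(group_sizes, function(n) sum(seq_len(n)) > 1)
# Condition used in the first answer: the group has more than one row
by_size <- group_sizes > 1

# TRUE if both conditions agree for every group size tried
identical(by_sum, by_size)
```

Both conditions pick out the same groups, so the choice between them is mostly a matter of readability.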