I have a data frame in this format:
Group | Observation |
---|---|
a | 1 |
a | 2 |
a | 3 |
b | 4 |
b | 5 |
c | 6 |
c | 7 |
c | 8 |
I want to create a unique ID column which considers both group and each unique observation within it, so that it is formatted like so:
Group | Observation | Unique_ID |
---|---|---|
a | 1 | 1.1 |
a | 2 | 1.2 |
a | 3 | 1.3 |
b | 4 | 2.1 |
b | 5 | 2.2 |
c | 6 | 3.1 |
c | 7 | 3.2 |
c | 8 | 3.3 |
Does anyone know of any syntax or functions to accomplish this? The formatting does not need to match '1.1' exactly, as long as it identifies the group and each unique observation within it. Thanks in advance.
CodePudding user response:
Another way, using cur_group_id() and row_number():
library(dplyr)
A <- 'Group Observation
a 1
a 2
a 3
b 4
b 5
c 6
c 7
c 8'
df <- read.table(textConnection(A), header = TRUE)
# cur_group_id() numbers the groups; row_number() counts rows within each group
df |>
  group_by(Group) |>
  mutate(Unique_ID = paste0(cur_group_id(), ".", row_number())) |>
  ungroup()
  Group Observation Unique_ID
  <chr>       <int> <chr>
1 a               1 1.1
2 a               2 1.2
3 a               3 1.3
4 b               4 2.1
5 b               5 2.2
6 c               6 3.1
7 c               7 3.2
8 c               8 3.3
CodePudding user response:
library(tidyverse)
df <- read_table("Group Observation
a 1
a 2
a 3
b 4
b 5
c 6
c 7
c 8")
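# Convert Group to an integer code via a factor, then append the Observation
# value itself (not a within-group counter) as the suffix
df %>%
  mutate(unique = Group %>%
           as.factor() %>%
           as.integer() %>%
           paste(., Observation, sep = "."))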
#> # A tibble: 8 x 3
#>   Group Observation unique
#>   <chr>       <dbl> <chr>
#> 1 a               1 1.1
#> 2 a               2 1.2
#> 3 a               3 1.3
#> 4 b               4 2.4
#> 5 b               5 2.5
#> 6 c               6 3.6
#> 7 c               7 3.7
#> 8 c               8 3.8
Created on 2022-07-12 by the reprex package (v2.0.1)
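Note that the output above uses the Observation value as the suffix (2.4, 2.5, ...), which the question allows. If the within-group counter from the question (2.1, 2.2, ...) is preferred, the same factor idea can be combined with ave() in base R. This is only a minimal sketch, not part of the original answer: the data frame is rebuilt by hand so the block is self-contained, and the helper names group_id and within_id are made up for illustration. It also assumes groups should be numbered in order of first appearance, which is why the factor levels are set explicitly (as.factor() alone sorts the levels).

```r
# Rebuild the example data (same values as in the question)
df <- data.frame(
  Group = c("a", "a", "a", "b", "b", "c", "c", "c"),
  Observation = 1:8
)

# Integer code per group, numbered in order of first appearance
group_id <- as.integer(factor(df$Group, levels = unique(df$Group)))

# Running counter within each group: 1, 2, 3, 1, 2, 1, 2, 3
within_id <- ave(seq_along(df$Group), df$Group, FUN = seq_along)

# Combine the two parts into the ID
df$Unique_ID <- paste(group_id, within_id, sep = ".")
df
```

Because the counter comes from row position rather than from Observation, the IDs stay unique within a group even if Observation values repeat.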
CodePudding user response:
Try this:
df |>
  group_by(Group) |>
  mutate(Unique_ID = paste0(cur_group_id(), ".", 1:n()))
Output:
# A tibble: 8 × 3
# Groups:   Group [3]
  Group Observation Unique_ID
  <chr>       <int> <chr>
1 a               1 1.1
2 a               2 1.2
3 a               3 1.3
4 b               4 2.1
5 b               5 2.2
6 c               6 3.1
7 c               7 3.2
8 c               8 3.3