Generate unique names for each id number

Here is my data

# Create the data frame.
mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)
# Print the data frame.
print(mydataframe)

I want to write a function that replaces each distinct value in the id column with a unique label, so that every occurrence of 100 becomes P1, every occurrence of 101 becomes P2, and so on.


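library(dplyr)

# Attempt 1: factor with levels in order of first appearance, converted to integer
mydataframe %>%
  mutate(emp_id = as.integer(factor(emp_id, levels = unique(emp_id))))

# Attempt 2: position of each id among the unique ids
mydataframe %>%
  mutate(emp_id = match(emp_id, unique(emp_id)))

# Attempt 3: group index from cur_group_id()
mydataframe %>%
  group_by(emp_id = factor(emp_id, levels = unique(emp_id))) %>%
  mutate(emp_id = cur_group_id())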

I tried all of these and they work fine, but I still want to see P1, P2, etc. instead of 1, 2, 3.

Note: I call P1, P2, etc. "generated names"; there may be a better term for this, but I am keeping it simple so it is easier to understand.

The expected results are:

   emp_id ID value
1       1 P1     5
2       2 P2     3
3       1 P1     2
4       3 P3     1
5       4 P4     6
6       3 P3     7
7       5 P5     8
8       1 P1     3
9       4 P4     2
10      5 P5     1

Thank you

CodePudding user response:

Solution using dplyr::dense_rank().

library(dplyr)

mydataframe %>%
  mutate(
    emp_id = dense_rank(emp_id),
    ID = paste0("P", emp_id)
  )

   emp_id value ID
1       1     5 P1
2       2     3 P2
3       1     2 P1
4       4     1 P4
5       3     6 P3
6       4     7 P4
7       5     8 P5
8       1     3 P1
9       3     2 P3
10      5     1 P5

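Note that this assigns the new ids by the numerical order of the old ids, not by the order in which they first appear in the rows. That is why 150 becomes P3 and 200 becomes P4 here, whereas the expected output in the question has 200 as P3 and 150 as P4.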
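
If the labels should follow the order of first appearance instead, one option is to reuse the match(emp_id, unique(emp_id)) idea from the question and paste the prefix onto the result. The following is a minimal sketch, assuming dplyr is installed; the name `result` is only illustrative.

```r
library(dplyr)

mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)

result <- mydataframe %>%
  mutate(
    # Position of each id among the distinct ids, in order of first appearance
    emp_id = match(emp_id, unique(emp_id)),
    # Build the label from the new integer id
    ID = paste0("P", emp_id)
  )

print(result)
```

With the sample data, 100, 101, 200, 150 and 600 should map to P1 through P5 in that order, which is the mapping shown in the question's expected results.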

CodePudding user response:

With tidyverse, try an "on the fly" left join to an index of ids:

library(tidyverse)

mydataframe %>% 
    left_join(x = . ,
              y = select( . , emp_id) %>% unique() %>% 
                  mutate(id = paste0("P", row_number())), 
              by = "emp_id") %>% 
    relocate(id, .after = emp_id)

   emp_id id value
1     100 P1     5
2     101 P2     3
3     100 P1     2
4     200 P3     1
5     150 P4     6
6     200 P3     7
7     600 P5     8
8     100 P1     3
9     150 P4     2
10    600 P5     1
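
Since the question asks for a function, the same lookup can also be wrapped in a small base R helper that needs no packages. This is only a sketch: the function name `add_id_labels` and its arguments `id_col` and `prefix` are hypothetical, chosen for illustration. Like the join above, it keeps the original emp_id and adds the label as a new column.

```r
# Hypothetical helper: labels each distinct id in order of first appearance
add_id_labels <- function(df, id_col = "emp_id", prefix = "P") {
  ids <- df[[id_col]]
  # match() gives the position of each id among the distinct ids
  df$ID <- paste0(prefix, match(ids, unique(ids)))
  df
}

mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)

labelled <- add_id_labels(mydataframe)
print(labelled)
```

Passing a different `prefix`, for example add_id_labels(mydataframe, prefix = "EMP"), changes only the label text, not the numbering.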