generate unique names for each id number

Time:11-15

Here is my data

# Create the data frame.
mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)
# Print the data frame.
print(mydataframe)

I want to write a function that replaces each id in the id column with a unique label, reusing the same label wherever an id occurs more than once. For example, 100 becomes P1, 101 becomes P2, and so on in order of first appearance.


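library(dplyr)

# Attempt 1: integer codes of a factor whose levels follow first appearance
mydataframe %>%
  mutate(emp_id = as.integer(factor(emp_id, levels = unique(emp_id))))

# Attempt 2: position of each id within the unique ids
mydataframe %>%
  mutate(emp_id = match(emp_id, unique(emp_id)))

# Attempt 3: group by the id and use the group index
mydataframe %>%
  group_by(emp_id = factor(emp_id, levels = unique(emp_id))) %>%
  mutate(emp_id = cur_group_id())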

I tried all of these and they work fine, but I still want to see P1, P2, etc. instead of 1, 2, 3.

Note: I call P1, P2, etc. "generated names". There may be a better term for this, but I kept it simple for clarity.

The expected result is:

   emp_id ID value
1       1 P1     5
2       2 P2     3
3       1 P1     2
4       3 P3     1
5       4 P4     6
6       3 P3     7
7       5 P5     8
8       1 P1     3
9       4 P4     2
10      5 P5     1

Thank you

CodePudding user response:

Solution using dplyr::dense_rank().

library(dplyr)

mydataframe %>%
  mutate(
    emp_id = dense_rank(emp_id),
    ID = paste0("P", emp_id)
  )
   emp_id value ID
1       1     5 P1
2       2     3 P2
3       1     2 P1
4       4     1 P4
5       3     6 P3
6       4     7 P4
7       5     8 P5
8       1     3 P1
9       3     2 P3
10      5     1 P5

Note that this assigns the new ids by the numerical order of the old ids, not by row order, so 150 becomes P3 and 200 becomes P4 even though 200 appears first.
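If the labels should follow order of first appearance instead, as in the expected result in the question, one option is to keep the `match(emp_id, unique(emp_id))` attempt from the question and only add the `paste0()` step. This is a minimal sketch, assuming the same `mydataframe` as above; the name `result` is just for illustration.

```r
library(dplyr)

mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)

result <- mydataframe %>%
  mutate(
    # index of each id among the unique ids, in order of first appearance
    emp_id = match(emp_id, unique(emp_id)),
    # prefix the index with "P" to build the label
    ID = paste0("P", emp_id)
  ) %>%
  # move ID next to emp_id to match the layout in the question
  relocate(ID, .after = emp_id)

print(result)
```

Here 200 gets P3 and 150 gets P4, because 200 is seen first in the data.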

CodePudding user response:

With tidyverse, try an "on the fly" left join to an index of the unique ids:

library(tidyverse)

mydataframe %>% 
    left_join(x = . ,
              y = select( . , emp_id) %>% unique() %>% 
                  mutate(id = paste0("P", row_number())), 
              by = "emp_id") %>% 
    relocate(id, .after = emp_id)

   emp_id id value
1     100 P1     5
2     101 P2     3
3     100 P1     2
4     200 P3     1
5     150 P4     6
6     200 P3     7
7     600 P5     8
8     100 P1     3
9     150 P4     2
10    600 P5     1
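
Since the question asks for a function, the same idea can also be wrapped in a small base R helper without any packages. This is only a sketch: `label_ids` and its `prefix` argument are hypothetical names, not part of dplyr or base R.

```r
# Hypothetical helper: label each distinct id by order of first appearance.
label_ids <- function(ids, prefix = "P") {
  # match() gives the position of each id among the unique ids
  paste0(prefix, match(ids, unique(ids)))
}

mydataframe <- data.frame(
  emp_id = c(100, 101, 100, 200, 150, 200, 600, 100, 150, 600),
  value = c(5, 3, 2, 1, 6, 7, 8, 3, 2, 1)
)

# Add the labels as a new column, keeping the original emp_id
mydataframe$ID <- label_ids(mydataframe$emp_id)
print(mydataframe)
```

Because the helper takes a plain vector, it can be called inside `mutate()` as well, for example `mutate(ID = label_ids(emp_id))`.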