I have a dataframe with 10000 rows.
Author Value
aaa 111
aaa 112
bbb 156
bbb 165
ccc 543
ccc 256
Each author has 4 rows, so I have 2500 authors.
I would like to replace each author string with a numeric ID, ideally with the tidyverse.
Expected output
Author Value
1 111
1 112
2 156
2 165
3 543
3 256
---------
2500 451
2500 234
Thanks!
CodePudding user response:
Use match and unique:
match(dat$Author, unique(dat$Author))
# [1] 1 1 2 2 3 3
Reassign that back to the original column or a new one, your call.
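To see why this works, here is a minimal sketch that splits the one-liner into its two steps. The vector author and the names lookup, id_first_seen and id_sorted are made up for illustration and are not part of the question's data. unique keeps the distinct values in order of first appearance, and match returns each element's position in that vector, so the IDs follow the order in which authors first show up.

```r
# Toy vector (hypothetical), deliberately not in alphabetical order
author <- c("bbb", "bbb", "aaa", "aaa", "ccc", "ccc")

# Step 1: distinct values, kept in order of first appearance
lookup <- unique(author)
# "bbb" "aaa" "ccc"

# Step 2: position of each element within the lookup vector
id_first_seen <- match(author, lookup)
# 1 1 2 2 3 3

# Alternative numbering: factor levels are sorted by default,
# so these IDs follow alphabetical order instead
id_sorted <- as.integer(factor(author))
# 2 2 1 1 3 3
```

If the authors are already sorted, as in the question's sample, both numberings agree. They only differ when the first-seen order is not the sorted order, so pick whichever matches the numbering you expect.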
If you want to put this in a dplyr pipe, then just
library(dplyr)

dat %>%
  mutate(Author = match(Author, unique(Author)))
(as akrun posted in their comment at the same time I was finishing this answer :-).
Data
dat <- structure(
  list(
    Author = c("aaa", "aaa", "bbb", "bbb", "ccc", "ccc"),
    Value = c(111L, 112L, 156L, 165L, 543L, 256L)
  ),
  class = "data.frame",
  row.names = c(NA, -6L)
)