How can I turn the row names of a data frame, in the format col1 col3 col4, into a column of numeric vectors such as c(1,3,4)?
The data frame looks like:
I tried:
> bel_bpa_df <- rownames_to_column(as.data.frame(bel_bpa), var = "SNPs") %>%
    mutate(SNPs = str_split(SNPs, "\\ ")) %>%
    mutate(SNPs = unlist(SNPs)) %>%
    mutate(SNPs = parse_number(SNPs))
Error in `mutate()`:
! Problem while computing `SNPs = unlist(SNPs)`.
✖ `SNPs` must be size 30 or 1, not 4944.
Run `rlang::last_error()` to see where the error occurred.
but I don't understand the error. Why can't I use unlist? Can anyone explain how to understand the error at a deeper level, i.e. what am I missing about this package, about how R works, or about how regex works? I'm not looking for a quick fix.
CodePudding user response:
Is this what you need?
library(dplyr)
library(stringr)
library(tibble)  # provides rownames_to_column()
test %>%
rownames_to_column() %>%
mutate(rowname = str_extract_all(rowname, "\\d+"))
    rowname v1 v2
1 1, 3, 67  1  3
2 4, 5, 77  3  4
3    12, 6  5  9
Here we use str_extract_all to match and extract only the digits from the column rowname. The result is a list column holding one character vector per row; the commas are not part of the data, they are just how R prints the elements of each vector.
Test data:
test <- data.frame(v1 = c(1,3,5),
v2 = c(3,4,9))
row.names(test) <- c("col1 col3 col67", "col4 col5 col77", "col12 col6")
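To make the list structure visible, here is a minimal sketch that runs str_extract_all on the same row names outside of a data frame. The variable names rn and digits are only illustrative and are not part of the original answer.

```r
library(stringr)

# Row names copied from the test data above
rn <- c("col1 col3 col67", "col4 col5 col77", "col12 col6")

# One character vector of digit runs per input string
digits <- str_extract_all(rn, "\\d+")

class(digits)    # "list"
lengths(digits)  # 3 3 2
digits[[3]]      # "12" "6"
```

The list has one element per row name, which is why it fits into a data frame column, while the elements themselves can have different lengths.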
EDIT:
To cast the data frame longer on rowname, so that the column can be converted to numeric, use unnest_longer:
library(tidyr)  # provides unnest_longer()
test %>%
  rownames_to_column() %>%
  mutate(rowname = str_extract_all(rowname, "\\d+")) %>%
  unnest_longer(rowname) %>%
  mutate(rowname = as.numeric(rowname))
# A tibble: 8 × 3
  rowname    v1    v2
    <dbl> <dbl> <dbl>
1       1     1     3
2       3     1     3
3      67     1     3
4       4     3     4
5       5     3     4
6      77     3     4
7      12     5     9
8       6     5     9
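As for why unlist fails inside mutate: a new column must have either one value per row or a single value that gets recycled. unlist flattens the list column into one long vector, so its length no longer matches the number of rows. That is what "must be size 30 or 1, not 4944" is saying for the 30-row data in the question. The sketch below reproduces the same situation on the test data and, as one possible alternative, keeps a numeric vector per row with lapply. The names df, res and df_num are illustrative only.

```r
library(dplyr)
library(stringr)
library(tibble)

test <- data.frame(v1 = c(1, 3, 5), v2 = c(3, 4, 9))
row.names(test) <- c("col1 col3 col67", "col4 col5 col77", "col12 col6")

df <- test %>%
  rownames_to_column() %>%
  mutate(rowname = str_extract_all(rowname, "\\d+"))

nrow(df)                    # 3 rows
length(unlist(df$rowname))  # 8 values once the list is flattened

# mutate() rejects a column whose length is neither nrow(df) nor 1
res <- try(mutate(df, rowname = unlist(rowname)), silent = TRUE)
inherits(res, "try-error")  # TRUE

# Converting element by element keeps the list at length 3
df_num <- mutate(df, rowname = lapply(rowname, as.numeric))
df_num$rowname[[1]]         # 1 3 67
```

Which form is preferable depends on the next step: the list column keeps one row per original row name, while unnest_longer gives one row per number.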