How to turn rownames of a dataframe in format `col1 col3 col4` into a column of numerical vector


How can I turn the row names of a data frame, which are in the format `col1 col3 col4`, into a column of numeric vectors such as c(1, 3, 4)? The data frame looks like:


I tried:

> bel_bpa_df <- rownames_to_column(as.data.frame(bel_bpa), var = "SNPs") %>% 
    mutate(SNPs = str_split(SNPs, "\\ ")) %>%
    mutate(SNPs = unlist(SNPs)) %>%
    mutate(SNPs = parse_number(SNPs))
Error in `mutate()`:
! Problem while computing `SNPs = unlist(SNPs)`.
✖ `SNPs` must be size 30 or 1, not 4944.
Run `rlang::last_error()` to see where the error occurred.

but I don't understand the error. Why can't I use unlist here? Can anyone help me understand the error at a deeper level, i.e. what am I missing about this package, about how R works, or about how regex works? I'm not looking for a quick fix.

CodePudding user response:

Is this what you need?

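library(tidyverse)  # attaches tibble, dplyr and stringr (and the %>% pipe)

test %>% 
  rownames_to_column() %>%
  # "\\d+" matches each run of one or more digits
  mutate(rowname = str_extract_all(rowname, "\\d+"))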
   rowname v1 v2
1 1, 3, 67  1  3
2 4, 5, 77  3  4
3    12, 6  5  9

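Here we use str_extract_all to match and extract only the digits from the rowname column. The result is a list-column: each row holds a character vector of matches, and the commas shown above are only how that vector is printed, not part of the data.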
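As for why unlist fails inside mutate: every column that mutate creates must have one entry per row (or a single value that gets recycled). A list of matches has one element per row, but flattening it with unlist yields one element per match, which is why the error reports 4944 values for 30 rows. A minimal sketch of the same mismatch, assuming the three row names from the test data in this answer (the names `rn` and `digits` are only for illustration):

```r
library(stringr)

# Same row names as the test data in this answer
rn <- c("col1   col3   col67", "col4   col5   col77", "col12   col6")

# str_extract_all returns a list: one character vector per input string
digits <- str_extract_all(rn, "\\d+")

length(digits)          # 3: one element per row, so it fits in a column
lengths(digits)         # 3 3 2: number of matches in each row
length(unlist(digits))  # 8: flattened, no longer lines up with 3 rows
```

So the list itself is a valid column, while the unlisted vector is not unless the rows are lengthened to match it.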

Test data:

test <- data.frame(v1 = c(1,3,5),
                   v2 = c(3,4,9))
row.names(test) <- c("col1   col3   col67", "col4   col5   col77", "col12   col6")


To get one row per extracted number, so that the column can be converted to numeric, lengthen the data frame on rowname with unnest_longer:

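test %>% 
  rownames_to_column() %>%
  # "\\d+" matches each run of one or more digits
  mutate(rowname = str_extract_all(rowname, "\\d+")) %>% 
  # one row per extracted match; v1 and v2 are repeated
  unnest_longer(rowname) %>%
  mutate(rowname = as.numeric(rowname))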
# A tibble: 8 × 3
  rowname    v1    v2
    <dbl> <dbl> <dbl>
1       1     1     3
2       3     1     3
3      67     1     3
4       4     3     4
5       5     3     4
6      77     3     4
7      12     5     9
8       6     5     9
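If you would rather keep one row per original row, with each cell holding a numeric vector such as c(1, 3, 67) as the question describes, one possible sketch is to convert each list element in place instead of unnesting. Using lapply with as.numeric here is just one option among several, and `res` is a name chosen for illustration:

```r
library(dplyr)
library(tibble)
library(stringr)

test <- data.frame(v1 = c(1, 3, 5),
                   v2 = c(3, 4, 9))
row.names(test) <- c("col1   col3   col67", "col4   col5   col77", "col12   col6")

res <- test %>%
  rownames_to_column() %>%
  # keep the list-column, but turn each character vector into a numeric one
  mutate(rowname = lapply(str_extract_all(rowname, "\\d+"), as.numeric))

nrow(res)         # still 3 rows
res$rowname[[1]]  # numeric vector: 1 3 67
```

The trade-off is that a list-column cannot be used directly in arithmetic or filters; each element has to be accessed with [[ ]] or mapped over.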