I have long inputs that I've imported with readChar(), each of which prints like this: [1] "1100?001?01??110?10?????101011?111????11?1?1????????1??01?101??01?????1??????1??111??0?1?11?1110?".
I want to split each string into its individual characters ("1, 1, 0, 0, ?, 0, 0, 1...") and turn each resulting vector into a column of a data frame, so that I can compare the input strings easily.
I've tried separating the characters using strsplit(), but because they aren't comma-separated I can't seem to do much with the result.
I've also tried turning the strsplit() output into a list. This added the whole string to every row of my data frame instead of putting one character on each row, in order.
I can't figure this out. Please help.
CodePudding user response:
Something like this adds one character per row:
library(tidyverse)
as_tibble_col(unlist(str_split("1100?001?01??110?10?????101011?111????11?1?1????????1??01?101??01?????1??????1??111??0?1?11?1110?", "")), "col1")
# # A tibble: 97 × 1
# col1
# <chr>
# 1 1
# 2 1
# 3 0
# 4 0
# 5 ?
# 6 0
# 7 0
# 8 1
# 9 ?
# 10 0
# # … with 87 more rows
# # ℹ Use `print(n = ...)` to see more rows
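To get from one string to several strings side by side, as the question asks, a base R sketch along these lines may help. The input strings and the names s1, s2, s3 are made up for illustration, and the approach assumes every string has the same length.

```r
# Hypothetical inputs (shortened); s1, s2, s3 are placeholder names.
inputs <- c(
  s1 = "1100?001?0",
  s2 = "1?00?101?1",
  s3 = "0100?0?1?0"
)

# An empty pattern makes strsplit() split each string into single characters.
# The result is a named list with one character vector per input string.
chars <- strsplit(inputs, "")

# Assumption: all strings have equal length, so they line up as columns.
stopifnot(length(unique(lengths(chars))) == 1)

# One column per input string, one row per character position.
df <- as.data.frame(chars, stringsAsFactors = FALSE)
print(df)
```

If the strings can differ in length, they would need padding (for example with NA) before being combined this way.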
CodePudding user response:
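Here we add a space after each character and then split on that space with separate_rows(). Note that sep = " " is needed: the default separator treats every non-alphanumeric character as a delimiter, which would silently drop the "?" characters.
library(dplyr)
library(tidyr)
x <- "1100?001?01??110?10?????101011?111????11?1?1????????1??01?101??01?????1??????1??111??0?1?11?1110?"
# Append a space to every character, then strip the single trailing space
y <- sub("\\s$", "", gsub("(.{1})", "\\1 ", x))
y %>%
  as_tibble() %>%
  separate_rows(value, sep = " ")
# # A tibble: 97 × 1
#    value
#    <chr>
#  1 1
#  2 1
#  3 0
#  4 0
#  5 ?
#  6 0
#  7 0
#  8 1
#  9 ?
# 10 0
# # ... with 87 more rows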
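Once each string is split into one character per row, comparing two inputs becomes a column-wise operation. The sketch below is one possible way to do that in base R. The two strings are made up, and treating "?" as "unknown, so skip the position" is an assumption about what the comparison should mean.

```r
# Hypothetical inputs (shortened), split into single characters.
s1 <- strsplit("1100?001?0", "")[[1]]
s2 <- strsplit("1?00?101?1", "")[[1]]

cmp <- data.frame(
  pos = seq_along(s1),
  s1 = s1,
  s2 = s2,
  stringsAsFactors = FALSE
)

# Assumption: a position is only comparable when neither string has "?".
cmp$known <- cmp$s1 != "?" & cmp$s2 != "?"

# Flag comparable positions where the two strings disagree.
cmp$differs <- cmp$known & cmp$s1 != cmp$s2

# Show only the rows where the strings differ.
print(cmp[cmp$differs, ])
```

If "?" should instead count as an ordinary character, dropping the known column and comparing s1 != s2 directly would be enough.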