I have a vector called chapt1. I want to reorganize this vector into a data frame, df1, such that:
- The integer (the one in parentheses) beginning each verse is printed in the first column of df1.
- The adjoining text is printed in the next column on the same row.
> chapt1 <- ("1. The Grand Opening (1) The black cat jumped over the lazy rabbit. (2) Salt has no taste (3) The grandmaster mentors his disciples (4) Generation of miracles. (5) Are we there yet: opening the first stage in the dungeon.")
Desired result:
1 The black cat jumped over the lazy rabbit.
2 Salt has no taste
3 The grandmaster mentors his disciples
4 Generation of miracles.
5 Are we there yet: opening the first stage in the dungeon.
Note: This is just a portion of the original file.
CodePudding user response:
We may use
library(stringr)
trimws(str_remove(str_extract_all(chapt1, "\\(\\d+\\)[^.\\(]+")[[1]], "^\\(\\d+\\)\\s+"))
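The call above returns only the verse text. If the verse number is also needed in the first column, as the question asks, one possible sketch uses str_match_all with two capture groups. The column names verse and text are my own choice, and the pattern assumes that no verse contains an opening parenthesis of its own:

```r
library(stringr)

chapt1 <- "1. The Grand Opening (1) The black cat jumped over the lazy rabbit. (2) Salt has no taste (3) The grandmaster mentors his disciples (4) Generation of miracles. (5) Are we there yet: opening the first stage in the dungeon."

# Group 1 captures the digits inside "(n)"; group 2 captures everything
# up to the next "(" or the end of the string.
# Assumption: verse text never contains "(" itself.
m <- str_match_all(chapt1, "\\((\\d+)\\)\\s*([^(]+)")[[1]]

# Column 1 of m is the full match, columns 2 and 3 are the capture groups.
# The column names "verse" and "text" are illustrative, not from the question.
df1 <- data.frame(
  verse = as.integer(m[, 2]),
  text  = str_squish(m[, 3]),
  stringsAsFactors = FALSE
)
df1
```

Unlike the pattern in the one-liner, this one does not stop at a period, so the trailing full stops are kept.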
CodePudding user response:
Here is another tidyverse approach:
We separate the rows with the regex '\\(\\d\\)', remove the first row (the chapter heading) with filter, and use str_squish to remove spaces at the beginning and end:
library(tidyverse)
as_tibble(chapt1) %>%
separate_rows(value, sep='\\(\\d\\)') %>%
filter(row_number() > 1) %>%
mutate(value = str_squish(value))
value
<chr>
1 The black cat jumped over the lazy rabbit.
2 Salt has no taste
3 The grandmaster mentors his disciples
4 Generation of miracles.
5 Are we there yet: opening the first stage in the dungeon.
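The tibble holds a single text column. For the two-column df1 described in the question, a base R sketch without extra packages might look like this. The column names verse and text are my own choice, and it assumes every "(n)" in the string is a verse marker:

```r
chapt1 <- "1. The Grand Opening (1) The black cat jumped over the lazy rabbit. (2) Salt has no taste (3) The grandmaster mentors his disciples (4) Generation of miracles. (5) Are we there yet: opening the first stage in the dungeon."

# Find every "(n)" marker. Assumption: each one starts a new verse.
pos <- gregexpr("\\([0-9]+\\)", chapt1)

# The markers themselves, e.g. "(1)", "(2)", ...
markers <- regmatches(chapt1, pos)[[1]]

# invert = TRUE returns the text between the markers;
# the first piece is the chapter heading, so it is dropped below.
pieces <- regmatches(chapt1, pos, invert = TRUE)[[1]]

# Column names "verse" and "text" are illustrative, not from the question.
df1 <- data.frame(
  verse = as.integer(gsub("[()]", "", markers)),
  text  = trimws(pieces[-1]),
  stringsAsFactors = FALSE
)
df1
```

Because the pattern uses [0-9]+ instead of a single digit, verse numbers of 10 and above in the full file should be picked up as well.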