Extract bold and italic text from a text document

Time:11-24

I have text files in which I highlight certain passages in bold and italics. I would like a script that reads a .txt file and exports all the bold or italic text into another text file.

Anyone know a way?

Preferably an R solution, but I can try other solutions.

Mac user

CodePudding user response:

Suppose we have a Markdown-formatted text file in.md and we want to create another Markdown file out.md containing only the italic and bold sections.

Content of file in.md:

# Header

There is *italic* and **bold** text!
There is *another italic* and **another bold** text!

R script:

library(tidyverse)

text <- read_file("in.md")
bold_texts <- text %>%
  # "+" matches one or more non-asterisk characters between the ** markers
  str_extract_all("\\*\\*[^\\*]+\\*\\*") %>%
  purrr::simplify() %>%
  map_chr(~ .x %>% str_remove_all("\\*"))
bold_texts
#> [1] "bold"         "another bold"
italic_texts <-
  text %>%
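  # remove the bold spans first so their asterisks are not matched as italic markers
  str_remove_all(bold_texts %>% map_chr(~ paste0("\\*\\*", .x, "\\*\\*")) %>% paste0(collapse = "|")) %>%
  # "+" matches one or more non-asterisk characters between the * markers
  str_extract_all("\\*[^\\*]+\\*") %>%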
  purrr::simplify() %>%
  map_chr(~ .x %>% str_remove_all("\\*"))
italic_texts
#> [1] "italic"         "another italic"

out_text <- c("#Bold texts:", bold_texts, "#Italic texts:", italic_texts) %>% paste0(collapse = "\n")
cat(out_text)
#> #Bold texts:
#> bold
#> another bold
#> #Italic texts:
#> italic
#> another italic
write_file(out_text, "out.md")

Created on 2021-11-23 by the reprex package (v2.0.1)
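
One caveat with the script above: it pastes the extracted bold texts back into a regular expression, so bold text containing regex metacharacters such as parentheses would not be removed and could then be picked up as italic. A minimal base R sketch that avoids this by removing bold spans with the same pattern used to find them is shown here. The helper name extract_marked() and the sample string are hypothetical, and the sketch assumes asterisk-style Markdown emphasis only (no underscores, no nested emphasis).

```r
# Hypothetical helper: returns the bold and italic texts found in a Markdown string.
extract_marked <- function(text) {
  # Bold: two asterisks, one or more non-asterisk characters, two asterisks.
  bold_pattern <- "\\*\\*[^*]+\\*\\*"
  bold <- regmatches(text, gregexpr(bold_pattern, text))[[1]]

  # Drop the bold spans with the same pattern, so their asterisks
  # cannot be matched as italic markers afterwards.
  without_bold <- gsub(bold_pattern, "", text)

  # Italic: one asterisk, one or more non-asterisk characters, one asterisk.
  italic <- regmatches(without_bold, gregexpr("\\*[^*]+\\*", without_bold))[[1]]

  # Strip the literal asterisks from the matches.
  list(
    bold = gsub("*", "", bold, fixed = TRUE),
    italic = gsub("*", "", italic, fixed = TRUE)
  )
}

# Sample input (an assumption for illustration); the second bold span contains parentheses.
sample_text <- "There is *italic* and **bold** text!\nThere is *another italic* and **a (bold) one**."
result <- extract_marked(sample_text)

# Same output layout as the script above.
out_text <- paste0(
  c("#Bold texts:", result$bold, "#Italic texts:", result$italic),
  collapse = "\n"
)
cat(out_text)
```

To apply this to a file, the text could be read with paste(readLines("in.md"), collapse = "\n") and out_text written with writeLines(out_text, "out.md").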
