Str_extract_all where replacement is keeping value between two character strings

Time:08-31

First time posting to SO, so apologies in advance. I am also quite new to R, so this may not be possible.

I am trying to format a string (extracted from a JSON) where there is a value between two curly braces, like so:

{@link value1}

I am trying to replace the {@link value1} with [[value1]] so that it will work as a link in my markdown file.

I cannot just replace the opening brace and then the closing brace, as there is also {@b value2}, which should be formatted to **value2**.

I have cobbled together a str_replace that works if only one link replacement is needed in a string, but I am running into an issue when there are two, like so:

str <- c("This is the first {@link value1} and this is the second {@link value2}")

The actual potential strings are much more varied than this

My plan was to build a function that takes the type of pattern needed (bold or link) as input and then pastes the opening and closing strings around the extracted value to form the replacement.

However, that has left me with either

This is the first [[ value1 ]][[ value2 ]] and this is the second [[ value1 ]][[ value2 ]]
or
This is the first [[ value1 ]] and this is the second [[ value1 ]]

Is there a more glamorous way of achieving this without searching from where the last } was replaced?

I was looking at the stringr documentation for str_replace, which has an example using a function at the bottom, but I can't decode it well enough to apply it to my case.

What I'm using to extract the value, including the curly braces:

str_extract_all(str, "(\\{@link ).+?(\\})")
[[1]]
[1] "{@link value1}" "{@link value2}"

What I'm using to extract the value, excluding the curly braces and tag:

str_extract_all(str, "(?<=\\{@link ).+?(?=\\})")
[[1]]
[1] "value1" "value2"

User response:

You could use str_replace_all() to perform multiple replacements in one call by passing it a named vector (c(pattern1 = replacement1)). In each replacement, references of the form \\1, \\2, etc. are replaced with the contents of the corresponding capture group created by ().

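library(stringr)

str <- c("This is the first {@link value1} and this is the second {@b value2}")

# \\s+ matches the whitespace after the tag; (.+?) lazily captures the value up to the next }
str_replace_all(str, c("\\{@link\\s+(.+?)\\}" = "[[\\1]]",
                       "\\{@b\\s+(.+?)\\}"    = "**\\1**"))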

# [1] "This is the first [[value1]] and this is the second **value2**"
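
If you would rather go the function route mentioned in the question, str_replace_all() in reasonably recent versions of stringr also accepts a function as the replacement. The following is only a sketch under a few assumptions: the names tag_markup, tag_pattern and tag_to_markdown are made up for illustration, values are assumed not to contain a closing brace, and the function is written to be vectorised so that it does not matter whether stringr calls it once per match or once with all matches.

```r
library(stringr)

# Hypothetical lookup table: tag name -> opening and closing Markdown markers.
tag_markup <- list(
  link = c("[[", "]]"),
  b    = c("**", "**")
)

# One pattern for every {@tag value}: group 1 is the tag, group 2 is the value.
tag_pattern <- "\\{@(\\w+)\\s+(.+?)\\}"

# Replacement function: receives the matched text, e.g. "{@link value1}",
# and returns its Markdown form. Tags missing from tag_markup are returned unchanged.
tag_to_markdown <- function(matches) {
  parts <- str_match(matches, tag_pattern)  # columns: full match, tag, value
  vapply(seq_along(matches), function(i) {
    marks <- tag_markup[[parts[i, 2]]]
    if (is.null(marks)) return(matches[i])
    paste0(marks[1], parts[i, 3], marks[2])
  }, character(1))
}

x <- "This is the first {@link value1} and this is the second {@link value2}"
str_replace_all(x, tag_pattern, tag_to_markdown)
# [1] "This is the first [[value1]] and this is the second [[value2]]"
```

Supporting another tag then only means adding an entry to the lookup table, rather than writing another pattern.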