Could anyone please explain how I can extract a substring from each element of a character vector that contains special characters?
I'm working with a vector like this:
txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)
> txt
[1] "{\"label\":\"Describes me best\",\"multiplier\":1}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
[3] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
[5] "{\"label\":\"Describes me best\",\"multiplier\":1}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
[7] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Describes me best\",\"multiplier\":1}"
[9] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
I'd like to extract only the "Describes me best" and "Somewhat describes me" parts, dropping the rest.
I tried to adapt the str_match() solution presented at https://stackoverflow.com/a/39086448/6925293, but I can't make it work, probably because of the multiple special characters ({\" etc.).
CodePudding user response:
Since these are JSON strings, you can use the jsonStrings package:
library(jsonStrings)
x <- "{\"label\":\"Describes me best\",\"multiplier\":1}"
jstring <- jsonString$new(x)
jstring$at("label")
# "Describes me best"
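To apply the same idea to the whole vector rather than to a single string, one option is the jsonlite package. This is an assumption on my part: jsonlite is not used in the answer above and has to be installed separately. A minimal sketch on a shortened version of txt:

```r
library(jsonlite)  # assumption: jsonlite is installed; it is not part of the original answer

# Shortened version of the vector from the question
txt <- c(
  "{\"label\":\"Describes me best\",\"multiplier\":1}",
  "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)

# Parse each element as JSON and keep only its "label" field
labels <- vapply(txt, function(s) fromJSON(s)$label, character(1), USE.NAMES = FALSE)
labels
```

Parsing the strings as JSON avoids hand-written patterns, so it should keep working if the order of the keys changes.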
CodePudding user response:
Is this what you need?
txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}",
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)
# gsub() captures whatever the part of the pattern inside ( ) matches; \\1 in the replacement refers to that capture
gsub(pattern = '.*"(.*)",.*', replacement = "\\1", x = txt)
#> [1] "Describes me best" "Somewhat describes me" "Somewhat describes me"
#> [4] "Somewhat describes me" "Describes me best" "Somewhat describes me"
#> [7] "Somewhat describes me" "Describes me best" "Somewhat describes me"
#> [10] "Somewhat describes me"
Created on 2022-09-26 with reprex v2.0.2
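The backslashes in the printed vector are only R's escape notation for the double quotes. They are not part of the strings themselves, so the pattern does not need to match them. As a slightly stricter variant, the base R sketch below anchors on the "label" key instead of relying on greedy matching. It assumes every element has a "label" key and that labels contain no escaped quotes, as in the sample data:

```r
# Shortened version of the vector from the question
txt <- c(
  "{\"label\":\"Describes me best\",\"multiplier\":1}",
  "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)

# writeLines() shows the actual characters of the string: quotes, but no backslashes
writeLines(txt[1])

# Confirm that no element contains a literal backslash
has_backslash <- grepl("\\", txt, fixed = TRUE)

# Capture everything between "label":" and the next double quote
labels <- sub('.*"label":"([^"]*)".*', "\\1", txt)
labels
```

Because [^"]* stops at the first closing quote, this pattern does not depend on where the comma or the other keys appear in the string.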