Extract string from JSON parsed character vector

Time:09-26

Could anyone explain how I can extract a string from a character vector that contains special characters?

I'm working with a vector like this:

txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)

> txt
 [1] "{\"label\":\"Describes me best\",\"multiplier\":1}"       "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [3] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [5] "{\"label\":\"Describes me best\",\"multiplier\":1}"       "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [7] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Describes me best\",\"multiplier\":1}"      
 [9] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"

I'd like to extract only the "Describes me best" and "Somewhat describes me" parts, dropping the rest.

I tried to adapt the str_match() solution presented here: https://stackoverflow.com/a/39086448/6925293, but I can't make it work, probably because of the multiple special characters ({\" etc.).

CodePudding user response:

Since these are JSON strings, you can use the jsonStrings package:

library(jsonStrings)

x <- "{\"label\":\"Describes me best\",\"multiplier\":1}"
jstring <- jsonString$new(x)
jstring$at("label")
# "Describes me best"
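
The example above handles a single string. To get the label from every element of a vector, one option is to parse each element as JSON and pick out the field. A minimal sketch, assuming the jsonlite package is installed (jsonlite is not part of the answer above, and extract_label is a hypothetical helper name used only for illustration):

```r
library(jsonlite)

# Shortened version of the vector from the question
txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}",
         "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}")

# Hypothetical helper: parse one JSON string and return its "label" field
extract_label <- function(s) fromJSON(s)$label

# vapply() checks that every element yields exactly one character value
labels <- vapply(txt, extract_label, character(1), USE.NAMES = FALSE)
labels
```

Parsing is slower than a regular expression, but it does not depend on the order of the keys or on how the quotes are escaped.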

CodePudding user response:

Is this what you need?

  txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
  )
  
  
# gsub() captures the text matched inside the parentheses;
# "\\1" in the replacement refers to that captured group
  gsub(pattern = '.*"(.*)",.*', replacement = "\\1", x = txt)

#>  [1] "Describes me best"     "Somewhat describes me" "Somewhat describes me"
#>  [4] "Somewhat describes me" "Describes me best"     "Somewhat describes me"
#>  [7] "Somewhat describes me" "Describes me best"     "Somewhat describes me"
#> [10] "Somewhat describes me"

Created on 2022-09-26 with reprex v2.0.2
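
The pattern above works because the label is the only value followed by `",` in these strings. A somewhat stricter variant, offered as a sketch rather than a replacement, anchors on the "label" key and stops the capture at the next double quote. It assumes the label contains no escaped quotes and that there is no whitespace around the colon:

```r
# Shortened version of the vector from the question
txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}",
         "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}")

# Capture everything between "label":" and the next double quote;
# sub() replaces the whole match with the captured group
labels <- sub('.*"label":"([^"]*)".*', "\\1", txt)
labels
```

Elements that do not match the pattern are returned unchanged, so it is worth checking the result if the input may contain strings of a different shape.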
