Extract values from unique keys in R from a JSON file

I am trying to convert multiple JSON files to CSVs. All the files have this kind of structure:

{
  "id": "ob6",
  "class": "objective",
  "type": "objective",
  "content": {
    "title": "s13",
    "image": null,
    "image_svg": "https://s.svg",
    "images": {
      "thumbnail_256": "catsobjective_nl_complete_a1_26_256.jpg"
    },
    "description": "737a1e0",
    "color_1": "FDE0BA",
    "color_2": "FF9240",
    "bucket": 2
  },
  "structure": [ a lot of things.
  ],
  "translation_map": {
    "str_empty": {
      "nl": {},
      "en": {}
    },
    "str_eca97dcb": {
      "nl": {
        "value": "rkt",
        "audio": "lalasqw808007.mp3"
      },
      "en": {
        "value": "suqwet"
      }
    },
    "str_7679qwc524": {
      "nl": {
        "value": "Martijn koopt groenten in de supermarkt.",
        "audio": "lalastqw8026.mp3"
      },
      "en": {
        "value": "Marqwearket."
      }
    },
    "str_6qjkef9236": {
      "nl": {
        "value": "kj",
        "audio": "kjiuygh.mp3"
      },
      "en": {
        "value": "tewewn"
      }
    },
    "str_9b4187uij-d256a1b09f4a": {
      "nl": {
        "value": "Majhghrkt.",
        "audio": "lalastu1609402321.mp3"
      },
      "en": {
        "value": "Markjhgket."
      }
    },
    "str_e817lkiuyg1ea0": {
      "nl": {
        "value": "Mellhgarkt[/k]. ",
        "audio": "lalastuhjg040.mp3"
      },
      "en": {
        "value": "Melliouyghmarket."
      }
    },
    "str_e44fbkiuyghv4e8": {
      "nl": {
        "value": "iuhjbn"
      },
      "en": {
        "value": ""
      }
    },
    "str_a4d72lkuiuhgfff": {
      "nl": {
        "value": "uiyghbn"
      },
      "en": {
        "value": ""
      }
    },
    "str_5ed8kjuiyghvb79917": {
      "nl": {
        "value": "kjhvbnm,",
        "audio": "lalastukjghv852.mp3"
      },
      "en": {
        "value": "oiuhn ew"
      }
    },
    "str_ded3d8e3-f53loiuyghf": {
      "nl": {
        "value": "Doiuyguro.",
        "audio": "lalastu1liouhj6634.mp3"
      },
      "en": {
        "value": "Touihjbnro."
      }
    }
    }
  }
}

From translation_map, I need to get the nl value, nl audio and en value as columns in a data frame.

So far I have this code written:

library(jsonlite)
library(tidyverse)


# pattern is a regular expression, so escape the dot and anchor at the end
files <- list.files(path = ".", pattern = "\\.json$", all.files = FALSE,
                    full.names = FALSE)

data <- fromJSON(files[1])

I tried to use fromJSON(files[1], flatten = TRUE) but that didn't work.

The unique key names throw me off here. Also, I am not sure what to do about the

"str_empty": {
      "nl": {},
      "en": {}

part of the json file.

CodePudding user response：

The JSON you supplied does not seem to be valid. I found two problems:

"structure": [ a lot of things. ],

This throws a parse error, because the text between the square brackets is not a valid JSON value. If you replace the brackets with quotation marks, so that the whole thing becomes a string, it parses fine.

Also, you seem to have one curly bracket } too many at the end. This might be the reason why you can't read the file and end up with an error message.
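Both problems can be checked before parsing. The following is only a minimal sketch, assuming jsonlite's validate() function; the three strings are hypothetical, cut-down stand-ins for the file in the question.

```r
library(jsonlite)

# Hypothetical stand-ins for the problematic parts of the file
bad_array   <- '{"structure": [ a lot of things. ]}'
as_string   <- '{"structure": "a lot of things."}'
extra_brace <- '{"id": "ob6"}}'

# validate() returns a logical; when it is FALSE, the parser message
# is attached to the result as the "err" attribute
bad_ok    <- validate(bad_array)
string_ok <- validate(as_string)
brace_ok  <- validate(extra_brace)

print(bad_ok)
print(string_ok)
print(brace_ok)
```

Running this on the real file contents (for example after reading them with readLines and pasting the lines together) should point to where the parser gives up.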

The empty key-value pairs are in fact not a problem: jsonlite reads them as empty named lists.

You can then try indexing the resulting list (fromJSON returns a nested list here, not a data.frame) via

dat$translation_map[[1]]

dat$translation_map[[2]][[1]]

and with a few loops you should be able to pull out the values that are of interest to you.
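The loop idea above could be sketched roughly as follows. This is an illustration under assumptions, not a tested solution for the real files: json_txt is a hypothetical, trimmed-down sample in the same shape as the question's data, pick() is a made-up helper, and all values are assumed to be strings. Parsing with simplifyVector = FALSE keeps everything as plain nested lists, so missing fields (including the empty str_empty entry) can be turned into NA.

```r
library(jsonlite)

# Hypothetical sample with the same shape as translation_map in the question
json_txt <- '{
  "translation_map": {
    "str_empty": {"nl": {}, "en": {}},
    "str_a": {"nl": {"value": "rkt", "audio": "a.mp3"}, "en": {"value": "suqwet"}},
    "str_b": {"nl": {"value": "iuhjbn"}, "en": {"value": ""}}
  }
}'

# Keep the result as nested lists instead of letting jsonlite simplify it
dat <- fromJSON(json_txt, simplifyVector = FALSE)
tm  <- dat$translation_map

# Hypothetical helper: return the field if it exists, otherwise NA
pick <- function(x, field) {
  if (is.null(x[[field]])) NA_character_ else x[[field]]
}

# One row per key, one column per wanted field
result <- data.frame(
  key      = names(tm),
  nl_value = vapply(tm, function(e) pick(e$nl, "value"), character(1)),
  nl_audio = vapply(tm, function(e) pick(e$nl, "audio"), character(1)),
  en_value = vapply(tm, function(e) pick(e$en, "value"), character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(result)
```

Rows that are entirely NA, such as str_empty, can be dropped afterwards if they are not wanted in the CSV.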

You could also try to flatten the resulting nested list into a named vector and use grep on its names to find desired values:

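library(jsonlite)
dat <- fromJSON("your correct file.json")

# Flatten the nested list into a named vector; the names are the dot-joined key paths
dat2 <- unlist(dat)
# fixed = TRUE matches the dot literally instead of as a regex wildcard
indices <- grep("nl.value", names(dat2), fixed = TRUE)
dat2[indices]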

(But maybe there is a more elegant way...)
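One caveat with the unlist approach, shown here as a small sketch on a hypothetical sample (json_txt is made up, not the real file): entries that are missing simply disappear from the flattened vector, so the nl value, nl audio and en value vectors can end up with different lengths and cannot be bound into columns directly. The key is still available in the names, so it can be used to match them up again.

```r
library(jsonlite)

# Hypothetical sample: str_b has no audio, str_empty has nothing at all
json_txt <- '{
  "translation_map": {
    "str_empty": {"nl": {}, "en": {}},
    "str_a": {"nl": {"value": "rkt", "audio": "a.mp3"}, "en": {"value": "suqwet"}},
    "str_b": {"nl": {"value": "iuhjbn"}, "en": {"value": ""}}
  }
}'

flat <- unlist(fromJSON(json_txt))

# Names look like "translation_map.str_a.nl.value"
print(names(flat))

nl_values <- flat[grep(".nl.value", names(flat), fixed = TRUE)]
nl_audio  <- flat[grep(".nl.audio", names(flat), fixed = TRUE)]
en_values <- flat[grep(".en.value", names(flat), fixed = TRUE)]

# Different lengths: the missing audio entry is simply not there
print(length(nl_values))
print(length(nl_audio))
```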

CodePudding user response：

The elements in your JSON have different lengths: 1, 8 or 10.

test <- fromJSON("C:/GP/trash/test.json",flatten = T)
> summary(test)
                Length Class  Mode     
id               1     -none- character
class            1     -none- character
type             1     -none- character
content          8     -none- list     
structure        1     -none- character
translation_map 10     -none- list

You can convert the elements with the same length into a data frame:

test[c('id', 'class', 'type', "structure")] %>% as.data.frame()
   id     class      type        structure
1 ob6 objective objective a lot of things.

But you should decide what to do with the other ones.
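For translation_map, one possible choice is to write one CSV per JSON file with one row per key. The sketch below is only an assumption about what the output should look like: translation_table() is a hypothetical helper, the two demo files are made up and written to a temporary directory, and every file is assumed to contain a non-empty translation_map whose values are strings.

```r
library(jsonlite)

# Hypothetical helper: one row per translation_map key, NA where a field is absent
translation_table <- function(path) {
  tm <- fromJSON(path, simplifyVector = FALSE)$translation_map
  pick <- function(x, field) {
    if (is.null(x[[field]])) NA_character_ else x[[field]]
  }
  data.frame(
    key      = names(tm),
    nl_value = vapply(tm, function(e) pick(e$nl, "value"), character(1)),
    nl_audio = vapply(tm, function(e) pick(e$nl, "audio"), character(1)),
    en_value = vapply(tm, function(e) pick(e$en, "value"), character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up demo files in a temporary directory
demo_dir <- tempfile("json_demo")
dir.create(demo_dir)
writeLines(
  '{"id": "ob6", "translation_map": {"str_a": {"nl": {"value": "rkt", "audio": "a.mp3"}, "en": {"value": "suqwet"}}}}',
  file.path(demo_dir, "one.json")
)
writeLines(
  '{"id": "ob7", "translation_map": {"str_empty": {"nl": {}, "en": {}}}}',
  file.path(demo_dir, "two.json")
)

# Convert every .json file into a .csv file next to it
files <- list.files(demo_dir, pattern = "\\.json$", full.names = TRUE)
for (f in files) {
  write.csv(translation_table(f), sub("\\.json$", ".csv", f), row.names = FALSE)
}

print(list.files(demo_dir))
```

For the real data, demo_dir would be replaced by the folder that holds the JSON files, once the files themselves are valid JSON.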