Create data.frame in R with multiple list columns (that hold one df each) which saves to json where

Time:07-14

Despite the super long name of the question, here it is:

I have these two data frames, each with a single list column that holds another data frame:

# df1
structure(list(longest_hw = list(structure(list(start = structure(12266, class = "Date"), 
    end = structure(12294, class = "Date"), max_temp_in_hw = 37.5, 
    duration = 28), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-1L)))), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"
))

# which looks like this
# A tibble: 1 × 1
  longest_hw      
  <list>          
1 <tibble [1 × 4]>

# df2
structure(list(highest_temp = list(structure(list(doy = 220L, 
    cell_id = 68977L, year = "2013", temp = 39.3), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)))), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

# which looks like this
# A tibble: 1 × 1
  highest_temp    
  <list>          
1 <tibble [1 × 4]>

I want to "put them together" so that I can write them into one final JSON file, where the top-level column names (longest_hw and highest_temp) are the keys and each value is another object. In that inner object, the column names one level below (for example start, end, max_temp_in_hw, duration) are the keys and the cell contents are the values.

What I currently do is the following:

library(dplyr)     # bind_cols()
library(jsonlite)  # toJSON()

# first I bind the two top-level dataframes (df1 and df2 from above) together using bind_cols
df = bind_cols(df1, df2)

# and then convert it to a class json
df_json = toJSON(df)

# and finally write it out
write(df_json, "path_to_file")

The result looks like this:

[
  {
    "longest_hw": [
      {
        "start": "2003-08-02",
        "end": "2003-08-30",
        "max_temp_in_hw": 37.5,
        "duration": 28
      }
    ],
    "highest_temp": [
      { "doy": 220, "cell_id": 68977, "year": "2013", "temp": 39.3 }
    ]
  }
]

Now the thing is that I want the values of the keys longest_hw and highest_temp to be an object and not an array, so that it would look like this:

[
  {
    "longest_hw": {
      "start": "2003-08-02",
      "end": "2003-08-30",
      "max_temp_in_hw": 37.5,
      "duration": 28
    },
    "highest_temp": {
      "doy": 220,
      "cell_id": 68977,
      "year": "2013",
      "temp": 39.3
    }
  }
]

But I did not find a way to do so. Essentially this is about creating nested JSON objects from R. I know there are resources on this, but I simply could not figure it out.

Answer:

This takes a bit of unwinding:

jsonlite::toJSON(
  list(lapply(c(df1, df2), function(z) lapply(z, as.list)[[1]])),
  pretty = TRUE, auto_unbox = TRUE)
# [
#   {
#     "longest_hw": {
#       "start": "2003-08-02",
#       "end": "2003-08-30",
#       "max_temp_in_hw": 37.5,
#       "duration": 28
#     },
#     "highest_temp": {
#       "doy": 220,
#       "cell_id": 68977,
#       "year": "2013",
#       "temp": 39.3
#     }
#   }
# ] 
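
Why this works can be seen on a smaller case. The sketch below assumes only that jsonlite is installed; the two-column `inner` data frame is a made-up stand-in for the nested tibbles in the question. jsonlite writes a data frame row by row, so even a single row comes out as an array of one object, whereas a named list comes out as a plain object.

```r
library(jsonlite)

# Hypothetical one-row data frame standing in for a nested tibble
inner <- data.frame(doy = 220L, temp = 39.3)

# A data frame is serialised row-wise: one row still gives an array
as_array <- toJSON(list(highest_temp = inner), auto_unbox = TRUE)

# as.list() turns the row into a named list, which gives an object
as_object <- toJSON(list(highest_temp = as.list(inner)), auto_unbox = TRUE)

print(as_array)
print(as_object)
```

The `as.list` call in the answer plays the same role for each nested tibble, and `auto_unbox = TRUE` keeps the length-one values from being wrapped in arrays of their own.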

My method: I took your desired output JSON, ran it through parse_json, and then looked for a combination of c, lapply, and as.list that reproduces the same structure.
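
That reverse-engineering step might look roughly like the following. This is only a sketch: the `target` string is a shortened, hand-written version of the desired output, and `parsed` and `roundtrip` are names chosen here for illustration.

```r
library(jsonlite)

# Shortened version of the desired JSON (assumption: trimmed for brevity)
target <- '[{"longest_hw": {"start": "2003-08-02", "duration": 28},
             "highest_temp": {"doy": 220, "temp": 39.3}}]'

# parse_json() keeps everything as nested lists by default
parsed <- parse_json(target)

# Inspect the shape that the R object has to mimic:
# an unnamed list of length 1 holding a named list of named lists
str(parsed)

# Writing the parsed structure back out with auto_unbox = TRUE
# should give JSON that parses to the same structure again
roundtrip <- toJSON(parsed, auto_unbox = TRUE)
```

Once `str()` shows an unnamed outer list wrapping a named list of named lists, the task reduces to building that same shape from df1 and df2.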
