R - Subsetting a deeply nested list by parameters that are on the lower levels

Time:09-27

Hi, I have a deeply nested list obtained from a JSON file. Its structure is shown below; only one participant out of several measurements is displayed.

$ :List of 11
..$ participant    :List of 4
.. ..$ code  : chr "c7caf74b-8665-4c73-b610-70ccb3f2ec0c"
.. ..$ gender: chr "MALE"
.. ..$ height: num 0
.. ..$ age   : num 56
..$ startTime      : chr "2020-07-31T10:30:36.145Z"
..$ title          : chr "exercise_template_builtin_sens_knee_title"
..$ numberOfSets   : num 2
..$ numberOfReps   : num 7
..$ duration       : num 6
..$ rest           : num 5
..$ preparationTime: num 3
..$ activityType   : chr "EVALUATION"
..$ exerciseType   : chr "METER"
..$ repetitions    :List of 2
.. ..$ :List of 1
.. .. ..$ measurements:List of 1
.. .. .. ..$ :List of 4
.. .. .. .. ..$ bodyPart: chr "LEG"
.. .. .. .. ..$ side    : chr "LEFT"
.. .. .. .. ..$ device  : chr "SENS"
.. .. .. .. ..$ data    :List of 5
.. .. .. .. .. ..$ deltas        : num [1:449] 1.6e+12 1.6e+12 1.6e+12 1.6e+12 1.6e+12 ...
.. .. .. .. .. ..$ channel1Forces: num [1:449] 7.27 7.6 7.88 8.55 8.84 ...
.. .. .. .. .. ..$ channel2Forces: list()
.. .. .. .. .. ..$ channel3Forces: list()
.. .. .. .. .. ..$ channel4Forces: list()
.. ..$ :List of 1
.. .. ..$ measurements:List of 1
.. .. .. ..$ :List of 4
.. .. .. .. ..$ bodyPart: chr "LEG"
.. .. .. .. ..$ side    : chr "RIGHT"
.. .. .. .. ..$ device  : chr "SENS"
.. .. .. .. ..$ data    :List of 5
.. .. .. .. .. ..$ deltas        : num [1:452] 1.6e+12 1.6e+12 1.6e+12 1.6e+12 1.6e+12 ...
.. .. .. .. .. ..$ channel1Forces: num [1:452] 6.86 7.35 7.59 7.87 8.35 ...
.. .. .. .. .. ..$ channel2Forces: list()
.. .. .. .. .. ..$ channel3Forces: list()
.. .. .. .. .. ..$ channel4Forces: list()

I want to extract all elements and data from that list based on the $device element.

Any assistance would be very helpful. All my attempts so far have been unsuccessful, and I am quite stuck.

Thanks

CodePudding user response:

For something like this, I think it is easiest to use base R methods rather than trying to get it to work in the tidyverse. You'll probably end up with longer code, but it will be easier to understand and easier to get right.

First you should write a function to determine if a particular element should be included or not. For example,

includeDevice <- function(x, rep, device)
  x$repetitions[[rep]]$measurements[[1]]$device == device
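
As a quick check of the predicate, here is a minimal sketch on a hypothetical two-repetition record (`toy` is made up for illustration and contains only the fields the function reads). It also shows a defensive variant, `includeDeviceSafe` (a name introduced here, not part of the answer), because comparing a missing `device` with `==` returns a zero-length logical, which `if` rejects:

```r
# Predicate from the answer, repeated so this block runs on its own.
includeDevice <- function(x, rep, device)
  x$repetitions[[rep]]$measurements[[1]]$device == device

# Hypothetical minimal record with only the fields the predicate reads.
toy <- list(
  repetitions = list(
    list(measurements = list(list(device = "SENS"))),
    list(measurements = list(list(device = "OTHER"))),
    list(measurements = list(list(side = "LEFT")))  # no device field
  )
)

includeDevice(toy, 1, "SENS")  # TRUE
includeDevice(toy, 2, "SENS")  # FALSE
includeDevice(toy, 3, "SENS")  # logical(0): would make if() fail

# Defensive variant (assumption: a missing device should mean "exclude").
includeDeviceSafe <- function(x, rep, device)
  identical(x$repetitions[[rep]]$measurements[[1]]$device, device)

includeDeviceSafe(toy, 3, "SENS")  # FALSE
```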

Also write a function to extract a single record in the format you want, e.g.

oneRecord <- function(x, rep)
  data.frame(channel1Forces = x$repetitions[[rep]]$measurements[[1]]$data$channel1Forces)

Then you should loop over the whole structure, and build up the result you want:

dev <- "SENS"
result <- list()

for (participant in seq_along(dataset)) {
  x <- dataset[[participant]]
  for (rep in seq_along(x$repetitions)) {
    if (includeDevice(x, rep, dev)) {
      result <- c(result, list(oneRecord(x, rep)))
    }
  }
}

The advantage of this approach is that you can test and modify the two functions easily, and you can single step through the code to see if things are working as you want.
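
To make the approach easier to try out, here is a self-contained sketch on made-up data. The `make_rep()` helper, the two-element `dataset`, and the extra `code` and `side` columns are assumptions for illustration, not part of the original data or answer. The final `do.call(rbind, ...)` step is one possible way to combine the per-repetition data frames into a single table:

```r
# Hypothetical helper that builds one repetition in the shape shown in the question.
make_rep <- function(side, device, forces) {
  list(measurements = list(list(
    bodyPart = "LEG", side = side, device = device,
    data = list(channel1Forces = forces)
  )))
}

# Hypothetical stand-in for the parsed JSON: one list element per session.
dataset <- list(
  list(participant = list(code = "p1"),
       repetitions = list(make_rep("LEFT", "SENS", c(7.2, 7.6)),
                          make_rep("RIGHT", "SENS", c(6.8, 7.3)))),
  list(participant = list(code = "p2"),
       repetitions = list(make_rep("LEFT", "OTHER", c(1, 2))))
)

# identical() returns FALSE instead of logical(0) when device is missing.
includeDevice <- function(x, rep, device)
  identical(x$repetitions[[rep]]$measurements[[1]]$device, device)

# One data frame per repetition; scalars are recycled along the force values.
oneRecord <- function(x, rep) {
  m <- x$repetitions[[rep]]$measurements[[1]]
  data.frame(code = x$participant$code, side = m$side,
             channel1Forces = m$data$channel1Forces)
}

result <- list()
for (participant in seq_along(dataset)) {
  x <- dataset[[participant]]
  for (rep in seq_along(x$repetitions)) {
    if (includeDevice(x, rep, "SENS")) {
      result <- c(result, list(oneRecord(x, rep)))
    }
  }
}

# Stack the kept records into one long data frame.
combined <- do.call(rbind, result)
```

With this toy input, `combined` holds only the rows from the two "SENS" repetitions of participant "p1"; the "OTHER" repetition is skipped.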

CodePudding user response:

It is somewhat unclear to me which output format is expected, but here are several approaches using rrapply() in the rrapply-package (extension of base rapply()).

  • Bind the measurements elements together as single rows in a data.frame:
library(rrapply)

## bind measurements to wide data.frame
rrapply(
  dummy_data, 
  how = "bind",
  options = list(coldepth = 5)
)
#>   bodyPart  side device                                 data.deltas data.channel1Forces
#> 1      LEG  LEFT   SENS 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12       8, 8, 8, 8, 8
#> 2      LEG RIGHT   SENS 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12       7, 7, 7, 7, 7

If necessary, we can further unnest columns using e.g. tidyr::unnest_longer().
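
For readers without tidyr, the same kind of unnesting can be sketched in base R. The `wide` data frame below is built by hand to mimic the printed output (list-columns named `data.deltas` and `data.channel1Forces`); that shape is an assumption taken from the output shown, and the sketch also assumes both list-columns have the same length within each row:

```r
# Hand-built stand-in for the bound result, with two list-columns.
wide <- data.frame(side = c("LEFT", "RIGHT"), device = "SENS")
wide$data.deltas <- list(rep(1.6e12, 5), rep(1.6e12, 5))
wide$data.channel1Forces <- list(rep(8, 5), rep(7, 5))

# Number of observations stored in each row's list-column entry.
n <- lengths(wide$data.channel1Forces)

# Repeat the scalar columns n times per row and flatten the list-columns.
long <- data.frame(
  side = rep(wide$side, times = n),
  device = rep(wide$device, times = n),
  delta = unlist(wide$data.deltas),
  channel1Force = unlist(wide$data.channel1Forces)
)
```

Here `long` has one row per individual force reading, with `side` and `device` repeated alongside.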

  • Melt the measurements elements into a long data.frame containing all node paths:
## melt measurements to long data.frame
rrapply(
  dummy_data, 
  condition = \(x, .xparents) "measurements" %in% .xparents, 
  how = "melt"
)
#>             L1 L2           L3 L4       L5             L6                                       value
#> 1  repetitions  1 measurements  1 bodyPart           <NA>                                         LEG
#> 2  repetitions  1 measurements  1     side           <NA>                                        LEFT
#> 3  repetitions  1 measurements  1   device           <NA>                                        SENS
#> 4  repetitions  1 measurements  1     data         deltas 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12
#> 5  repetitions  1 measurements  1     data channel1Forces                               8, 8, 8, 8, 8
#> 6  repetitions  2 measurements  1 bodyPart           <NA>                                         LEG
#> 7  repetitions  2 measurements  1     side           <NA>                                       RIGHT
#> 8  repetitions  2 measurements  1   device           <NA>                                        SENS
#> 9  repetitions  2 measurements  1     data         deltas 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12, 1.6e+12
#> 10 repetitions  2 measurements  1     data channel1Forces                               7, 7, 7, 7, 7

We can then reshape to a more convenient format using e.g. tidyr::pivot_wider(), but this seems less useful than the first format.


Data

## dummy nested list
dummy_data <- list(
  participant = list(
    code = "c7caf74b-8665-4c73-b610-70ccb3f2ec0c",
    gender = "MALE",
    height = 0,
    age = 56
  ),
  startTime = "2020-07-31T10:30:36.145Z",
  title = "exercise_template_builtin_sens_knee_title",
  numberOfSets = 2,
  numberOfReps = 7,
  duration = 6,
  rest = 5,
  preparationTime = 3,
  activityType = "EVALUATION",
  exerciseType = "METER",
  repetitions = list(
    list(
      measurements = list(
        list(
          bodyPart = "LEG",
          side = "LEFT",
          device = "SENS",
          data = list(
            deltas = rep(1.6e12, 5),
            channel1Forces = rep(8, 5),
            channel2Forces = list(),
            channel3Forces = list(),
            channel4Forces = list()
          )
        )
      )
    ),
    list(
      measurements = list(
        list(
          bodyPart = "LEG",
          side = "RIGHT",
          device = "SENS",
          data = list(
            deltas = rep(1.6e12, 5),
            channel1Forces = rep(7, 5),
            channel2Forces = list(),
            channel3Forces = list(),
            channel4Forces = list()
          )
        )
      )
    )
  )
)