Mean of different combinations in a list (missing some of the combinations)

I have a list with two elements. Each element holds a variable of interest, value, for each combination of the variables country and gender.

# Toy example
  A <- list()
  
  A[[1]] <- data.frame(country=c("Spain", "Spain", "France", "France"),
                       gender= c("M", "F", "M", "F"),
                       value = c(100,125,10,200))

  A[[2]] <- data.frame(country=c("Spain", "Spain", "France"),
                       gender=c("M", "F", "F"),
                       value = c(150,75,100))

Data looks like this:

[[1]]
  country gender value
1   Spain      M   100
2   Spain      F   125
3  France      M    10
4  France      F   200

[[2]]
  country gender value
1   Spain      M   150
2   Spain      F    75
3  France      F   100

I would like to compute the mean of value (across the elements of the list) for each possible combination of gender and country, taking into account that not all combinations are present in every element of the list (in this example, the second element has no value for males in France).

What I expect is something like this, where a missing combination counts as 0 (so France/M is (10 + 0) / 2 = 5):

  country gender value
1   Spain      M   125
2   Spain      F   100
3  France      M     5
4  France      F   150

Any ideas on how to deal with that?

CodePudding user response：

What you need here is complete() from {tidyr}.

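library(dplyr)
library(purrr)

A %>% 
  # add the missing country/gender rows to each element, with value = 0
  map(tidyr::complete, country, 
      gender = c("M", "F"), fill = list(value = 0)) %>% 
  bind_rows() %>% 
  group_by(country, gender) %>% 
  summarise(value = mean(value)) %>% 
  ungroup()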

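First, map() applies complete() to each element of the list, so that every element has a row for both M and F for each country present in that element (you can also list the countries explicitly, as is done for gender, which matters if a country is missing entirely from one element). The fill argument then sets value to 0 instead of NA in the added rows. After that, the rows are stacked and averaged per country and gender.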
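
To see what the complete() step does on its own, here is a minimal sketch applied only to the second list element, which lacks the France/M row. The names second and filled are illustrative, not part of the answer above.

```r
library(tidyr)

# Second list element from the question: the France/M row is missing
second <- data.frame(country = c("Spain", "Spain", "France"),
                     gender  = c("M", "F", "F"),
                     value   = c(150, 75, 100))

# Expand to every country x gender combination;
# rows that did not exist before get value = 0 instead of NA
filled <- complete(second, country, gender = c("M", "F"),
                   fill = list(value = 0))

nrow(filled)  # 4: the three original rows plus France/M
filled$value[filled$country == "France" & filled$gender == "M"]  # 0
```

Once every element has the same four rows, a plain mean() per group gives the expected result.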

CodePudding user response：

library(dplyr)

A %>%
  bind_rows() %>%
  group_by(country, gender) %>%
  # dividing by length(A) counts a missing combination as 0
  summarise(value = sum(value) / length(A))
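
This works because summing the values that are present and dividing by the number of list elements is the same as averaging with the missing combinations counted as 0. As a rough cross-check, the same idea can be sketched in base R without dplyr (the names stacked and res are illustrative):

```r
# Toy data from the question
A <- list(
  data.frame(country = c("Spain", "Spain", "France", "France"),
             gender  = c("M", "F", "M", "F"),
             value   = c(100, 125, 10, 200)),
  data.frame(country = c("Spain", "Spain", "France"),
             gender  = c("M", "F", "F"),
             value   = c(150, 75, 100))
)

# Stack all list elements into one data frame
stacked <- do.call(rbind, A)

# Sum per combination, then divide by the number of list elements,
# so a combination absent from one element contributes 0 to the mean
res <- aggregate(value ~ country + gender, data = stacked, FUN = sum)
res$value <- res$value / length(A)
res
```

Note that this only matches the expected output if a missing combination really should count as 0; if it should be ignored instead, use mean(value) on the stacked data.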