How to mutate a list of dataframes simultaneously in R

Time:05-02

I am trying to mutate several dataframes that are part of a list of dataframes, all at the same time, in R.

Here is the code I am running on one dataframe; it does the mutate/group_by/summarise steps as intended:

ebird_tod_1 <- ebird_split[[1]] %>% #ebird_split is the df list.
  mutate(tod_bins = cut(time_observations_started, 
                        breaks = breaks, 
                        labels = labels,
                        include.lowest = TRUE),
         tod_bins = as.numeric(as.character(tod_bins))) %>% 
  group_by(tod_bins) %>% 
  summarise(n_checklists = n(),
            n_detected = sum(species_observed),
            det_freq = mean(species_observed))

This works well for one dataframe in the list, but I have 45, and I would rather not have pages of this code to create the 45 variables. Hence I am looking for a method that would increment the "ebird_tod_1" variable to "ebird_tod_2", "ebird_tod_3", etc., while the dataframe being modified changes to "ebird_split[[2]]", "ebird_split[[3]]", and so on.

I have tried, unsuccessfully, to use repeat and the map function.

I hope that is all the info someone needs to help; I am new to R.

Thank you.

CodePudding user response:

As you provided no example data, the following code is not tested. A general approach would be to put your code inside a function and use lapply or purrr::map to loop over your list of data frames, storing the results in a list (instead of creating multiple objects):

myfun <- function(x) {
  x %>%
    mutate(tod_bins = cut(time_observations_started, 
                          breaks = breaks, 
                          labels = labels,
                          include.lowest = TRUE),
           tod_bins = as.numeric(as.character(tod_bins))) %>% 
    group_by(tod_bins) %>% 
    summarise(n_checklists = n(),
              n_detected = sum(species_observed),
              det_freq = mean(species_observed))
  
}
ebird_tod <- lapply(ebird_split, myfun)
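
For anyone who wants to see the pattern run end to end, here is a minimal sketch on made-up data. The toy data frames, the 4-hour `breaks`, and the midpoint `labels` are all invented for illustration, since the real `ebird_split` was not posted. In this version `breaks` and `labels` are passed as arguments instead of being read from the global environment, and the function is called `summarise_tod` (a hypothetical name). Naming the result list gives the "ebird_tod_1", "ebird_tod_2" labels asked for without creating 45 separate objects.

```r
library(dplyr)

# Hypothetical stand-in for ebird_split: a list of three small data frames
set.seed(1)
ebird_split <- lapply(1:3, function(i) {
  data.frame(
    time_observations_started = runif(20, min = 0, max = 24),
    species_observed = sample(c(TRUE, FALSE), 20, replace = TRUE)
  )
})

# Assumed binning: 4-hour bins, each labelled by its midpoint
breaks <- seq(0, 24, by = 4)
labels <- breaks[-length(breaks)] + 2

# Same mutate/group_by/summarise steps as above, with the bins as arguments
summarise_tod <- function(x, breaks, labels) {
  x %>%
    mutate(tod_bins = cut(time_observations_started,
                          breaks = breaks,
                          labels = labels,
                          include.lowest = TRUE),
           tod_bins = as.numeric(as.character(tod_bins))) %>%
    group_by(tod_bins) %>%
    summarise(n_checklists = n(),
              n_detected = sum(species_observed),
              det_freq = mean(species_observed))
}

# Apply the function to every data frame and name the results
ebird_tod <- lapply(ebird_split, summarise_tod, breaks = breaks, labels = labels)
names(ebird_tod) <- paste0("ebird_tod_", seq_along(ebird_tod))

# Pull out a single result by name
ebird_tod[["ebird_tod_2"]]
```

If one combined table is more convenient than a list, `dplyr::bind_rows(ebird_tod, .id = "source")` should stack the results with a column recording which list element each row came from.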

CodePudding user response:

In your example it seems like you want to create data.frames in the global environment from that list of data.frames. To do this we could use rlang::env_bind:

library(tidyverse)

# a list of data.frames
data_ls <- iris %>% 
  nest_by(Species) %>% 
  pull(data)
  
# name the list of data frames
data_ls <- set_names(data_ls, paste("iris", seq_along(data_ls), sep = "_"))

data_ls %>% 
  # use map or lapply to make some operations
  map(~ mutate(.x, new = Sepal.Length + Sepal.Width) %>% 
        summarise(across(everything(), mean),
                  n = n())) %>% 
  # pipe into env_bind and splice list of data.frames
  rlang::env_bind(.GlobalEnv, !!! .)

Created on 2022-05-02 by the reprex package (v2.0.1)
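
For comparison, base R's `list2env` does much the same job as `rlang::env_bind` with a spliced list, so the idea can be tried without the tidyverse. This is only a sketch of that alternative, not the method used in the answer above: it writes into a fresh environment called `target` (a name chosen here for illustration) so that nothing in the workspace is overwritten, and the summary is computed with base functions.

```r
# A named list of data frames: built-in iris measurements, split by species
data_ls <- split(iris[, 1:4], iris$Species)
names(data_ls) <- paste("iris", seq_along(data_ls), sep = "_")

# Summarise each data frame: add a column, then take column means and row count
summ_ls <- lapply(data_ls, function(d) {
  d$new <- d$Sepal.Length + d$Sepal.Width
  cbind(as.data.frame(as.list(colMeans(d))), n = nrow(d))
})

# Copy each list element into an environment under its list name.
# A fresh environment is used here; envir = .GlobalEnv would instead
# create iris_1, iris_2, iris_3 directly in the workspace.
target <- new.env()
list2env(summ_ls, envir = target)

ls(target)
target$iris_1
```

Keeping the results in the list (or in a dedicated environment, as here) is usually easier to loop over later than 45 separate objects in the global environment.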
