How to subset and view all dataframes split from a list

Time:07-16

I have a list that contains data grouped into world regions. I want to extract each region into a data frame and select only certain columns for further analysis. I found myself doing a lot of copying and pasting, so I think I should write a function to do it for me. Here is my code, which didn't work:

lapply(data_list, function(x){
  for (x in 1:length(data_list)) {
    assign(paste0("data_list", x), as.data.frame(data_list[[x]]))} # split the list into data frames
  a = paste0('x', '1') <- subset(data_listx, select = c(1, 4, 5, 17:62))
  View(a)
})

Any thoughts on how I should rewrite it?

CodePudding user response:

You can simply pass the subset() function to lapply. It has an argument subset= (a logical expression) that picks the rows and an argument select= that picks the columns. In your case you only need the latter. Example using both:

lapply(data_list, subset, subset=hp > 200, select=c(1, 4, 5, 10:11))
# [[1]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8
# 
# [[2]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8
# 
# [[3]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8

Data:

data_list <- replicate(3, mtcars, simplify=FALSE)
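For the column-only case in the question, a minimal sketch could look like the following. It assumes mtcars as a stand-in for the regional data, and the region names are hypothetical. Omitting subset= keeps every row, and naming the list elements lets you reach each region's data frame by name instead of creating separate variables with assign().

```r
# mtcars stands in for the real regional data; the region names are hypothetical.
data_list <- replicate(3, mtcars, simplify = FALSE)
names(data_list) <- c("africa", "asia", "europe")

# Keep all rows and only the chosen columns; leaving out subset= skips row filtering.
selected <- lapply(data_list, subset, select = c(1, 4, 5, 10:11))

# Each element is still a data frame and can be reached by its region name.
dim(selected$asia)    # 32 rows, 5 columns
names(selected$asia)  # "mpg" "hp" "drat" "gear" "carb"

# View() only works in an interactive session (e.g. RStudio), so guard the call.
if (interactive()) {
  for (region in names(selected)) View(selected[[region]], title = region)
}
```

Keeping the results in one named list usually makes later steps easier, since the same lapply pattern can be applied again to every region.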

CodePudding user response:

Here is an alternative approach using map() from the purrr package. It uses the dataset posted by jay.sf.

library(dplyr)
library(purrr)
data_list <- replicate(3, mtcars, simplify=FALSE)
regionSubset <- function(df){
  df %>% select(c(1, 4, 5, 10:11)) %>% filter(hp > 200)
}
data_list %>% map(regionSubset)
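One thing to watch with this pattern is the order of the two steps. The version above can filter on hp after select() only because hp is one of the selected columns. If the row condition uses a column you do not keep, filter first. The sketch below illustrates that with cyl (column 2, which is dropped); the region names and the cyl condition are assumptions for illustration, not part of the original data.

```r
library(dplyr)
library(purrr)

# mtcars stands in for the real regional data; the region names are hypothetical.
data_list <- replicate(3, mtcars, simplify = FALSE)
names(data_list) <- c("africa", "asia", "europe")

# Filter first, then select: cyl is not among the kept columns,
# so it has to be used before the columns are narrowed down.
region_subset <- function(df) {
  df %>%
    filter(cyl == 8) %>%
    select(c(1, 4, 5, 10:11))
}

subsets <- map(data_list, region_subset)

names(subsets)       # map() keeps the names of the input list
names(subsets$asia)  # "mpg" "hp" "drat" "gear" "carb"
```

Because map() preserves the list names, each region's result can be pulled out with subsets$asia or subsets[["asia"]].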