I have a list that contains data grouped into world regions. Now I want to extract each region into a data frame and only select certain columns to do more analysis on. I found myself doing a lot of copying and pasting so I think I should write a function to do it for me. Here is my code that didn't work:
lapply(data_list, function(x){
for (x in 1:length(data_list)) {
assign(paste0("data_list", x), as.data.frame(data_list[[x]]))} # split the list into data frames
a = paste0('x', '1') <- subset(data_listx, select = c(1, 4, 5, 17:62))
View(a)
})
Any thoughts on how I should rewrite it?
CodePudding user response:
You can simply pass subset() to lapply(); it has a subset= argument (a logical expression) for the rows and a select= argument for the columns. In your case you only need the latter. Example:
lapply(data_list, subset, subset=hp > 200, select=c(1, 4, 5, 10:11))
# [[1]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8
#
# [[2]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8
#
# [[3]]
#                      mpg  hp drat gear carb
# Duster 360          14.3 245 3.21    3    4
# Cadillac Fleetwood  10.4 205 2.93    3    4
# Lincoln Continental 10.4 215 3.00    3    4
# Chrysler Imperial   14.7 230 3.23    3    4
# Camaro Z28          13.3 245 3.73    3    4
# Ford Pantera L      15.8 264 4.22    5    4
# Maserati Bora       15.0 335 3.54    5    8
Data:
data_list <- replicate(3, mtcars, simplify=FALSE)
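For the column-only case described in the question, a minimal sketch could omit subset= and pass only select=. The region names "africa", "asia" and "europe" are hypothetical stand-ins, and the column indices are the ones from the example above, since mtcars has only 11 columns rather than the 62 in the question's data.

```r
# Hypothetical stand-in for the asker's list: three named "regions",
# each just a copy of mtcars (the names are assumptions for illustration)
data_list <- setNames(replicate(3, mtcars, simplify = FALSE),
                      c("africa", "asia", "europe"))

# Column selection only: leave out subset= so all rows are kept
selected <- lapply(data_list, subset, select = c(1, 4, 5, 10:11))

# lapply() keeps the list names, so one region can be pulled out directly
head(selected$asia, 3)

# If separate objects per region are really wanted, list2env() can create
# them; a fresh environment is used here to avoid cluttering the workspace
region_env <- list2env(selected, envir = new.env())
ls(region_env)
```

Keeping the results in a named list is usually easier to work with than creating one object per region with assign(), because later steps can again be applied to every region with lapply().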
CodePudding user response:
Here is an alternative approach using map() from the purrr package, with the dataset posted by jay.sf:
library(dplyr)
library(purrr)
data_list <- replicate(3, mtcars, simplify=FALSE)
regionSubset <- function(df){
df %>% select(c(1, 4, 5, 10:11)) %>% filter(hp > 200)
}
data_list %>% map(regionSubset)