I am trying to split a data frame into a list of data frames.
My data frame is as follows:
I want to split this data frame into a list like this:
[[1]] Country Iso_code Year cow Supdem (the fourth column in the picture)
[[2]] Country Iso_code Year cow Supdem (the fifth column in the picture)
[[3]] Country Iso_code Year cow Supdem (the sixth column in the picture)
....
Could you help me with this?
CodePudding user response:
Load the package:
require(data.table)
Since the sample data frame was only provided as an image, we create one (df) to use as our example:
df <- data.table(country = letters[1:3]
, iso_code = 1:3
, year = c(2000, 2001, 2002)
, cow = 1:3
, supdem = 1:3
, supdem = 4:6
); df
country iso_code year cow supdem supdem
1: a 1 2000 1 1 4
2: b 2 2001 2 2 5
3: c 3 2002 3 3 6
Run this if your data frame is not already a data.table:
setDT(df)
Giving several columns the same name (here, supdem) is not good practice, because it forces us to identify columns by index, which is itself fragile since column positions may change later. Anyway, we first identify the positions of the supdem columns:
x <- 5:ncol(df)
Then, for each of those positions, combine the first 4 columns with that one varying column and collect the results in a list:
y <- lapply(x, function(i) df[, .SD, .SDcols=c(1:4, i)]); y
[[1]]
country iso_code year cow supdem
1: a 1 2000 1 1
2: b 2 2001 2 2
3: c 3 2002 3 3
[[2]]
country iso_code year cow supdem
1: a 1 2000 1 4
2: b 2 2001 2 5
3: c 3 2002 3 6
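Because selecting by position is fragile, here is a minimal base R sketch that finds the varying columns from the names of the key columns instead. It assumes the key columns are called country, iso_code, year and cow, as in the example above; the names key_cols, key_idx and var_idx are only illustrative.

```r
# Example data mirroring the table above.
# check.names = FALSE keeps the duplicated column name "supdem".
df <- data.frame(country  = letters[1:3],
                 iso_code = 1:3,
                 year     = c(2000, 2001, 2002),
                 cow      = 1:3,
                 supdem   = 1:3,
                 supdem   = 4:6,
                 check.names = FALSE)

key_cols <- c("country", "iso_code", "year", "cow")  # assumed key column names
key_idx  <- match(key_cols, names(df))               # positions of the key columns
var_idx  <- setdiff(seq_along(df), key_idx)          # positions of all other columns

# One data frame per non-key column: the key columns plus that single column
y <- lapply(var_idx, function(i) df[c(key_idx, i)])
y
```

This way the code keeps working if the key columns are moved around, as long as their names stay the same.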
CodePudding user response:
One way is to use Map to bind your 4 key columns to each of the remaining columns. split.default separates the remaining columns into individual data frames, and cbind attaches the key columns to each one.
Map(cbind, list(mtcars[1:4]), split.default(mtcars[-(1:4)], 1:7))
Another way could be:
library(tidyverse)
purrr::map(names(mtcars[-c(1:4)]), function(x) bind_cols(mtcars[c(1:4)], mtcars[x]))
This can also be done with the apply family:
lapply(names(mtcars[-c(1:4)]), function(x) cbind(mtcars[c(1:4)], mtcars[x]))
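The lists produced above are unnamed, so each piece has to be retrieved by position. A small sketch of the same lapply approach that also names each element after its varying column, using setNames; the names key_idx, other and res are just illustrative:

```r
key_idx <- 1:4                      # positions of the key columns in mtcars
other   <- names(mtcars)[-key_idx]  # names of the remaining columns

# Build one data frame per remaining column, then name each list element after it
res <- setNames(
  lapply(other, function(x) cbind(mtcars[key_idx], mtcars[x])),
  other
)

names(res)       # element names match the varying columns
head(res$wt, 3)  # retrieve a piece by name instead of by position
```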
Another way would be to create two lists of data frames: one with the key columns replicated n times (where n is the number of non-key columns) and another with each non-key column on its own. First, I determine n as the total number of columns minus the number of key columns repeated in every data frame (i.e., 4). Then, I replicate the data frame of key columns (i.e., mtcars[, 1:4]) n times. Next, I use split.default to put each remaining column into its own data frame in a list. Finally, I use purrr::map2 to bind the columns of the two data frames at each index position.
library(purrr)
nocols <- ncol(mtcars) - 4
lst1 <- replicate(nocols, mtcars[, 1:4], simplify = FALSE)
lst2 <-
split.default(mtcars[, 5:ncol(mtcars)], seq_along(mtcars[, 5:ncol(mtcars)]))
results <- map2(lst1, lst2, bind_cols)
Output
str(results)
List of 7
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
$ :'data.frame': 32 obs. of 5 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...