How to shuffle columns by every another in R?

Time:08-01

Suppose I have a data.frame like:

set.seed(123)
df <- data.frame(a=rnorm(10, 0,1), b=rnorm(10,1,2), c=rnorm(10, 2, 1), 
x=rnorm(10, 1,2), y=rnorm(10,2,3), z=rnorm(10, 3, 4))
#             a           b         c         x           y         z
#1  -0.56047565  3.44816359 0.9321763 1.8529284 -0.08412094  4.013274
#2  -0.23017749  1.71962765 1.7820251 0.4098570  1.37624817  2.885813
#3   1.55870831  1.80154290 0.9739956 2.7902513 -1.79618905  2.828518
#4   0.07050839  1.22136543 1.2711088 2.7562670  8.50686790  8.474409
#5   0.12928774 -0.11168227 1.3749607 2.6431622  5.62388599  2.096916
#6   1.71506499  4.57382627 0.3133067 2.3772805 -1.36932575  9.065882
#7   0.46091621  1.99570096 2.8377870 2.1078353  0.79134549 -3.195011
#8  -1.26506123 -2.93323431 2.1533731 0.8761766  0.60003394  5.338455
#9  -0.68685285  2.40271180 0.8618631 0.3880747  4.33989536  3.495417
#10 -0.44566197  0.05441718 3.2538149 0.2390580  1.74989280  3.863766

My question is how to reorder the columns so that the first half (a, b, c) is interleaved with the second half (x, y, z), to get:

#             a         x           b           y         c         z
#1  -0.56047565 1.8529284  3.44816359 -0.08412094 0.9321763  4.013274
#2  -0.23017749 0.4098570  1.71962765  1.37624817 1.7820251  2.885813
#3   1.55870831 2.7902513  1.80154290 -1.79618905 0.9739956  2.828518
#4   0.07050839 2.7562670  1.22136543  8.50686790 1.2711088  8.474409
#5   0.12928774 2.6431622 -0.11168227  5.62388599 1.3749607  2.096916
#6   1.71506499 2.3772805  4.57382627 -1.36932575 0.3133067  9.065882
#7   0.46091621 2.1078353  1.99570096  0.79134549 2.8377870 -3.195011
#8  -1.26506123 0.8761766 -2.93323431  0.60003394 2.1533731  5.338455
#9  -0.68685285 0.3880747  2.40271180  4.33989536 0.8618631  3.495417
#10 -0.44566197 0.2390580  0.05441718  1.74989280 3.2538149  3.863766

CodePudding user response:

Using modulo (%%) to build a sort key, then order() to turn that key into a column index:

d2 = df[ , order((seq_along(df) - 1) %% (ncol(df) / 2))]

names(d2)
# [1] "a" "x" "b" "y" "c" "z"
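
To see why this works, it can help to look at the intermediate values. The following is a minimal sketch on a small stand-in data frame (the name toy and its values are made up for illustration). The key is the zero-based column position modulo half the column count, and order() returns the positions that sort that key while keeping tied keys in their original order.

```r
# Stand-in data frame with six columns (values are arbitrary)
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6, x = 7:8, y = 9:10, z = 11:12)

# Zero-based column positions modulo half the number of columns
key <- (seq_along(toy) - 1) %% (ncol(toy) / 2)
key
# [1] 0 1 2 0 1 2

# order() sorts by key; tied keys keep their left-to-right order
idx <- order(key)
idx
# [1] 1 4 2 5 3 6

names(toy[, idx])
# [1] "a" "x" "b" "y" "c" "z"
```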

To make it work with both even and odd numbers of columns, wrap the divisor in ceiling():

df_odd = df[-6]

d2 = df_odd[ , order((seq_along(df_odd) - 1) %% ceiling(ncol(df_odd) / 2))]

names(d2)
# [1] "a" "x" "b" "y" "c"
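
The odd case can be traced the same way. This is a small sketch on a made-up five-column data frame (toy_odd is a hypothetical name): with ceiling(), the first "half" takes the extra column, so the last column of the first half ends up without a partner.

```r
# Five-column stand-in (values are arbitrary)
toy_odd <- data.frame(a = 1:2, b = 3:4, c = 5:6, x = 7:8, y = 9:10)

# ceiling(5 / 2) is 3, so columns 1..3 form the first group
half <- ceiling(ncol(toy_odd) / 2)

key_odd <- (seq_along(toy_odd) - 1) %% half
key_odd
# [1] 0 1 2 0 1

idx_odd <- order(key_odd)
idx_odd
# [1] 1 4 2 5 3

names(toy_odd[, idx_odd])
# [1] "a" "x" "b" "y" "c"
```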

Because OP mentioned that they have over 100 columns, it may be relevant to consider data.table::setcolorder, which reorders the columns without copying the data:

library(data.table)
setDT(df)
setcolorder(df, order((seq_along(df) - 1) %% ceiling(ncol(df) / 2)))

CodePudding user response:

We can generate the key shuffle index as follows. Note how it handles odd/even n at the same time without an if().

ShufInd <- function (n) {
  m <- ceiling(n / 2)
  sequence(rep(2, each = m), seq_len(m), m)[1:n]
}

ShufInd(6)
#[1] 1 4 2 5 3 6

ShufInd(5)
#[1] 1 4 2 5 3
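
As far as I know, the from and by arguments of sequence() were only added in R 4.0.0. On older versions the same index can be built by stacking the two halves with rbind() and reading the result column by column. This is a sketch of an equivalent helper, not the code from the answer above; the name ShufInd2 is made up here.

```r
ShufInd2 <- function (n) {
  m <- ceiling(n / 2)
  # Row 1 holds the first-half positions, row 2 the matching second-half
  # positions; c() reads the 2 x m matrix column by column, and the
  # subscript drops the surplus last element when n is odd.
  c(rbind(seq_len(m), seq_len(m) + m))[seq_len(n)]
}

ShufInd2(6)
#[1] 1 4 2 5 3 6

ShufInd2(5)
#[1] 1 4 2 5 3
```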

To shuffle a vector (atomic or list) or a data frame of length n:

## OP's data frame
df[ShufInd(length(df))]

## drop the last column, then try again
df <- df[-6]
df[ShufInd(length(df))]

## an atomic vector
x <- letters[1:5]
x[ShufInd(length(x))]

## a list
x <- as.list(x)
x[ShufInd(length(x))]

To shuffle columns of a matrix:

mat <- matrix(1:10, 2, 5)
mat[, ShufInd(ncol(mat))]

This supersedes my initial answer that treats odd and even n separately.


Henrik's answer is also a unified approach that can be written as:

Henrik <- function (n) order(seq(0, n - 1) %% ceiling(n / 2))

This is impressively concise!
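
A quick way to convince yourself that the two index generators agree is to compare them over a range of n. The sketch below repeats both definitions so that it runs on its own; it assumes R 4.0.0 or later because of the sequence() call, and the range 1:20 is an arbitrary choice.

```r
ShufInd <- function (n) {
  m <- ceiling(n / 2)
  sequence(rep(2, each = m), seq_len(m), m)[1:n]
}

Henrik <- function (n) order(seq(0, n - 1) %% ceiling(n / 2))

# Compare the two results element by element for n = 1, ..., 20
agree <- vapply(1:20, function(n) all(ShufInd(n) == Henrik(n)), logical(1))
all(agree)
# [1] TRUE
```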

CodePudding user response:

A possible solution in base R (this assumes an even number of columns, since the names must fill a two-column matrix exactly):

df <- df[c(t(matrix(names(df), ncol = 2)))]
df

#>              a         x           b           y         c         z
#> 1  -0.56047565 1.8529284  3.44816359 -0.08412094 0.9321763  4.013274
#> 2  -0.23017749 0.4098570  1.71962765  1.37624817 1.7820251  2.885813
#> 3   1.55870831 2.7902513  1.80154290 -1.79618905 0.9739956  2.828518
#> 4   0.07050839 2.7562670  1.22136543  8.50686790 1.2711088  8.474409
#> 5   0.12928774 2.6431622 -0.11168227  5.62388599 1.3749607  2.096916
#> 6   1.71506499 2.3772805  4.57382627 -1.36932575 0.3133067  9.065882
#> 7   0.46091621 2.1078353  1.99570096  0.79134549 2.8377870 -3.195011
#> 8  -1.26506123 0.8761766 -2.93323431  0.60003394 2.1533731  5.338455
#> 9  -0.68685285 0.3880747  2.40271180  4.33989536 0.8618631  3.495417
#> 10 -0.44566197 0.2390580  0.05441718  1.74989280 3.2538149  3.863766
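
The trick here is that matrix() fills column by column while c() also reads column by column, so transposing in between interleaves the two halves. A minimal sketch on the column names alone (nms simply stands in for names(df)):

```r
nms <- c("a", "b", "c", "x", "y", "z")

# Filled column by column: column 1 is a, b, c and column 2 is x, y, z
m <- matrix(nms, ncol = 2)
dim(m)
# [1] 3 2

# After transposing, each column of t(m) is one pair: (a, x), (b, y), (c, z)
interleaved <- c(t(m))
interleaved
# [1] "a" "x" "b" "y" "c" "z"
```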

Using dplyr and the same idea:

library(dplyr)

df %>% 
  relocate(as.vector(t(matrix(names(df), ncol = 2))))

#>              a         x           b           y         c         z
#> 1  -0.56047565 1.8529284  3.44816359 -0.08412094 0.9321763  4.013274
#> 2  -0.23017749 0.4098570  1.71962765  1.37624817 1.7820251  2.885813
#> 3   1.55870831 2.7902513  1.80154290 -1.79618905 0.9739956  2.828518
#> 4   0.07050839 2.7562670  1.22136543  8.50686790 1.2711088  8.474409
#> 5   0.12928774 2.6431622 -0.11168227  5.62388599 1.3749607  2.096916
#> 6   1.71506499 2.3772805  4.57382627 -1.36932575 0.3133067  9.065882
#> 7   0.46091621 2.1078353  1.99570096  0.79134549 2.8377870 -3.195011
#> 8  -1.26506123 0.8761766 -2.93323431  0.60003394 2.1533731  5.338455
#> 9  -0.68685285 0.3880747  2.40271180  4.33989536 0.8618631  3.495417
#> 10 -0.44566197 0.2390580  0.05441718  1.74989280 3.2538149  3.863766

CodePudding user response:

You can also use dplyr's select() and list the columns explicitly in the order you want, shown here on a different example data frame:

> library(dplyr)
> df
     player position points rebounds
1      a        G     12        5
2      b        F     15        7  
3      c        F     19        7
4      d        G     22       12
5      e        G     32       11
> df %>% select(rebounds, position, points, player)
    rebounds position points player
 1        5        G     12      a
 2        7        F     15      b
 3        7        F     19      c
 4       12        G     22      d
 5       11        G     32      e