Sorting specific columns of a dataframe by their names in R

Time:12-25

df is a test data frame, and I need to sort its last three columns by name in ascending order (without hardcoding the order).

df <- data.frame(X = c(1, 2, 3, 4, 5),
            Z = c(1, 2, 3, 4, 5),
            Y = c(1, 2, 3, 4, 5),
            A = c(1, 2, 3, 4, 5),
            C = c(1, 2, 3, 4, 5),
            B = c(1, 2, 3, 4, 5))

Desired output:

> df
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

I'm aware of the order() function but I can't seem to find the right way to implement it to get the desired output.

CodePudding user response:

Update:

Base R:

cbind(df[1:3],df[4:6][,order(colnames(df[4:6]))])

First answer:

We could use relocate() from dplyr, which changes column positions: https://dplyr.tidyverse.org/reference/relocate.html

Here we relocate by index: we take the last column (index 6, which is B) and put it before the column at position 5 (which is C). Note that this still hardcodes the positions.

library(dplyr)
df %>% 
  relocate(6, .before = 5)

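An alternative that keeps the first three columns in place and sorts only the last three by name (sorting all six columns first would also reorder X, Z, Y into X, Y, Z):

library(dplyr)
df %>%
  select(1:3, all_of(sort(colnames(df)[4:6])))
  X Z Y A B C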
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

CodePudding user response:

In base R, keep the names of the leading columns as they are, then append the sorted names of the last 3 columns:

df[, c(names(df)[1:(ncol(df)-3)], sort(names(df)[ncol(df)-2:0]))]
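
The same idea can be wrapped in a small function so that the number of trailing columns is a parameter. This is only a sketch: the name sort_last_cols and its arguments are hypothetical, it assumes the column names are unique, and sort() orders names according to the current locale.

```r
# Hypothetical helper: sort the last n columns of a data frame by name,
# leaving all other columns in their original order.
sort_last_cols <- function(data, n = 3) {
  stopifnot(is.data.frame(data), n >= 0, n <= ncol(data))
  nms <- names(data)
  keep <- head(nms, length(nms) - n)    # names of the untouched leading columns
  tail_sorted <- sort(tail(nms, n))     # last n names in ascending order
  data[, c(keep, tail_sorted), drop = FALSE]
}

df <- data.frame(X = 1:5, Z = 1:5, Y = 1:5, A = 1:5, C = 1:5, B = 1:5)
result <- sort_last_cols(df, 3)
names(result)
```

Using head() and tail() on the names avoids index arithmetic such as ncol(df)-2:0, which works only because : binds more tightly than - in R.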

CodePudding user response:

We want to reorder the columns based on their names, so names(df) is the natural argument to pass to order().

The complicating factor is that order() returns a vector of positions, not names. If we want to reorder only a subset of the columns, those positions are relative to the subset, so we need an approach that retains the original order of the first three columns.

We accomplish this by building a vector that holds the first 3 column names followed by the remaining column names sorted with sort(), which returns the values rather than their positions in the vector, and then passing that vector to the [ form of the extract operator.
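
To make the difference between positions and values concrete, here is a minimal sketch on the column names alone. The variable names (nms, last3, pos, vals, shifted, new_order) are chosen only for illustration.

```r
nms <- c("X", "Z", "Y", "A", "C", "B")
last3 <- nms[4:6]        # "A" "C" "B"

pos <- order(last3)      # positions within last3: 1 3 2
vals <- sort(last3)      # the names themselves: "A" "B" "C"

# The positions are relative to last3, so using them on the full
# vector requires shifting them by the number of leading columns.
shifted <- pos + 3       # 4 6 5

# Working with the values avoids the offset entirely.
new_order <- c(nms[1:3], vals)
new_order
```

Indexing nms with shifted gives the same three names as vals, which is why the sort() version needs no offset bookkeeping.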

df <- data.frame(X = c(1, 2, 3, 4, 5),
                 Z = c(1, 2, 3, 4, 5),
                 Y = c(1, 2, 3, 4, 5),
                 A = c(1, 2, 3, 4, 5),
                 C = c(1, 2, 3, 4, 5),
                 B = c(1, 2, 3, 4, 5))

df[,c(names(df[1:3]),sort(names(df[4:6])))]

...and the output:

> df[,c(names(df[1:3]),sort(names(df[4:6])))]
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

CodePudding user response:

df <- data.frame(X = c(1, 2, 3, 4, 5),
            Z = c(1, 2, 3, 4, 5),
            Y = c(1, 2, 3, 4, 5),
            A = c(1, 2, 3, 4, 5),
            C = c(1, 2, 3, 4, 5),
            B = c(1, 2, 3, 4, 5))

to_order <- seq(ncol(df)) > ncol(df) - 3

# rank(), not order(): the sort key must be each name's rank, not a position
df[order(to_order * rank(names(df)))]
#>   X Z Y A B C
#> 1 1 1 1 1 1 1
#> 2 2 2 2 2 2 2
#> 3 3 3 3 3 3 3
#> 4 4 4 4 4 4 4
#> 5 5 5 5 5 5 5

Created on 2021-12-24 by the reprex package (v2.0.1)
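
One detail of this trick is easy to get wrong: order() returns positions, whereas rank() returns each name's rank, and the two agree only for some inputs (including the df above). A minimal sketch, using a hypothetical df2 whose last three columns arrive as C, A, B, compares the two as sort keys:

```r
# Hypothetical data frame: same columns as df, but the last three are C, A, B
df2 <- data.frame(X = 1:2, Z = 1:2, Y = 1:2, C = 1:2, A = 1:2, B = 1:2)

# TRUE for the last three columns, FALSE for the rest
to_order <- seq(ncol(df2)) > ncol(df2) - 3

# Key built from order(): positions are used as if they were ranks
by_order <- names(df2[order(to_order * order(names(df2)))])

# Key built from rank(): each name's rank, zeroed for the leading columns
by_rank <- names(df2[order(to_order * rank(names(df2)))])

by_order
by_rank
```

With this input only the rank() key puts the last three columns in A, B, C order. The leading columns all get a key of 0, and order() breaks those ties by original position, so they stay where they were.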
