R: Order variables / columns in a data frame by their attributes

Time:08-01

I have a data frame with hundreds of variables. Each variable has a numeric attribute. I would like to order the variables based on their numeric attributes. Imagine the following data frame:

df <- cbind.data.frame(v1=c(1,2,3),v2=c(1,2,3),v3=c(1,2,3))

attributes(df$v1)$myattri<-2
attributes(df$v2)$myattri<-9
attributes(df$v3)$myattri<-1

I would like to order the variables like this: v3 (smallest attribute, 1), v1 (second smallest attribute, 2), v2 (biggest attribute, 9).

The approach I would use to order variables by name does not carry over, because there is no function that returns the attribute for every column of the data frame:

df[ , order(names(df))]   # orders variables by name
df[ , order(myattri(df))] # Error in myattri(df) : could not find function "myattri"

I can access the attributes on the variable level, but not on the data frame level:

attributes(df$v1) 
  # $myattri
  # [1] 2

attributes(df)
  # $names
  # [1] "v1" "v2" "v3"
  #
  # $row.names
  # [1] 1 2 3
  #
  # $class
  # [1] "data.frame"

CodePudding user response:

One way

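> # unlist() names each value "<column>.myattri"; strip that suffix to recover the column names
> df[, sub(".myattri", "", names(sort(unlist(sapply(df, attributes)))), fixed = TRUE)]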

  v3 v1 v2
1  1  1  1
2  2  2  2
3  3  3  3
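The one-liner above relies on the names that unlist() builds and on every column carrying exactly one attribute. A minimal sketch of a variant that reads only the myattri attribute from each column, assuming each column stores a single numeric value there (the names key and df_sorted are illustrative, not from the original post):

```r
df <- data.frame(v1 = c(1, 2, 3), v2 = c(1, 2, 3), v3 = c(1, 2, 3))
attr(df$v1, "myattri") <- 2
attr(df$v2, "myattri") <- 9
attr(df$v3, "myattri") <- 1

# Collect the "myattri" value of every column into a named numeric vector.
# vapply() stops with an error if a column does not yield exactly one number.
key <- vapply(df, function(x) attr(x, "myattri", exact = TRUE), numeric(1))

# Reorder the columns by ascending attribute value.
df_sorted <- df[, order(key)]
names(df_sorted)
```

Because the attribute is looked up by name, other column attributes such as factor levels or a class do not interfere with the ordering.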

CodePudding user response:

As mentioned by @Roland in the comments, you didn't assign attributes to your dataframe, but to the vectors in your dataframe. If you want to assign them to the dataframe itself, you can do this: attributes(df)$myattrib <- c(2,9,1). If you then call attributes(df), you can see that the dataframe now carries a myattrib vector with one value per column, in column order. Then you can order the columns of your dataframe with order(attributes(df)$myattrib) like this:

df <- cbind.data.frame(v1=c(1,2,3),v2=c(1,2,3),v3=c(1,2,3))

attributes(df)$myattrib <- c(2,9,1)

attributes(df)
#> $names
#> [1] "v1" "v2" "v3"
#> 
#> $class
#> [1] "data.frame"
#> 
#> $row.names
#> [1] 1 2 3
#> 
#> $myattrib
#> [1] 2 9 1

df[ , order(attributes(df)$myattrib)]
#>   v3 v1 v2
#> 1  1  1  1
#> 2  2  2  2
#> 3  3  3  3

Created on 2022-08-01 by the reprex package (v2.0.1)
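A data-frame-level vector like this is matched to the columns purely by position, so it has to be kept in the same order as the columns. One possible way to make the pairing explicit is to name the values after the columns; this is only a sketch of that idea, and the names ord and df_sorted are illustrative:

```r
df <- data.frame(v1 = c(1, 2, 3), v2 = c(1, 2, 3), v3 = c(1, 2, 3))

# Store the values as a named vector so each value is tied to a column name.
attr(df, "myattrib") <- c(v1 = 2, v2 = 9, v3 = 1)

# Sort the values and use the resulting names to select the columns.
ord <- names(sort(attr(df, "myattrib")))
df_sorted <- df[, ord]
names(df_sorted)
```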

CodePudding user response:

We can use the following (the \(x) shorthand for an anonymous function requires R 4.1 or later):

df[order(unlist(sapply(df , \(x) attributes(x))))]
Output:
  v3 v1 v2
1  1  1  1
2  2  2  2
3  3  3  3
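With hundreds of variables, some columns may lack the attribute. In that case attributes(x) returns NULL, unlist() silently drops it, and the indices from order() no longer line up with the columns. A hedged sketch that sorts such columns to the end instead, assuming the attribute is a single number where present (get_key and the extra column v4 are hypothetical, added only for illustration):

```r
df <- data.frame(v1 = 1:3, v2 = 1:3, v3 = 1:3, v4 = 1:3)
attr(df$v1, "myattri") <- 2
attr(df$v2, "myattri") <- 9
attr(df$v3, "myattri") <- 1
# v4 deliberately has no "myattri" attribute.

# Return the attribute as a number, or NA when the column does not have it.
get_key <- function(x) {
  a <- attr(x, "myattri", exact = TRUE)
  if (is.null(a)) NA_real_ else as.numeric(a)
}

key <- vapply(df, get_key, numeric(1))

# na.last = TRUE places columns without the attribute after all others.
df_sorted <- df[, order(key, na.last = TRUE)]
names(df_sorted)
```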