Sort each column in a data frame and get common in the top 50 percentage

Time:08-06

I have a data frame like this:

data <- mtcars[, c("mpg", "drat", "wt")]
data <- tibble::rownames_to_column(data, "Names")

I need to sort each column (in descending order, as the code below does) and keep the top 50% of rows. From those subsets, I then need the rows that are common to all columns. For this purpose, I am using the following code:

library(dplyr)
# For each column: round, sort descending, keep the first half of the rows
mpg <- data %>% dplyr::select(Names, mpg) %>% mutate_if(is.numeric, round, digits = 3) %>% arrange(desc(mpg)) %>% filter(row_number() / n() <= .5)
drat <- data %>% dplyr::select(Names, drat) %>% mutate_if(is.numeric, round, digits = 3) %>% arrange(desc(drat)) %>% filter(row_number() / n() <= .5)
wt <- data %>% dplyr::select(Names, wt) %>% mutate_if(is.numeric, round, digits = 3) %>% arrange(desc(wt)) %>% filter(row_number() / n() <= .5)

The code above sorts each column in descending order and keeps the top 50% of its rows.

plyr::join_all(list(mpg, drat, wt), by = "Names", type = "inner")

This prints the rows that are in the top 50% of every column.

Is there a package or function that can do this in a single line?

CodePudding user response:

Sure you can. Here's a single-line dplyr solution:

library(dplyr)

filter(data, if_all(c(mpg, drat, wt), ~rank(-., ties.method = "first")/n() <= 0.5))

#>      Names  mpg drat   wt
#> 1 Merc 280 19.2 3.92 3.44
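The rank-based idea in this answer can also be sketched in base R without any packages. The helper name `top_share_common` and its `prop` argument are hypothetical, introduced only for illustration. It skips the rounding step from the question and breaks ties by row order, which matches a stable descending sort followed by keeping the first half.

```r
# Hypothetical helper (not from any package): keep the rows that fall in the
# top `prop` share of every column listed in `cols`, ranking in descending order.
top_share_common <- function(df, cols, prop = 0.5) {
  n <- nrow(df)
  # One logical column per variable: TRUE where the descending rank,
  # as a fraction of n, is within the requested share.
  in_top <- vapply(
    cols,
    function(col) rank(-df[[col]], ties.method = "first") / n <= prop,
    logical(n)
  )
  # Keep only rows flagged TRUE in every column.
  df[rowSums(in_top) == length(cols), , drop = FALSE]
}

# Same data as in the question, built with base R only.
data <- mtcars[, c("mpg", "drat", "wt")]
data <- data.frame(Names = rownames(data), data, row.names = NULL)

result <- top_share_common(data, c("mpg", "drat", "wt"))
print(result)
```

On `mtcars` this should return the same single row, Merc 280, as the dplyr one-liner. Passing a different `prop` (for example `0.25`) changes the share kept per column.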

CodePudding user response:

Simply compare each column with its median:

data %>% filter(mpg >= median(mpg) & drat >= median(drat) & wt >= median(wt))
#>      Names  mpg drat   wt
#> 1 Merc 280 19.2 3.92 3.44
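One point worth checking before relying on the median comparison: with ties at the median, `>= median(x)` can keep slightly more than half of the rows, whereas the rank-based filter keeps exactly half. The short base R sketch below illustrates this on `mtcars$mpg`. The variable names are illustrative only.

```r
mpg <- mtcars$mpg
n <- length(mpg)

# Rank-based rule: exactly the top half, ties broken by row order.
by_rank <- rank(-mpg, ties.method = "first") / n <= 0.5

# Median-based rule: every value at or above the median, ties included.
by_median <- mpg >= median(mpg)

sum(by_rank)    # 16
sum(by_median)  # 17

# Row kept by the median rule but not by the rank rule.
extra <- rownames(mtcars)[by_median & !by_rank]
print(extra)    # "Pontiac Firebird"
```

For this data set the final intersection is the same under both rules, because the extra row does not pass the `drat` and `wt` conditions. On other data the two approaches may differ.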