As the title says, I'd like to get the row names of the top N values in each column. I have a data frame with TV series characters in the rows and one column per episode. To get a list of the most relevant characters, I think taking the three characters who speak the most in each episode might be a good approach. I have thought about looping through each column, ordering it, taking the names, and adding them to an array, but there must be a more efficient way to do this. Thank you all very much in advance.
CodePudding user response:
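Here is one way to do it, using the mtcars dataset. It uses order() to sort the row names by the values of each column and tail() to keep the last three, which correspond to the three largest values (the largest comes last). sapply() applies this to every column and returns a character matrix with one column per variable: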
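# Row names to pick from
rn <- row.names(mtcars)
# Sort the row names by x (ascending) and keep the last three, i.e. the three largest values
f <- function(x) tail(rn[order(x)], n = 3)
# Apply f to every column
sapply(mtcars, f)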
##      mpg              cyl                disp                  hp
## [1,] "Lotus Europa"   "Pontiac Firebird" "Chrysler Imperial"   "Camaro Z28"
## [2,] "Fiat 128"       "Ford Pantera L"   "Lincoln Continental" "Ford Pantera L"
## [3,] "Toyota Corolla" "Maserati Bora"    "Cadillac Fleetwood"  "Maserati Bora"
##      drat             wt                    qsec            vs
## [1,] "Ford Pantera L" "Cadillac Fleetwood"  "Toyota Corona" "Fiat X1-9"
## [2,] "Porsche 914-2"  "Chrysler Imperial"   "Valiant"       "Lotus Europa"
## [3,] "Honda Civic"    "Lincoln Continental" "Merc 230"      "Volvo 142E"
##      am              gear             carb
## [1,] "Ferrari Dino"  "Ford Pantera L" "Ford Pantera L"
## [2,] "Maserati Bora" "Ferrari Dino"   "Ferrari Dino"
## [3,] "Volvo 142E"    "Maserati Bora"  "Maserati Bora"
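To connect this back to the characters-by-episode case, here is a rough sketch of the same idea with the top N as a parameter and the largest value listed first, followed by a tally of how often each character makes an episode's top three. The data frame lines_spoken, its numbers, and the helper name top_n_names are all made up for illustration; they are assumptions, not taken from the question.

```r
# Hypothetical data: rows are characters, columns are episodes,
# values are the number of lines spoken (made-up numbers).
lines_spoken <- data.frame(
  ep1 = c(40, 12, 33, 5, 21),
  ep2 = c(10, 45, 30, 8, 25),
  ep3 = c(38, 20, 2, 15, 41),
  row.names = c("Alice", "Bob", "Carol", "Dave", "Erin")
)

# Hypothetical helper: row names of the n largest values of x, largest first.
top_n_names <- function(x, rn, n = 3) {
  rn[head(order(x, decreasing = TRUE), n)]
}

# One column of names per episode (a character matrix).
top3 <- sapply(lines_spoken, top_n_names,
               rn = row.names(lines_spoken), n = 3)

# Count how often each character appears in an episode's top three.
appearances <- sort(table(top3), decreasing = TRUE)
```

With this toy data, Erin is in the top three of every episode, so she comes first in appearances. Note that ties at the cutoff are broken by row order, so if two characters have the same count in an episode, only the one that order() places first is kept.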