I am trying to sort a vector of object sizes in descending order and create a data frame. The sorting fails because the sizes carry unit suffixes (e.g. Kb, Mb), so they are stored as character strings and cannot be sorted by their actual size. How can I sort them in ascending or descending order?
Example 1:
library(dplyr)
l <- list(1:1e6, 1:1e1, 1:1e3, 1:1e7)
l <- sapply(
l,
function(x){
object.size(x) %>% format(units = "auto")
}
)
# Alt. A: Sorting the vector before coercing to dataframe
sort(l) %>% as.data.frame()
A data.frame: 4 × 1
.
<chr>
96 bytes
4 Kb
38.1 Mb
3.8 Mb
# Alt. B: Coerce to dataframe then sort using arrange()
as.data.frame(l) %>% arrange(desc(names(.)[1]))
A data.frame: 4 × 1
l
<chr>
3.8 Mb
96 bytes
4 Kb
38.1 Mb
Desired output:
A data.frame: 4 × 1
l
<chr>
38.1 Mb
3.8 Mb
4 Kb
96 bytes
Answer:
The problem is that your sapply loop keeps only the formatted output, which is much harder to sort. Using purrr, you can store two values for each iteration (the raw size and its formatted label) in a data frame, bind the results together, and then sort on the raw size. So you can do:
library(dplyr)
l <- list(1:1e6, 1:1e1, 1:1e3, 1:1e7)
l_1 <- purrr::map_df(l, function(x) {
tibble(
size_raw = object.size(x),
size = size_raw %>% format(units = "auto")
)
})
l_1 %>%
arrange(-size_raw)
#> # A tibble: 4 × 2
#> size_raw size
#> <objct_sz> <chr>
#> 1 40000048 bytes 38.1 Mb
#> 2 4000048 bytes 3.8 Mb
#> 3 4048 bytes 4 Kb
#> 4 96 bytes 96 bytes
Created on 2022-03-16 by the reprex package (v2.0.1)
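If you would rather not depend on purrr, the same idea (sort on the raw byte counts, format only for display) can be sketched in base R. This is a minimal sketch, not the only way to do it; the names size_bytes, ord, size_label and df are just illustrative choices, and it assumes object.size() reports the same sizes as in the output above.

```r
# Same four example vectors as in the question
l <- list(1:1e6, 1:1e1, 1:1e3, 1:1e7)

# Raw sizes in bytes as a plain numeric vector, which sorts correctly
size_bytes <- vapply(l, function(x) as.numeric(object.size(x)), numeric(1))

# Positions of the list elements, largest first
ord <- order(size_bytes, decreasing = TRUE)

# Format only after sorting, so the unit suffixes never affect the order
size_label <- vapply(
  l[ord],
  function(x) format(object.size(x), units = "auto"),
  character(1)
)

# One-column data frame of formatted sizes in descending order
df <- data.frame(size = size_label, stringsAsFactors = FALSE)
df
```

For ascending order, drop decreasing = TRUE. If you also want the raw byte counts in the result, add size_bytes[ord] as a second column.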