How does `fct_reorder2` compute this result?

Time:10-04

Despite consulting this thread, I struggle to understand the following output:

df <- tibble::tribble(
  ~color,     ~a, ~b,
  "blue",      1,  2,
  "green",     6,  2,
  "purple",    3,  3,
  "red",       2,  3,
  "yellow",    5,  1
)

Which gives:

> fct_reorder2(df$color, df$a, df$b, .fun = min, .desc = TRUE)
[1] blue   green  purple red    yellow
Levels: purple green red blue yellow

I know you are supposed to use .fun differently with fct_reorder2. The min function here computes the minimum across all provided values, which here are the values in both df$a and df$b. I still would not expect the result I got. Can someone explain, please?

CodePudding user response:

One of the linked answers was downvoted for looking at the source, but you're asking about the how, so I think it makes sense to look at what the code for fct_reorder2 actually does.

# This is fine, just checking if it's a factor and assigning the value.
f <- check_factor(.f)
# Also fine, they're columns from a data.frame
stopifnot(length(f) == length(.x), length(.x) == length(.y))
# We're not using dots
ellipsis::check_dots_used()

With that out of the way, we can run the subsequent code with the original data:

summary <- tapply(seq_along(.x), f, function(i) .fun(.x[i], .y[i], ...))
# for us equivalent to
tapply(seq_along(df$a), df$color, function(i) {min(df$a[i], df$b[i])})
#   blue  green purple    red yellow 
#     1      2      3      2      1 

In this case, this is just the pairwise min of columns df$a and df$b. If you had multiple rows per color, it would use the minimum value from any row or column belonging to that factor level.
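To illustrate the multiple-rows case, here is a small sketch using the same tapply pattern. The data frame df2 and its values are made up for this example and are not part of the question.

```r
# Hypothetical data: "blue" and "green" each appear in two rows.
df2 <- data.frame(
  color = c("blue", "blue", "green", "green", "red"),
  a     = c(1, 7, 6, 4, 2),
  b     = c(2, 0.5, 3, 5, 3)
)

# Same pattern as in the fct_reorder2 source: min() is called once per
# level and receives all a values and all b values of that level.
summary2 <- tapply(
  seq_along(df2$a), df2$color,
  function(i) min(df2$a[i], df2$b[i])
)
summary2
# blue = 0.5, green = 3, red = 2
```

For blue the result is 0.5, which comes from column b of the second blue row, so the summary is not restricted to one row or one column.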

lvls_reorder(.f, order(summary, decreasing = .desc))

This just orders the levels by those values in descending order, so the color with the biggest pairwise minimum of columns a and b comes first. Ties keep their existing level order, which is alphabetical here, and that leads to the output we see.
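The whole computation can be reproduced in base R. This is a sketch of the effect, not the forcats implementation: the names level_min, ord and reordered are mine, and rebuilding the factor with factor(..., levels = ...) is an assumed stand-in for what lvls_reorder does to the levels.

```r
color <- factor(c("blue", "green", "purple", "red", "yellow"))
a <- c(1, 6, 3, 2, 5)
b <- c(2, 2, 3, 3, 1)

# One summary value per level: the minimum over that level's a and b values.
level_min <- tapply(seq_along(a), color, function(i) min(a[i], b[i]))

# Positions of the levels from largest to smallest summary value.
# Tied levels stay in their current (alphabetical) order.
ord <- order(level_min, decreasing = TRUE)
ord
# 3 2 4 1 5

# Stand-in for lvls_reorder: same values, levels rearranged.
reordered <- factor(color, levels = levels(color)[ord])
levels(reordered)
# "purple" "green" "red" "blue" "yellow"
```

The values of the factor are unchanged; only the order of its levels differs, which matches the output in the question.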

color    a  b  pmin  dense rank  descending order (ties sorted lexicographically)
blue     1  2  1     1           4
green    6  2  2     2           2
purple   3  3  3     3           1
red      2  3  2     2           3
yellow   5  1  1     1           5

CodePudding user response:

df <- tibble::tribble(
  ~color,     ~a, ~b,
  "blue",      1,  2,
  "green",     6,  2,
  "purple",    3,  3,
  "red",       2,  3,
  "yellow",    5,  1
)

library(dplyr)

df %>% mutate(
  nr = 1:5,                    # original row order, used to break ties
  min = ifelse(a <= b, a, b)   # row-wise minimum of a and b
) %>% arrange(desc(min), nr)

output

# A tibble: 5 x 5
  color      a     b    nr   min
  <chr>  <dbl> <dbl> <int> <dbl>
1 purple     3     3     3     3
2 green      6     2     2     2
3 red        2     3     4     2
4 blue       1     2     1     1
5 yellow     5     1     5     1

That should clear everything up.
