Waffle Chart in R: 99 icons instead of 100 icons

Time:10-17

Here is the data:

df <- structure(list(country = c("Australia", "Australia", "Australia", 
"South Korea", "South Korea", "South Korea"), parts = c("case_1", 
"case_2", "non_case", "case_1", "case_2", "non_case"), values = c(1, 
19, 80, 1, 29, 70)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))
library(tidyverse)
library(waffle)
df %>% ggplot(aes(label = parts, values = values)) +
    geom_pictogram(n_rows = 10, aes(color = parts),
                   family = 'fontawesome-webfont',
                   flip = TRUE, 
                   size = 6, 
                   show.legend = FALSE, 
                   make_proportional = TRUE
                   ) +
    scale_label_pictogram(
        name = "Confirmed cases",
        values = c("male"),
    ) +
    coord_equal() + theme_minimal() +
    facet_wrap(~country, nrow = 2, strip.position = "top")

The code above generates a waffle chart. With make_proportional = TRUE and n_rows = 10, I expected 100 icons for each country, but South Korea gets only 99. The only fix I have found is to calculate all the proportions first and use make_proportional = FALSE, but that takes extra work, and the behavior seems strange to me. I would appreciate any help with this.

CodePudding user response:

There seems to be a rounding problem inside waffle. According to the documentation, the make_proportional argument does the following:

compute proportions from the raw values? (i.e. each value n will be replaced with n/sum(n)); default is FALSE.

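Something appears to go wrong when the proportions are converted back to whole numbers. as.integer truncates toward zero instead of rounding, so a product that lands just below 29 in floating point becomes 28. Here is a demonstration, where n2 shows what currently happens and n3 shows the values that should be produced: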

library(dplyr)
df %>%
  group_by(country) %>%
  mutate(n = values/sum(values)) %>%
  ungroup() %>%
  mutate(n2 = as.integer(n*100),
         n3 = as.integer(round(n*100)))
#> # A tibble: 6 × 6
#>   country     parts    values     n    n2    n3
#>   <chr>       <chr>     <dbl> <dbl> <int> <int>
#> 1 Australia   case_1        1  0.01     1     1
#> 2 Australia   case_2       19  0.19    19    19
#> 3 Australia   non_case     80  0.8     80    80
#> 4 South Korea case_1        1  0.01     1     1
#> 5 South Korea case_2       29  0.29    28    29
#> 6 South Korea non_case     70  0.7     70    70

Created on 2022-10-16 with reprex v2.0.2
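
To make the cause visible, here is a minimal base R sketch of the same effect on the single value 29/100. It only illustrates floating-point truncation and does not call waffle; the variable name p is just for illustration.

```r
# 29/100 is stored as the nearest double to 0.29, which is slightly below 0.29,
# so multiplying it by 100 lands just under 29.
p <- 29 / 100

sprintf("%.15f", p * 100)    # prints the product with 15 decimals
p * 100 < 29                 # TRUE: the product is just below 29

as.integer(p * 100)          # truncates toward zero, giving 28
as.integer(round(p * 100))   # rounds first, giving 29
```

Rounding before the integer conversion is what separates n3 from n2 in the table above.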

As you can see, the result should match n3. One workaround is to add 1 to any value that is larger than its truncated counterpart n2 (this comparison relies on each country's values summing to 100, as they do here), which gives the desired result:

library(waffle)
library(magrittr)
library(ggplot2)
library(dplyr)
library(emojifont)
library(extrafont)
#> Registering fonts with R
df %>%
  group_by(country) %>%
  mutate(n = values/sum(values)) %>%
  ungroup() %>%
  # n2 mimics the truncation that happens inside waffle
  mutate(n2 = as.integer(n*100)) %>%
  # bump any value that the truncation would shrink
  mutate(values = ifelse(values-n2 > 0, values + 1, values)) %>%
  ggplot(aes(label = parts, values = values)) +
  geom_pictogram(n_rows = 10, aes(color = parts),
                 family = 'fontawesome-webfont',
                 flip = TRUE, 
                 size = 6, 
                 show.legend = FALSE, 
                 make_proportional = TRUE
  ) +
  scale_label_pictogram(
    name = "Confirmed cases",
    values = c("male"),
  ) +
  coord_equal() + theme_minimal() +
  facet_wrap(~country, nrow = 2, strip.position = "top")

Created on 2022-10-16 with reprex v2.0.2
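
As an alternative to nudging the raw values, the route mentioned in the question (computing the icon counts yourself and using make_proportional = FALSE) can be sketched in base R. The helper scale_to_total and the column name icons are hypothetical names introduced here, not part of waffle; the helper uses a largest-remainder scheme so that each group sums to exactly 100.

```r
# Hypothetical helper (not part of waffle): scale raw counts to integers
# that sum to `total`, using the largest-remainder method.
scale_to_total <- function(x, total = 100L) {
  scaled <- x / sum(x) * total
  # Snap values that are within floating-point noise of a whole number
  scaled <- round(scaled, 8)
  base <- floor(scaled)
  leftover <- as.integer(round(total - sum(base)))
  # Give the remaining units to the entries with the largest fractional parts
  idx <- order(scaled - base, decreasing = TRUE)[seq_len(leftover)]
  base[idx] <- base[idx] + 1
  as.integer(base)
}

# Same data as in the question, rebuilt as a plain data frame
df <- data.frame(
  country = rep(c("Australia", "South Korea"), each = 3),
  parts   = rep(c("case_1", "case_2", "non_case"), times = 2),
  values  = c(1, 19, 80, 1, 29, 70)
)

# Apply the helper within each country
df$icons <- ave(df$values, df$country, FUN = scale_to_total)

# Each country should now total 100 icons
tapply(df$icons, df$country, sum)
```

With counts prepared this way, the plot would map values = icons in aes() and set make_proportional = FALSE, so the package's own conversion step is not involved.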

CodePudding user response:

Not a solution, but there's something strange going on. A rounding effect, perhaps.

Inside the waffle package, the make_proportional option is handled by these lines:

if (params[["make_proportional"]]) {
  .x[["values"]] <- .x[["values"]] / sum(.x[["values"]])
  .x[["values"]] <- round_preserve_sum(.x[["values"]], digits = 2)
  .x[["values"]] <- as.integer(.x[["values"]] * 100)
}

However, applying the first and third lines to the South Korea values gives:

> df
# A tibble: 3 × 3
  country     parts    values
  <chr>       <chr>     <dbl>
1 South Korea case_1        1
2 South Korea case_2       29
3 South Korea non_case     70
> df[["values"]] <- df[["values"]] / sum(df[["values"]])
> df
# A tibble: 3 × 3
  country     parts    values
  <chr>       <chr>     <dbl>
1 South Korea case_1     0.01
2 South Korea case_2     0.29
3 South Korea non_case   0.7 
> df[["values"]] <- as.integer(df[["values"]] * 100)
> df
# A tibble: 3 × 3
  country     parts    values
  <chr>       <chr>     <int>
1 South Korea case_1        1
2 South Korea case_2       28
3 South Korea non_case     70

100 becomes 99!
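
The walk-through above skips the middle line, so here is a hedged sketch of all three steps. The function round_preserve_sum_sketch is a stand-in written for this example, modeled on the commonly shared largest-remainder helper of that name. It is an assumption about how that step behaves, not code taken from the package.

```r
# Stand-in for the middle step (an assumption, not the package's source):
# round to `digits` decimals while keeping the overall sum unchanged.
round_preserve_sum_sketch <- function(x, digits = 0) {
  up <- 10^digits
  x <- x * up
  y <- floor(x)
  # Bump the entries with the largest fractional parts until the sum matches
  indices <- utils::tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

values <- c(1, 29, 70)                                  # South Korea
props  <- values / sum(values)                          # step 1
props  <- round_preserve_sum_sketch(props, digits = 2)  # step 2 (stand-in)
icons  <- as.integer(props * 100)                       # step 3

icons       # 1 28 70
sum(icons)  # 99
```

Under that assumption the middle step returns 0.29 again, so the final multiplication still lands just below 29 and the truncation to 28 remains.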
