Changing the bins in stat_sum for overlapping points

Time:06-07

I am plotting the count of samples found at each site, so the more samples, the bigger the plot point should be. The counts range from 1 to 2500. By default, the legend bins the data into a range that I would like to extend at the lower end: it starts at 500, but I would like it to start at 100 without ending up with more than 10 bins.

I have tried to understand scale_size_binned(), since I feel this is where the fix belongs, but when I set the limits or the range, the legend gains an item for a smaller and a larger count with no numbers: a smaller circle above the 500 with no value and a larger circle below 2500 with no value. It also puts tick marks in the legend between values. In reality, all but 4 values are under 500, so I'm not even sure why the scale starts at 500 by default. The range is 12-2683 and the mean is 339. Here is some of my data:

# A tibble: 33 x 3
# Groups:   Site [20]
   Site   Sample    n
   <fct>  <chr> <int>
 1 1(20A) 20A     279
 2 1(20C) 20C      99
 3 2(20C) 20C     158
 4 3(25C) 25C     170
 5 4(25C) 25C     117
 6 5(20B) 20B      72
 7 6(20F) 20F     369
 8 7(19D) 19D     218
 9 8(20E) 20E    1044
10 9(20F) 20F     427

Here is the code I've used:

library(ggplot2)
library(RColorBrewer)

nb.cols <- 20
# Stretch the 12-color "Paired" palette to nb.cols colors (trying to find a palette with 33 colors)
mycolors <- colorRampPalette(brewer.pal(12, "Paired"))(nb.cols)

ggplot(data, aes(x = Sample, y = Site)) +
  stat_sum(aes(size = n, color = Site)) +
  scale_color_manual(values = mycolors) +
  guides(color = "none") +
  labs(x = "Sample", y = "Site", size = "Found") +
  theme_classic(base_size = 14, base_family = "serif")

CodePudding user response:

We could adapt the size legend with the scale_size_area() function and its breaks argument.

I have adapted the code a little: I replaced stat_sum() with geom_point() and mapped size to n in its aesthetics.

For your original data: scale_size_area(breaks = c(100, 500, 1000, 1500, 2000, 2500))

library(tidyverse)

ggplot(df, aes(x = Sample, y = Site)) +
  geom_point(aes(color = factor(n), size = n)) +
  scale_size_area(breaks = seq(100, 1000, 100)) +
  scale_color_manual(values = mycolors) +
  guides(color = "none") +
  labs(x = "Sample", y = "Site", size = "Found") +
  theme_classic(base_size = 14, base_family = "serif")


CodePudding user response:

@TarJae Thank you for your answer. I also figured out another way, which is to set the breaks and labels directly:

    scale_size_continuous(breaks = c(25, 50, 100, 500, 1000, 2500),
                          labels = c(25, 50, 100, 500, 1000, 2500))

Then I do not need scale_size_area.
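
For anyone who wants to try this end to end, here is a minimal sketch of the scale_size_continuous() approach. It makes a few assumptions that are not in the thread: the data frame name df is illustrative, it holds only the ten rows printed in the question, and limits is set to the 12-2683 range the question reports so that breaks outside these ten rows are not dropped from the legend. The labels argument is left out because the default labels are generated from the breaks.

```r
library(ggplot2)

# First ten rows of the tibble shown in the question (the full data has 33 rows)
df <- data.frame(
  Site   = c("1(20A)", "1(20C)", "2(20C)", "3(25C)", "4(25C)",
             "5(20B)", "6(20F)", "7(19D)", "8(20E)", "9(20F)"),
  Sample = c("20A", "20C", "20C", "25C", "25C",
             "20B", "20F", "19D", "20E", "20F"),
  n      = c(279L, 99L, 158L, 170L, 117L, 72L, 369L, 218L, 1044L, 427L)
)

# Legend breaks taken from the second answer
size_breaks <- c(25, 50, 100, 500, 1000, 2500)

p <- ggplot(df, aes(x = Sample, y = Site)) +
  geom_point(aes(size = n, color = Site)) +
  scale_size_continuous(
    breaks = size_breaks,
    limits = c(12, 2683)  # assumption: the full data range reported in the question
  ) +
  guides(color = "none") +
  labs(x = "Sample", y = "Site", size = "Found") +
  theme_classic(base_size = 14)

print(p)
```

With the full 33-row data the limits argument should not be needed, since the data already spans all six breaks.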
