group_by is sorting numbers out of order in R

Time:07-06

I'm trying to make a stacked histogram, and have run into issues grouping/sorting data that I haven't been able to resolve. For example, when I group my data, the output says that 100 < 90 (I think because 1 < 9), and I can't figure out how to get R to group using the entire number.

Here's an example:

library(tidyverse)

# Fake data
Sims <- seq(1,100,1)
CoOc <- sample(90:140, 100, replace = TRUE)
Out <- sample(c("A Wins", "B Wins", "Tie"), 100, replace = TRUE)
df <- data.frame(cbind(Sims, CoOc, Out))

# Ordering data for stacked histogram
df2 <- df %>% 
  group_by(CoOc, Out) %>% # Grouping by CoOc for the x-axis ordering, and then for Out to get outcome (A Wins, B Wins, Tie) grouped together
  summarize(Counts = n())

# Plotting
ggplot(df2) +
  geom_bar(aes(fill = Out, y = Counts, x = CoOc),
           position = "stack", stat = "identity") +
  labs(title="Example",
       x="CoOc",
       y="Num") +
  scale_fill_manual(name = "Outcome",
                    values = c("#AD1457", "#B3E5FC", "#FF9800")) +
  theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "white"),
        plot.caption.position = "plot",
        plot.caption = element_text(hjust = 0))

Which gives me a plot where the x-axis initially ascends, but then drops down to a lower number.

Thanks so much for any help. I'm sorry if this is a dumb question!

CodePudding user response:

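When you call ggplot, just convert "CoOc" to numeric (x = as.numeric(CoOc)) and then plot. Because the data frame is built from cbind(), which turns every column into character, CoOc is ordered as text, and as text "100" comes before "90".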

library(tidyverse)

# Fake data
Sims <- seq(1,100,1)
CoOc <- sample(90:140, 100, replace = TRUE)
Out <- sample(c("A Wins", "B Wins", "Tie"), 100, replace = TRUE)
df <- data.frame(cbind(Sims, CoOc, Out))

# Ordering data for stacked histogram
df2 <- df %>% 
  group_by(CoOc, Out) %>% # Grouping by CoOc for the x-axis ordering, and then for Out to get outcome (A Wins, B Wins, Tie) grouped together
  summarize(Counts = n()) 

# Plotting
ggplot(df2) +
  geom_bar(aes(fill = Out, y = Counts, x = as.numeric(CoOc)),
           position = "stack", stat = "identity") +
  labs(title="Example",
       x="CoOc",
       y="Num") +
  scale_fill_manual(name = "Outcome",
                    values = c("#AD1457", "#B3E5FC", "#FF9800")) +
  theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "white"),
        plot.caption.position = "plot",
        plot.caption = element_text(hjust = 0))
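
As an alternative to converting inside aes(), the sketch below shows where the text ordering likely comes from and one way to avoid it at the source. It is only an illustration in base R: the seed and the name df_fixed are made up for this example, not part of the original post.

```r
# Sketch: why CoOc ends up ordered as text, and one way to keep it numeric.
set.seed(1)  # arbitrary seed, assumed here only so the example is reproducible

Sims <- seq(1, 100, 1)
CoOc <- sample(90:140, 100, replace = TRUE)
Out  <- sample(c("A Wins", "B Wins", "Tie"), 100, replace = TRUE)

# cbind() builds a matrix, and a matrix can hold only one type.
# Because Out is character, Sims and CoOc are converted to character too.
m <- cbind(Sims, CoOc, Out)
is.character(m)

# Strings are compared character by character, so "100" sorts before "90".
"100" < "90"

# Passing the vectors straight to data.frame() keeps each column's own type.
# df_fixed is a hypothetical name used only in this sketch.
df_fixed <- data.frame(Sims, CoOc, Out)
is.numeric(df_fixed$CoOc)
```

With a data frame built this way, group_by(CoOc, Out) and the x-axis should follow numeric order without as.numeric() in the plot call. One caveat: in R versions before 4.0, data.frame() turned character columns into factors by default, and as.numeric() on a factor returns level codes rather than the original values, so as.numeric(as.character(CoOc)) would be the safer conversion there.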
