geom_bar(), Y-axis goes way above data value

Time:10-13

I am trying to visualize a data frame from a survey. I want to draw a bar plot with geom_bar() that puts "Life Satisfaction" on the y-axis and "Family Values" on the x-axis. Note that the survey answers for Life Satisfaction range from 1 (very unsatisfied) to 10 (very satisfied).

But for some reason when I try to plot this barplot, the y-axis goes way above 10, and I don't understand why.

This is my code:

df1 %>%
  filter(df1$B_COUNTRY_ALPHA == "PAK") %>%
  drop_na(Q49) %>%
  ggplot(aes(x = Q1, y = Q49, fill = B_COUNTRY_ALPHA)) +
  geom_bar(stat = "identity") +
  labs(x = "Family Value",
       y = "Life Satisfaction")

This is the graph that I get when I run it: [image: the resulting bar plot]

These are the first 20 rows of the data that I want to work with: [image: first 20 rows of the data]

On a side note: I was thinking of taking the mean of the Life Satisfaction data, which might make the plot make sense, but I am not sure how to do that.

@GregorThomas I followed your instructions and I got this.

structure(list(B_COUNTRY_ALPHA = c("PAK", "PAK", "PAK", "PAK", 
"PAK", "PAK", "PAK", "PAK", "PAK", "PAK", "PAK", "PAK", "PAK", 
"PAK", "PAK", "PAK", "PAK", "PAK", "PAK", "PAK"), Q49 = c(7, 
10, 10, 5, 1, 6, 6, 10, 10, 10, 4, 4, 8, 10, 10, 10, 10, 9, 10, 
8), Q1 = c(1, 2, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1), Q2 = c(1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 
4, 1, 2, 2, 2), Q3 = c(2, 2, 1, 1, 3, 1, 2, 2, 2, NA, 2, 4, 1, 
1, 2, 2, 4, 2, 4, 2), Q4 = c(3, 4, 2, 4, 2, 3, 4, 2, 1, 4, 4, 
4, 4, 1, 3, 4, 3, 4, 4, 2), Q5 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 2, 1, 2, 1, 1, 1, 4, 1, 1, 4), Q6 = c(1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 4), Q57 = c(2, 2, 2, 1, 1, 
1, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1), Q106 = c(7, 5, 
10, 4, 10, 7, 1, 10, 10, 10, 1, 10, 1, 10, 10, 10, 9, 4, 10, 
6), Q107 = c(7, 6, 5, 5, 10, 3, 1, 10, 10, NA, 1, 1, 1, 10, 3, 
10, 10, 8, 10, 4), Q108 = c(7, 9, 1, 4, 1, 1, 10, 10, 5, 10, 
10, 10, 1, 10, 10, 10, 10, 10, 1, 3), Q109 = c(6, 4, 1, 4, 1, 
1, 1, 10, 10, 1, 6, 2, 10, 5, 10, 1, 10, 9, 1, 4), Q110 = c(6, 
3, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 10, 1, 10, 3, 1, 3), Q112 = 
c(8, 
8, 10, 6, 10, 5, 10, 10, 10, 10, NA, 10, 10, 10, 10, 10, 10, 
10, 10, 7), Q163 = c(6, 2, 10, 7, 9, 10, 10, 10, 10, NA, 10, 
10, 6, 10, 3, NA, 8, 7, NA, 9), Q164 = c(4, 9, 10, 8, 10, 10, 
10, 10, 10, 10, 10, 10, 10, 10, 10, NA, 8, 10, 10, 10), Q222 = c(2, 
1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 4, NA, 1, NA, 2, 3, NA, 3), 
Q260 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 
1, 1, 0, 1), Q262 = c(33, 21, 60, 18, 60, 50, 45, 29, 62, 
46, 35, 40, 30, NA, 45, NA, 30, 50, 36, 34), Q273 = c(1, 
6, 1, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
Q275 = c(0, 2, 3, 3, 3, 2, 3, 2, 4, 0, 0, 0, 1, NA, 3, NA, 
1, 1, 0, 1), Q281 = c(8, 0, 3, 0, 10, 3, 4, 6, 3, 8, 4, 4, 
4, 0, 5, 0, 0, 0, 9, 0)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -20L))

Answer:

Here's a couple ideas using your sample data:

  • Use a dodged bar plot:
sample_data %>%
  ggplot(aes(x = factor(Q1), fill = factor(Q49))) +
  geom_bar(position = position_dodge(preserve = 'single')) +
  labs(x = "Family Value",
       y = "Count of Responses",
       fill = "Life Satisfaction")


  • Use facets:
sample_data %>%
  ggplot(aes(x = factor(Q49), fill = factor(Q49))) +
  geom_bar() +
  labs(x = "Life Satisfaction",
       y = "Count of Responses",
       fill = "Life Satisfaction") +
  facet_wrap(vars(paste("Family Value", Q1)))


  • Use a heat map:
sample_data %>%
  ggplot(aes(x = factor(Q1), y = factor(Q49))) +
  geom_bin2d() +
  coord_fixed() +
  labs(y = "Life Satisfaction", x = "Family Value")

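On the side note about the mean: as far as I can tell, the y-axis goes above 10 because geom_bar(stat = "identity") stacks one segment per row, so each bar's height is the sum of Q49 within that Q1 group rather than any single answer. Below is a minimal sketch of the mean-per-group idea from the question. It assumes the two relevant columns from the dput() output are rebuilt as sample_data, and the names satisfaction_summary, total_q49, mean_q49 and mean_plot are just illustrative:

```r
library(dplyr)
library(ggplot2)

# The two columns used here, copied from the dput() output in the question
sample_data <- data.frame(
  Q1  = c(1, 2, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  Q49 = c(7, 10, 10, 5, 1, 6, 6, 10, 10, 10, 4, 4, 8, 10, 10, 10, 10, 9, 10, 8)
)

# One row per Family Value answer: the sum is what the stacked bars
# were showing, the mean stays on the 1-10 survey scale
satisfaction_summary <- sample_data %>%
  group_by(Q1) %>%
  summarise(
    n         = n(),
    total_q49 = sum(Q49, na.rm = TRUE),
    mean_q49  = mean(Q49, na.rm = TRUE)
  )

# geom_col() draws each bar at the y value it is given (here, the mean)
mean_plot <- ggplot(satisfaction_summary, aes(x = factor(Q1), y = mean_q49)) +
  geom_col() +
  labs(x = "Family Value",
       y = "Mean Life Satisfaction")

mean_plot
```

With these 20 rows, the Q1 = 1 group has 18 responses summing to 143, which is why the original bar is so tall. Keep in mind that groups with only one or two responses give a mean that is not very informative, so it may be worth showing n alongside it.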
