plotting means in a horizontal bar with a vertical line

Time:04-30

ID  score1  score 2  score 3  score 4
1   200     300      400      -200
2   250     -310     -470     -200
3   210     400      480      -200
4   220     -10      -400     -200
5   150     -50      400      -200

I am new to R and want to make a graph that presents the mean of each score. The scores should be listed along the Y axis, with a vertical line representing 0. For every score whose mean is above zero, a horizontal bar should extend from the center to the right; for every score whose mean is below zero, a horizontal bar should extend from the center to the left. Thanks for the help!

CodePudding user response:

You could achieve your desired result by first converting your dataset to long format and by computing the means per score afterwards. After these data wrangling steps you could plot the means using ggplot2 via geom_col and add a vertical zero line using geom_vline:

df <- data.frame(
          ID = c(1L, 2L, 3L, 4L, 5L),
      score1 = c(200L, 250L, 210L, 220L, 150L),
     score.2 = c(300L, -310L, 400L, -10L, -50L),
     score.3 = c(400L, -470L, 480L, -400L, 400L),
     score.4 = c(-200L, -200L, -200L, -200L, -200L)
)

library(dplyr)
library(tidyr)
library(ggplot2)

df1 <- df |> 
  tidyr::pivot_longer(-ID, names_to = "score") |> 
  group_by(score) |> 
  summarise(value = mean(value))

ggplot(df1, aes(value, score)) +
  geom_vline(xintercept = 0) +
  geom_col()
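As a quick cross-check of the means plotted above, here is a minimal base R sketch that is not part of the original answer. It assumes the same example data; the name score_means is made up for this illustration. colMeans computes the per-column means, and barplot with horiz = TRUE plus abline gives a rough equivalent of the ggplot2 chart.

```r
# Same example data as in the answer above
df <- data.frame(
  ID = 1:5,
  score1 = c(200, 250, 210, 220, 150),
  score.2 = c(300, -310, 400, -10, -50),
  score.3 = c(400, -470, 480, -400, 400),
  score.4 = c(-200, -200, -200, -200, -200)
)

# Mean of every column except ID (score_means is a name chosen for this sketch)
score_means <- colMeans(df[, -1])
print(score_means)

# Base graphics version: horizontal bars plus a vertical line at zero
barplot(score_means, horiz = TRUE, las = 1)
abline(v = 0)
```

For this data the means are 206, 66, 82 and -200, so only the last bar extends to the left of the zero line.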

EDIT: To label the bars you could use geom_text. The tricky part is aligning the labels. To this end I use an ifelse to right-align (hjust = 1) the labels for a positive mean and left-align (hjust = 0) them for a negative mean. Actually I used 1.1 and -.1 to add some padding between the label and the end of the bar. The axis labels can be set via the labels argument of the scale, which in your case is scale_y_discrete. Personally I prefer to use a named vector, which assigns labels to categories in the data.

ggplot(df1, aes(value, score)) +
  geom_vline(xintercept = 0) +
  geom_col() +
  geom_text(aes(label = value, hjust = ifelse(value > 0, 1.1, -.1)), color = "white") +
  scale_y_discrete(labels = c("score1" = "Test1", "score.2" = "Test2", "score.3" = "Test3", "score.4" = "Test4"))
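To illustrate why a named vector is convenient for the axis labels, here is a small sketch under the assumption that df1 holds the four means from the example data. The name score_labels and the shuffled order are made up for this illustration. Because the labels are matched by name, the order in which they are written does not matter.

```r
library(ggplot2)

# Means from the example data (206, 66, 82 and -200)
df1 <- data.frame(
  score = c("score1", "score.2", "score.3", "score.4"),
  value = c(206, 66, 82, -200),
  stringsAsFactors = FALSE
)

# Deliberately listed out of order: the names, not the positions, do the matching
score_labels <- c("score.4" = "Test4", "score1" = "Test1",
                  "score.3" = "Test3", "score.2" = "Test2")

# Looking up by name returns the right label for each category
print(score_labels[df1$score])

p <- ggplot(df1, aes(value, score)) +
  geom_vline(xintercept = 0) +
  geom_col() +
  scale_y_discrete(labels = score_labels)
print(p)
```

An unnamed vector, by contrast, is applied in the order of the axis breaks, so it would silently mislabel the bars if the categories were reordered.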


CodePudding user response:

A similar approach to stefan's, but with a slightly different choice of functions:

  1. The data:
dat <- structure(list(ID = 1:5, score1 = c(200L, 250L, 210L, 220L, 150L
), score2 = c(300L, -310L, 400L, -10L, -50L), score3 = c(400L, 
-470L, 480L, -400L, 400L), score4 = c(-200L, -200L, -200L, -200L, 
-200L)), class = "data.frame", row.names = c(NA, -5L))
  2. The chain of functions
library(tidyverse)  # provides dplyr, purrr, tidyr and ggplot2

dat %>% 
  select(-ID) %>% 
  map_df(mean) %>% 
  pivot_longer(everything(), names_to = "score", values_to = "means") %>%
  ggplot() +
  coord_flip() +
  geom_col(aes(x = score, y = means))
  3. The result

[image: the resulting horizontal bar chart]

In case you want to change the labels on the tick marks ("score1", "score2", etc.) to other labels, you can use scale_x_discrete. In addition, if you want to show the numeric value at the end of each bar, you can use geom_text with hjust to adjust the label positions.

For example:

dat %>% 
  select(-ID) %>% 
  map_df(mean) %>% 
  pivot_longer(everything(), names_to = "score", values_to = "means") %>%
  ggplot() +
  coord_flip() +
  geom_col(aes(x = score, y = means)) +
  scale_x_discrete(labels = c("Test A", "Test B", "Test C", "Test D")) +
  geom_text(aes(x = score, y = means, label = means), 
            hjust = c(-0.5, -0.5, -0.5, 1.1))
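The hard-coded hjust vector above only fits this particular data, since it relies on knowing that the last mean is the negative one. As a hedged variation (not the answerer's code), the justification could be derived from the sign of each mean instead. The names means_long and label_hjust are made up for this sketch.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(ggplot2)

# Same example data as in the answer above
dat <- data.frame(
  ID = 1:5,
  score1 = c(200, 250, 210, 220, 150),
  score2 = c(300, -310, 400, -10, -50),
  score3 = c(400, -470, 480, -400, 400),
  score4 = c(-200, -200, -200, -200, -200)
)

means_long <- dat %>%
  select(-ID) %>%
  map_df(mean) %>%
  pivot_longer(everything(), names_to = "score", values_to = "means") %>%
  # Positive means: label just right of the bar end; negative means: just left of it
  mutate(label_hjust = ifelse(means >= 0, -0.5, 1.1))

p <- ggplot(means_long, aes(x = score, y = means)) +
  geom_col() +
  geom_text(aes(label = means, hjust = label_hjust)) +
  coord_flip()
print(p)
```

With this data the computed values match the hard-coded ones (-0.5 for the three positive means, 1.1 for score4), but the code keeps working if the signs of the means change.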
