I have a very simple dataframe called comparison_word_avg:
  index uniform   blur man_check_no_blur
1     1  94.71% 93.83%            91.96%
2     2  87.14% 86.08%            83.04%
3     3  81.29% 88.95%            80.36%
4     4  79.71% 71.59%            75.89%
5     5  75.57% 66.43%            67.86%
I would like a line chart with the index (1 to 5) on the x-axis and the percentage (0% to 100%) on the y-axis, and three lines representing the three columns "uniform", "blur", and "man_check_no_blur".
What I tried:
ggplot(comparison_word_avg, aes(x = what)) +
  geom_line(aes(y = uniform), color = "blue") +
  geom_line(aes(y = blur), color = "red") +
  geom_line(aes(y = man_check_no_blur), color = "green")
It gives me this message three times:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
Thanks for your help! Best,
Answer:
Because they contain a percent sign, your columns are stored as character, not numeric. This means ggplot does not know that you want them treated as numbers. To convert them, we can strip the percent sign, convert to numeric, then divide by 100 to get the percentages as proportions. When we plot, we can ask ggplot to label these proportions as percentages again.
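As a rough illustration of that conversion step on its own, here is a minimal base R sketch. The vector `pct` is a made-up stand-in holding three values from the `uniform` column; it is not part of your actual data frame.

```r
# Hypothetical stand-in: percentages as they look when stored as text
pct <- c("94.71%", "87.14%", "81.29%")

# Strip the "%" sign, convert to numeric, and rescale to proportions
prop <- as.numeric(sub("%", "", pct, fixed = TRUE)) / 100

# The result is now numeric, so ggplot can place it on a continuous axis
is.numeric(prop)
```

The `fixed = TRUE` argument just tells `sub` to treat "%" as a literal string instead of a regular expression; for this pattern it makes no practical difference.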
This only solves part of the problem. In ggplot, it is better for your data to be in long format, i.e. to have all the percentages in a single column, with another column labelling each observation according to the series it came from. If we do this, we can map the series to the color aesthetic, which will automatically add a legend for us. We can use pivot_longer to easily reshape your data into long format.
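To see what the reshaping does in isolation, here is a small sketch, assuming the tidyr package is installed. The data frame `wide` is a hypothetical two-row excerpt of your data with the percentages already converted to proportions.

```r
library(tidyr)

# Hypothetical two-row excerpt, already converted to numeric proportions
wide <- data.frame(
  index = 1:2,
  uniform = c(0.9471, 0.8714),
  blur = c(0.9383, 0.8608),
  man_check_no_blur = c(0.9196, 0.8304)
)

# Stack every column except the first (index) into name/value pairs
long <- pivot_longer(wide, -1)

long
```

With the default arguments, the series labels end up in a column called `name` and the numbers in a column called `value`, so two rows and three series become six rows. Those default column names are what the plotting code maps to `color` and `y`.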
Putting this together, we have:
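library(tidyverse)

comparison_word_avg %>%
  # strip "%" from the character columns and convert them to proportions
  mutate(across(where(is.character), ~ as.numeric(sub("%", "", .x)) / 100)) %>%
  # reshape to long format: all columns except the first go into name/value
  pivot_longer(-1) %>%
  ggplot(aes(index, value, color = name)) +
  # on ggplot2 versions before 3.4.0, use size = 2 instead of linewidth = 2
  geom_line(linewidth = 2) +
  scale_color_brewer(palette = "Set1") +
  labs(color = "") +
  # label the proportions as percentages again
  scale_y_continuous(labels = scales::percent, name = "Percent") +
  theme_bw(base_size = 16)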