I am starting off without including the data. The problem arises when using geom_smooth with many data points (i.e. a large data set), so a minimal example is difficult to construct (I tried). I can provide the data if requested.
I have scores on several variables and want to see trends in these scores across the age of respondents (cross-sectional data). Data are now in long format (so the original variables are all under the column 'name').
Like this:
     age name     value
   <dbl> <chr>    <dbl>
 1    40 mo_clean     1
 2    40 mo_groc      3
 3    40 mo_trans     1
 4    40 mo_digi      3
 5    40 mo_emo       3
 6    40 mo_activ     1
 7    40 mo_supv      1
 8    40 mo_doct      1
 9    39 mo_clean     1
10    39 mo_groc      1
# … with 42,030 more rows
I want to:
- use geom_smooth and geom_label, and
- then switch to ggrepel::geom_label_repel to avoid overlapping labels.
Getting labels to work with geom_smooth turned out to be difficult, but I managed with the code below:
library(dplyr)    # %>%, group_by(), do(), filter()
library(broom)    # augment()
library(ggplot2)
library(ggrepel)

df %>%
  {
    ggplot(df, aes(age, value, label = name, color = name)) +
      geom_smooth(se = FALSE) +
      guides(color = "none") +
      geom_label(
        # label position: loess fit at each group's maximum age
        data = group_by(., name) %>%
          do(augment(loess(value ~ age, .))) %>%
          filter(age == max(age)),
        aes(age, .fitted), nudge_x = 2
      )
  } +
  scale_x_continuous(breaks = seq(35, 65, by = 5)) +
  xlab("Age") +
  ylab(" ") +
  theme(text = element_text(size = 14))
which gives this result:
Now, as anticipated, substituting geom_label with geom_label_repel does not work, due to the many data points. I get the following message and warning:
`geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
Warning message:
ggrepel: 720 unlabeled data points (too many overlaps). Consider increasing max.overlaps
and all labels in the figure are dropped.
Increasing max.overlaps is not the way to go, I assume. Just to illustrate the extreme case, with max.overlaps = Inf:
[...]
geom_label_repel(
  data = group_by(., name) %>%
    do(augment(loess(value ~ age, .))) %>%
    filter(age == max(age)),
  aes(age, .fitted),
  max.overlaps = Inf
)
[...]
Any hints, for instance on where to find help (or even code suggestions)? Lots of web searches have not given me what I'm looking for: how to combine geom_smooth with geom_label_repel to get a nice plot with each smoothed line labelled, without the labels overlapping.
—-
My question refers to geom_smooth with lots of data points, the linked question (
So, the error message leads down the wrong road in this case.
Concerning tjebo's suggestion in an alternative answer: I believe adding stat = "unique" is a better solution than the second chunk in tjebo's answer (although the resulting figure is still not what I want).
library(tidyverse)
library(ggrepel)

df <-
  diamonds %>% select(age = table, name = color, value = price)

df %>%
  {
    ggplot(df, aes(age, value, label = name, color = name)) +
      geom_smooth(se = FALSE) +
      geom_label_repel(
        data = group_by(., name) %>%
          do(broom::augment(loess(value ~ age, .))) %>%
          filter(age == max(age)),
        aes(age, .fitted),
        stat = "unique"  # collapse duplicated label rows
      )
  }
#> `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
CodePudding user response:
I am answering because I'd like to demonstrate a way to reproduce the problem. There is hardly any problem that cannot be reproduced with one of the built-in data sets. For your problem, you could use the diamonds data set. It contains a similar number of rows and has comparable columns (numeric and categorical).
library(tidyverse)
library(ggrepel)

df <-
  diamonds %>% select(age = table, name = color, value = price)

df %>%
  {
    ggplot(df, aes(age, value, label = name, color = name)) +
      geom_smooth(se = FALSE) +
      geom_label_repel(
        data = group_by(., name) %>%
          do(broom::augment(loess(value ~ age, .))) %>%
          filter(age == max(age)),
        aes(age, .fitted)
      )
  }
#> `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
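Here, on a small scale, you can see repeated labels - that's the core of the problem, and we have reproduced it. The cause is that augment() returns one row per observation, so filter(age == max(age)) keeps every row at the maximum age in each group, and each of those rows gets its own label.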
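One way to get rid of the repeated labels is to reduce the label data to a single row per group before it reaches geom_label_repel. The following is only a sketch under some assumptions: the names label_df and p are made up for illustration, the labels sit on a loess fit (as in the code above) while geom_smooth may pick a different method for large data, so label and line end can be slightly apart, and direction = "y" with nudge_x = 2 are just one possible choice of repel settings.

```r
library(dplyr)
library(ggplot2)   # also provides the diamonds data set
library(ggrepel)

df <- diamonds %>% select(age = table, name = color, value = price)

# One label row per group: fit loess within each group, then keep a
# single row at the group's maximum age (with_ties = FALSE drops the
# duplicates that caused the repeated labels).
label_df <- df %>%
  group_by(name) %>%
  group_modify(function(d, key) {
    fit <- loess(value ~ age, data = d)
    broom::augment(fit, data = d)
  }) %>%
  slice_max(age, n = 1, with_ties = FALSE) %>%
  ungroup()

p <- ggplot(df, aes(age, value, color = name)) +
  geom_smooth(se = FALSE) +
  geom_label_repel(
    data = label_df,
    aes(age, .fitted, label = name),
    direction = "y",   # repel labels vertically only
    nudge_x = 2
  ) +
  guides(color = "none")

p
```

With one row per group there are only as many labels as lines, so the max.overlaps warning should no longer be triggered by duplicated points.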
Two answers suggest how to label geom_smooth-derived curves. Here is an adaptation of my own suggestion, using the geomtextpath package. I admit the labels look awkwardly rotated in this case, but I guess with less awkward lines, such as in your example, it should look nicer.
## fix this using one of the suggested solutions in this thread:
## https://stackoverflow.com/questions/29357612/plot-labels-at-ends-of-lines?noredirect=1&lq=1
library(geomtextpath)

ggplot(df, aes(age, value, label = name, color = name)) +
  ## note: you currently have to specify the method argument, otherwise the disambiguation of some function fails
  ## (see also https://github.com/AllanCameron/geomtextpath/issues/79)
  geom_labelsmooth(hjust = 1, method = "loess")
#> `geom_smooth()` using formula 'y ~ x'
Created on 2022-07-12 by the reprex package (v2.0.1)