I want to plot the text of the mean only once for each cluster.
But what I want is this:
Code for reproduction:
library(dplyr)
library(ggplot2)

price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

# Add the per-group mean as a column, then plot
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(label = sprintf("%.2f", mean)))
CodePudding user response:
We could use stat_summary(aes(label = ..y..), geom = "text", fun = mean, color = "black", size = 6, fontface = 2):
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # stat_summary() computes the mean per x value and draws it as text
  stat_summary(aes(label = ..y..), geom = "text", fun = mean,
               color = "black", size = 6, fontface = 2)
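As a side note, recent ggplot2 releases (3.4.0 and later) deprecate the `..y..` notation in favour of `after_stat()`. A minimal sketch of the same idea with the newer notation follows; it assumes a small stand-in data frame (not the data from the question) and formats the label to two decimals, which the snippet above does not do.

```r
library(ggplot2)

# Hypothetical stand-in for sub.df, only to keep the example self-contained
sub.df <- data.frame(
  price_l = rep(c("€", "€€-€€€", "€€€€"), each = 4),
  avg_r   = c(3.0, 4.0, 4.0, 4.5,
              3.5, 4.0, 4.5, 5.0,
              2.5, 3.0, 4.0, 4.5)
)

p <- ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # after_stat(y) refers to the value computed by fun (the group mean)
  stat_summary(
    aes(label = after_stat(sprintf("%.2f", y))),
    geom = "text", fun = mean,
    colour = "black", size = 6, fontface = 2
  )

p
```

No group_by() or mutate() step is needed here, because stat_summary() does the aggregation itself.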
CodePudding user response:
You can set the y value manually inside geom_text():
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # Fixed y position; check_overlap hides the labels drawn on top of each other
  geom_text(aes(y = 3.5, label = sprintf("%.2f", mean)),
            check_overlap = TRUE, size = 6, fontface = 2)
Or, as r2evans suggests:
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # Place each label at its group mean instead of a fixed height
  geom_text(aes(y = mean, label = sprintf("%.2f", mean)),
            check_overlap = TRUE, size = 6, fontface = 2)
CodePudding user response:
For what it's worth, here's a way to do this using stat_summary(). This has two advantages over the previous method: (1) there's no need to summarize beforehand via group_by() ... mutate(), and (2) it avoids the overplotting that occurs if you use geom_text().
The answer that uses geom_text() alone works just fine for the result, but note that it results in overplotting. The reason is that geom_text(), like all other geoms, draws "a thing" on the plot for every observation in the dataset. The data frame resulting from the pipe (%>%) commands above the initial ggplot() call has 2500 observations. This means that if you ask geom_text() to create a label at a specific position, it will do so... 2500 times.
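The row count is easy to confirm. The sketch below rebuilds sub.df exactly as in the question and compares the number of rows that reach geom_text() with the number of labels actually wanted; the name `labelled` is introduced here only for illustration.

```r
library(dplyr)

# Data as given in the question
price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

# mutate() adds a column but keeps every row
labelled <- sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup()

nrow(labelled)                # 2500: one text label would be drawn per row
n_distinct(labelled$price_l)  # 3: the number of labels actually wanted
```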
To avoid this, you should do one of two things:
1. Create a separate data frame of aggregated data containing only 3 observations (three pieces of text here) and use geom_text(data = that_new_dataframe...), or
2. Use stat_summary() and have that do all the summarizing for you based on the original dataset, sub.df.
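The first option could look roughly like the sketch below. It assumes a small stand-in for sub.df so that it runs on its own, and `means.df` is a name chosen here for the aggregated data frame; neither comes from the original posts.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for sub.df, only to keep the example self-contained
sub.df <- data.frame(
  price_l = rep(c("€", "€€-€€€", "€€€€"), each = 4),
  avg_r   = c(3.0, 4.0, 4.0, 4.5,
              3.5, 4.0, 4.5, 5.0,
              2.5, 3.0, 4.0, 4.5)
)

# summarise() collapses the data to one row per price level
means.df <- sub.df %>%
  group_by(price_l) %>%
  summarise(mean = mean(avg_r), .groups = "drop")

p <- ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # The text layer gets its own data, so exactly one label per group is drawn
  geom_text(
    data = means.df,
    aes(y = mean, label = sprintf("%.2f", mean)),
    size = 6, fontface = 2
  )

p
```

Because the text layer only ever sees the aggregated rows, check_overlap is not needed with this approach.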
For the stat_summary() method, you can create a user-defined function that returns a label and y value (satisfying the aesthetics required for geom_text()) and then apply that to your dataset within stat_summary() via the fun.data= argument:
# Returns the y position and the formatted label for one group
my_fun <- function(x) {
  data.frame(y = mean(x), label = sprintf("%.2f", mean(x)))
}

sub.df %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  stat_summary(
    geom = "text", fun.data = my_fun, size = 8,
    aes(group = price_l)
  )
Note: after posting this I realize it's similar to @TarJae's answer... but kept it here due to the further explanation.