I created a scatter plot using the ggplot2 package. Since my data has a large number of points, I will illustrate my problem with a small, built-in dataset. Consider this scatter plot:
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
I want to use k-means clustering to cluster these data points and then show the clusters on the same scatter plot (the one shown above), not on a new dimensionality-reduction plot. How can I do this?
CodePudding user response:
One option would be to use ggforce::geom_mark_ellipse to draw ellipses around your clusters:
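library(ggplot2)
library(ggforce)
set.seed(123)  # k-means uses random starting centers; fix the seed for reproducible clusters
# Cluster on all (scaled) mtcars variables, not just wt and mpg
km.mtcars <- kmeans(scale(mtcars), centers = 3)
mtcars2 <- mtcars
mtcars2$cluster <- km.mtcars$cluster
ggplot(mtcars2, aes(x = wt, y = mpg)) +
  geom_point() +
  ggforce::geom_mark_ellipse(aes(fill = factor(cluster)))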
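If you would rather not add another package, a minimal sketch with ggplot2 alone is to colour the points by cluster and mark the cluster centers. Note the assumptions here: unlike the code above, this clusters only on wt and mpg (so the groups match what you see on the plot), and the names xy, plot_df and centers_df are just illustrative. Because the clustering is done on scaled data, the centers are converted back to the original units before plotting.

```r
library(ggplot2)

set.seed(42)  # k-means starts from random centers; fix the seed for reproducibility

# Assumption: cluster only on the two plotted variables
xy <- mtcars[, c("wt", "mpg")]
xy_scaled <- scale(xy)
km <- kmeans(xy_scaled, centers = 3, nstart = 25)

plot_df <- transform(xy, cluster = factor(km$cluster))

# Convert the centers from scaled units back to the original wt/mpg units
centers <- sweep(km$centers, 2, attr(xy_scaled, "scaled:scale"), "*")
centers <- sweep(centers, 2, attr(xy_scaled, "scaled:center"), "+")
centers_df <- data.frame(centers, cluster = factor(seq_len(nrow(centers))))

p <- ggplot(plot_df, aes(x = wt, y = mpg, colour = cluster)) +
  geom_point() +
  geom_point(data = centers_df, shape = 4, size = 4, stroke = 1.5)  # crosses mark the centers
p
```

The axes stay in the original units of wt and mpg, so this is the same scatter plot as in the question with cluster membership added as colour.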
CodePudding user response:
Here is an alternative using the factoextra package:
library(dplyr)       # provides %>% and select()
library(factoextra)
df <- mtcars %>%
  select(x = wt, y = mpg)
# Compute k-means with k = 3
set.seed(123)
res.km <- kmeans(scale(df), 3, nstart = 25)
res.km$cluster
# With only two variables, fviz_cluster plots them directly instead of running PCA
fviz_cluster(res.km, data = df,
             palette = c("steelblue", "gold", "limegreen"),
             geom = "point",
             ellipse.type = "convex",
             ggtheme = theme_bw()
)
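One caveat: as far as I know, fviz_cluster standardizes the data by default (its stand argument), so the axes may not be in the original units. If you want the convex hulls on the raw wt/mpg axes, a rough sketch is to compute the hulls yourself with base R's chull() and draw them with geom_polygon(). The names km and hulls are only illustrative, and this is not what factoextra does internally.

```r
library(ggplot2)

set.seed(123)
df <- mtcars[, c("wt", "mpg")]
km <- kmeans(scale(df), centers = 3, nstart = 25)
df$cluster <- factor(km$cluster)

# For each cluster, keep only the rows that lie on its convex hull
hulls <- do.call(rbind, lapply(split(df, df$cluster), function(d) {
  d[chull(d$wt, d$mpg), ]
}))

p <- ggplot(df, aes(x = wt, y = mpg, colour = cluster, fill = cluster)) +
  geom_polygon(data = hulls, alpha = 0.2) +  # shaded hull per cluster
  geom_point()
p
```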