I am adding marginal distribution plots to a scatterplot of a continuous variable against an integer variable. However, the marginal density plot for the integer variable (y-axis) shows a zig-zag pattern because the y-values are all integers. Is there any way to increase the "width" (not sure that's the right term) of the bins/values over which the function calculates the density?
The goal is to get rid of that zig-zag pattern.
library(GlmSimulatoR)
library(ggplot2)
library(patchwork)
### Create a right-skewed dataset that has one continuous variable and one integer variable
set.seed(123)
df1 <- data.frame(
  int  = round(rgamma(1000, shape = 1, scale = 1), 0),  # integer-valued
  cont = round(rgamma(1000, shape = 1, scale = 1), 1)   # rounded to one decimal
)
### Main scatterplot
p1 <- ggplot(data = df1, aes(x = cont, y = int)) +
  geom_point(shape = 21, size = 2, color = "black", fill = "black", stroke = 1, alpha = 0.4) +
  xlab("Continuous Value") +
  ylab("Integer Value") +
  theme_bw() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 16),
        axis.text.x = element_text(size = 16, color = "black"),
        axis.text.y = element_text(size = 16, color = "black"))
### Marginal density of the continuous variable (top)
dens1 <- ggplot(df1, aes(x = cont)) +
  geom_density(alpha = 0.4) +
  theme_void() +
  theme(legend.position = "none")
### Marginal density of the integer variable (right, flipped)
dens2 <- ggplot(df1, aes(x = int)) +
  geom_density(alpha = 0.4) +
  theme_void() +
  theme(legend.position = "none") +
  coord_flip()
### Combine with patchwork
dens1 + plot_spacer() + p1 + dens2 +
  plot_layout(ncol = 2, nrow = 2, widths = c(6,1), heights = c(1,6))
CodePudding user response:
From ?geom_density:
adjust: A multiplicate [sic] bandwidth adjustment. This makes it possible to adjust the bandwidth while still using the a bandwidth estimator. For example, ‘adjust = 1/2’ means use half of the default bandwidth.
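The zig-zag appears because the default bandwidth is much narrower than the spacing between the integers, so the estimate shows a separate bump at each integer value. So as a start, try e.g. geom_density(..., adjust = 2) (bandwidth twice as wide as the default) and go from there.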
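As a rough sketch of how this could look for the integer marginal in the question: the block below rebuilds the same simulated data, applies adjust to dens2 only, and uses base R's density() to count the bumps for a given adjust. The names adjust_value, default_bw and count_modes are hypothetical helpers introduced for illustration, and the value 3 is only an assumed starting point to tune by eye. It also assumes geom_density() is left on its default "nrd0" bandwidth rule, which is the same default density() uses.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(123)
df1 <- data.frame(
  int  = round(rgamma(1000, shape = 1, scale = 1), 0),
  cont = round(rgamma(1000, shape = 1, scale = 1), 1)
)

# Bandwidth chosen by the default "nrd0" rule, for comparison with the
# integer spacing of 1 (hypothetical helper variable)
default_bw <- bw.nrd0(df1$int)

# Assumed starting value; increase or decrease until the curve looks right
adjust_value <- 3

# Integer marginal with a wider bandwidth
dens2 <- ggplot(df1, aes(x = int)) +
  geom_density(alpha = 0.4, adjust = adjust_value) +
  theme_void() +
  theme(legend.position = "none") +
  coord_flip()

# Hypothetical helper: count local maxima of the base-R density estimate,
# i.e. places where the slope changes from positive to negative
count_modes <- function(x, adjust) {
  y <- density(x, adjust = adjust)$y
  sum(diff(sign(diff(y))) == -2)
}

# Compare how bumpy the estimate is at the default and at a wider bandwidth
count_modes(df1$int, adjust = 1)
count_modes(df1$int, adjust = adjust_value)
```

If the curve still looks bumpy, raise adjust_value; if it flattens the skew too much, lower it. The modified dens2 drops into the same patchwork layout as before.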