I have two separate data frames, each representing a feature (activity and sleep) and the number of days on which that feature was recorded for each id number. The number of days needs to appear on the y-axis and the feature itself on the x-axis. I managed to draw the boxplots separately, and the outliers show clearly, especially for one of the sets. However, when I place the two boxplots next to each other, the outliers do not show up clearly. Also, how do I get the names of the two features (activity and sleep) on my x-axis?
The data frame for the "sleep" feature:
head(idday)
# A tibble: 6 x 2
id days
<dbl> <int>
1 1503960366 25
2 1644430081 4
3 1844505072 3
4 1927972279 5
5 2026352035 28
6 2320127002 1
The data frame for the "activity" feature:
head(iddaya)
# A tibble: 6 x 2
id days
<dbl> <int>
1 1503960366 31
2 1624580081 31
3 1644430081 30
4 1844505072 31
5 1927972279 31
6 2022484408 31
My attempt for sleep:
ggplot(idday, aes(y = days), boxwex = 0.05) +
  stat_boxplot(geom = "errorbar",
               width = 0.2) +
  geom_boxplot(alpha = 0.9, outlier.color = "red")
and for activity:
ggplot(iddaya, aes(y = days), boxwex = 0.05) +
  stat_boxplot(geom = "errorbar",
               width = 0.2) +
  geom_boxplot(alpha = 0.9, outlier.color = "red")
I then combined them:
boxplot(summary(idday$days), summary(iddaya$days))
In this final image the outliers do not show clearly, and I want to name my x-axis and y-axis.
CodePudding user response:
If you want to compare these two boxplots with each other, I recommend using the same range for the y-axis. To achieve this, you first have to combine both data frames. You can do this with inner_join() from the dplyr package.
library(dplyr)

data_combined <- inner_join(idday, iddaya,
                            by = "id",
                            suffix = c("_sleep", "_activity"))
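One thing worth checking at this step: inner_join() keeps only the ids that occur in both data frames, and the head() output in the question already shows an id (1624580081) that appears in the activity table but not in the sleep one. The following is a minimal sketch with made-up toy data (the names sleep_toy and activity_toy are hypothetical), assuming full_join() is an acceptable alternative when every id should be kept:

```r
library(dplyr)

# Made-up toy data, not the asker's real tables
sleep_toy    <- data.frame(id = c(1, 2, 3), days = c(25L, 4L, 3L))
activity_toy <- data.frame(id = c(1, 3, 4), days = c(31L, 30L, 31L))

# inner_join() keeps only ids present in both tables (here 1 and 3)
joined_inner <- inner_join(sleep_toy, activity_toy,
                           by = "id",
                           suffix = c("_sleep", "_activity"))

# full_join() keeps every id and fills the missing side with NA
joined_full <- full_join(sleep_toy, activity_toy,
                         by = "id",
                         suffix = c("_sleep", "_activity"))

nrow(joined_inner)
nrow(joined_full)
```

With full_join(), the NA values are dropped by geom_boxplot() with a warning, so each box would still be drawn from the rows that exist for that feature.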
Then you need to transform your data frame into long format with pivot_longer() from the tidyr package:
library(tidyr)

data_combined_long <- data_combined %>%
  pivot_longer(days_sleep:days_activity,
               names_to = "features",
               names_prefix = "days_",
               values_to = "days")
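To see what the reshaping does, here is a small sketch on a made-up wide table (wide_toy is a hypothetical name, shaped like the output of the join step). Each id ends up with one row per feature, and names_prefix strips "days_" so the features column holds just "sleep" and "activity":

```r
library(tidyr)

# Made-up wide table in the shape that the join step produces
wide_toy <- data.frame(id = c(1, 3),
                       days_sleep = c(25L, 3L),
                       days_activity = c(31L, 30L))

# One row per id and feature; "days_" is removed from the column names
long_toy <- pivot_longer(wide_toy,
                         days_sleep:days_activity,
                         names_to = "features",
                         names_prefix = "days_",
                         values_to = "days")

long_toy
```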
After that, you can again use ggplot() to create your boxplot. But now you have to specify that the x-axis should represent your features:
ggplot(data_combined_long, aes(y = days, x = features), boxwex = 0.05) +
  stat_boxplot(geom = "errorbar",
               width = 0.5) +
  geom_boxplot(alpha = 0.9, outlier.color = "red")
Your plot should then look like this:
CodePudding user response:
There are several ways to achieve your task. One way could be: if your data frames are called df_sleep and df_activity, we could combine them in a named list, add a new column feature, and then plot:
df_sleep
df_activity
library(tidyverse)

bind_rows(list(sleep = df_sleep, activity = df_activity), .id = 'feature') %>%
  ggplot(aes(x = feature, y = days, fill = feature)) +
  geom_boxplot()
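The question also asks how to name the axes. One possible way is to add labs() to either plot. The sketch below uses made-up toy data (df_sleep_toy, df_activity_toy and the label texts are hypothetical) and also colours the outliers red, as in the question's own attempts:

```r
library(dplyr)
library(ggplot2)

# Made-up toy data standing in for df_sleep and df_activity
df_sleep_toy    <- data.frame(id = 1:6, days = c(25L, 4L, 3L, 5L, 28L, 1L))
df_activity_toy <- data.frame(id = 1:6, days = c(31L, 31L, 30L, 31L, 31L, 31L))

# Stack both tables; the list names become the values of the "feature" column
combined_toy <- bind_rows(list(sleep = df_sleep_toy, activity = df_activity_toy),
                          .id = "feature")

# Both boxes share one y-axis; labs() sets the axis titles
p <- ggplot(combined_toy, aes(x = feature, y = days)) +
  geom_boxplot(outlier.colour = "red") +
  labs(x = "Feature", y = "Number of days recorded")

p
```

Because bind_rows() stacks the rows rather than matching them by id, ids that occur in only one of the two data frames are kept.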