I have two dataframes that I am using to plot geom_area and geom_line. The Topic categories are common to both dataframes; only their numerical values differ.
Below are my sample dataframes:
#df_one, for geom_area()
Timestamp Topic Value_A
01/01/2019 News 10
02/01/2019 Sports 11
03/01/2019 Entertainment 12
...
01/01/2020 Weather 5
02/01/2020 News 6
03/01/2020 Business 7
...
01/01/2021 Sports 8
02/01/2021 Business 4
03/01/2021 News 9
...
29/12/2021 Entertainment 12
30/12/2021 News 13
31/12/2021 Sports 14
And this is the second one:
#df_two, for line plot
Timestamp Topic Value_B
01/01/2019 Weather 1.0
02/01/2019 Business 1.1
03/01/2019 News 1.2
...
01/01/2020 Entertainment 5.0
02/01/2020 Sports 6.5
03/01/2020 Business 7.3
...
01/01/2021 Sports 8.8
02/01/2021 Business 4.2
03/01/2021 Sports 9.2
...
29/12/2021 Business 1.2
30/12/2021 News 1.3
31/12/2021 Weather 1.4
I am doing the following steps:
#convert date column into proper format (the sample dates are day/month/year)
df_one$Timestamp <- as.Date(df_one$Timestamp, format = "%d/%m/%Y")
#sort according to dates
df_one <- df_one[order(df_one$Timestamp), ]
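library(dplyr)       # %>%, mutate, group_by, summarise
library(purrr)       # map
library(ggplot2)
library(gridExtra)   # grid.arrange
library(randomcoloR)
n <- 15
my_cols_one <- distinctColorPalette(n)
names(my_cols_one) <- unique(df_one$Topic) #I will use this for both since Topics are common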
list_one <-
  df_one %>%
  ## create year variable by which you split into a list
  mutate(year = lubridate::year(Timestamp)) %>%
  split(.$year) %>%
  ## pass this list to a loop function to create three separate plots
  map(~ggplot(data = .x, aes(x = Timestamp, y = Value_A, fill = Topic)) +
        scale_x_date(date_breaks = '1 month', date_labels = "%b-%y") +
        geom_area(alpha = 0.6, size = 1, colour = "black", position = position_fill()) +
        theme(legend.position = "bottom", legend.box = "horizontal") +
        ggtitle("Reliable") +
        guides(fill = guide_legend(nrow = 2, label.position = "bottom")) +
        scale_fill_manual(NULL, values = my_cols_one, limits = unique(.x$Topic))
  )
#now for df_two
#convert date column into proper format (the sample dates are day/month/year)
df_two$Timestamp <- as.Date(df_two$Timestamp, format = "%d/%m/%Y")
#sort according to dates
df_two <- df_two[order(df_two$Timestamp), ]
#average Value_B per topic over 15-day bins
df_two <- df_two %>%
  group_by(Timestamp = lubridate::floor_date(Timestamp, "15 days"), Topic) %>%
  dplyr::summarise(Average_Value = mean(Value_B))
list_two <-
  df_two %>%
  ## create year variable by which you split into a list
  mutate(year = lubridate::year(Timestamp)) %>%
  split(.$year) %>%
  ## pass this list to a loop function to create three separate plots
  map(~ggplot(data = .x, aes(x = Timestamp, y = Average_Value, color = Topic)) +
        scale_x_date(date_breaks = '1 month', date_labels = "%b-%y") +
        geom_line() +
        theme(legend.position = "bottom", legend.box = "horizontal", plot.background = element_blank()) +
        ggtitle("Title") +
        guides(fill = guide_legend(nrow = 2, label.position = "bottom")) +
        ## you will need to set the limits to the unique values in each plot
        ## I am also removing the guide title because of the visual crowding
        scale_fill_manual(NULL, values = my_cols_one, limits = unique(.x$Topic)) +
        labs(title = '',
             x = 'Date',
             y = 'Average Value',
             color = ""))
Now, finally, to plot these together:
do.call("grid.arrange", c(list_one, list_two, ncol=2, nrow=2))
The idea is to show the two kinds of plot for each year together, with every Topic drawn in the same color in both. In my output, however, the colors of the line plots do not match the colors of the area plots.
Any help please?
CodePudding user response:
I found the solution:
I was using scale_fill_manual for df_two, whereas it should have been scale_color_manual, since I was using geom_line. The lines are drawn with the color aesthetic, so a fill scale has no effect on them.
So I changed
scale_fill_manual(NULL, values = my_cols_one, limits = unique(.x$Topic))
to
scale_color_manual(NULL, values = my_cols_one, limits = unique(.x$Topic))
and it's working as expected.
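For anyone who wants to check this in isolation, here is a minimal sketch of the same idea. The toy data, the three hard-coded hex colors, and the names pal, area_df, line_df, p_area and p_line are made up for illustration and are not from my real data. One named palette is passed to the fill scale for geom_area and to the colour scale for geom_line, so each Topic should get the same color in both plots.

```r
library(ggplot2)

# Hypothetical toy data and palette (assumptions for illustration only)
topics <- c("News", "Sports", "Weather")
pal <- setNames(c("#1b9e77", "#d95f02", "#7570b3"), topics)
dates <- rep(as.Date("2021-01-01") + 0:3, times = 3)

area_df <- data.frame(
  Timestamp = dates,
  Topic     = rep(topics, each = 4),
  Value_A   = c(10, 11, 12, 9, 5, 6, 7, 8, 8, 4, 9, 6)
)
line_df <- data.frame(
  Timestamp = dates,
  Topic     = rep(topics, each = 4),
  Value_B   = c(1.0, 1.1, 1.2, 1.3, 5.0, 6.5, 7.3, 6.1, 8.8, 4.2, 9.2, 7.0)
)

# Area plot: Topic is mapped to fill, so the palette goes into a fill scale
p_area <- ggplot(area_df, aes(x = Timestamp, y = Value_A, fill = Topic)) +
  geom_area(alpha = 0.6, colour = "black", position = position_fill()) +
  scale_fill_manual(name = NULL, values = pal, limits = topics)

# Line plot: Topic is mapped to colour, so the same palette goes into a colour scale
p_line <- ggplot(line_df, aes(x = Timestamp, y = Value_B, colour = Topic)) +
  geom_line() +
  scale_colour_manual(name = NULL, values = pal, limits = topics)
```

Printing p_area and p_line (or passing them to grid.arrange) should show matching colors per Topic. Because the palette is a named vector, the match is by Topic name rather than by the order in which topics appear in each dataframe.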