I want to draw a line plot of the df below by grouping the rows, so that I have one line for GDP, one line for agriculture and one line for services (ignoring countries for now). Does anyone know if this is possible using ggplot?
My final plot would have years on the x axis and the value (e.g. GDP) on the y axis.
economics_df
Series Name Country 1997 1998 1999 2000
GDP (current US$) Spain 5.90077E+11 6.19215E+11 6.34908E+11 5.98363E+11
GDP (current US$) France 1.45288E+12 1.50311E+12 1.49315E+12 1.36564E+12
GDP (current US$) Monaco 2840175545 2934498443 2906093757 2647885849
GDP (current US$) Italy 1.24188E+12 1.27005E+12 1.25245E+12 1.14668E+12
GDP (current US$) Croatia 24091170703 25792876644 23677307509 21839780971
Agriculture (% of GDP) Spain 4.302210034 4.150411966 3.817378211 3.745305634
Agriculture (% of GDP) France 2.344255815 2.362459834 2.236261411 2.098357551
Agriculture (% of GDP) Monaco 2.544255815 2.342459834 2.234261411 2.108357551
Agriculture (% of GDP) Italy 2.861911574 2.768857277 2.722232363 2.56361412
Agriculture (% of GDP) Croatia 5.228986538 5.306173593 5.393085168 4.961600952
Services (% of GDP) Syria 45.65197856 44.15290647 45.68986146 41.94697681
Services (% of GDP) Lebanon 60.61030928 58.32727829 59.05884148 61.52190623
Services (% of GDP) Israel 62.02333939 63.02788655 63.92563162 64.72521236
Services (% of GDP) Egypt 48.15193682 48.28789144 47.55581925 46.52599236
Services (% of GDP) Libya 44.15193682 44.28789144 45.55581925 45.55581445
CodePudding user response:
You need to get the data into the right shape. ggplot2 makes plotting very easy once the data is in long form, which is easy to do with dplyr and tidyr:
library(dplyr)
library(ggplot2)
library(tidyr)
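# Reshape to long form (one row per series, country and year),
# then aggregate across countries within each series and year
econ_for_plot <- economics_df |>
  pivot_longer(-c(`Series Name`, Country), names_to = "year") |>
  group_by(`Series Name`, year) |>
  summarise(value = sum(value))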
econ_for_plot
# # A tibble: 12 x 3
# # Groups: Series Name [3]
# `Series Name` year value
# <chr> <chr> <dbl>
# 1 Agriculture 1997 1.73e 1
# 2 Agriculture 1998 1.69e 1
# 3 Agriculture 1999 1.64e 1
# 4 Agriculture 2000 1.55e 1
# 5 GDP 1997 3.31e12
# 6 GDP 1998 3.42e12
# 7 GDP 1999 3.41e12
# 8 GDP 2000 3.14e12
# 9 Services 1997 2.61e 2
# 10 Services 1998 2.58e 2
# 11 Services 1999 2.62e 2
# 12 Services 2000 2.60e 2
I have used sum() in the summarise() call, but you could replace it with mean() or any other function to aggregate the data. Once it is in this form you can plot it:
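As a rough sketch of swapping the aggregation function, the snippet below uses mean() instead of sum() on a small subset of the question's data. The names toy_df and toy_mean are made up for this illustration, and converting year to an integer is an optional extra step that is not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Toy subset of economics_df (hypothetical name toy_df): two series,
# two countries and two years, with values copied from the question
toy_df <- data.frame(
  `Series Name` = c("GDP", "GDP", "Agriculture", "Agriculture"),
  Country = c("Spain", "France", "Spain", "France"),
  `1997` = c(5.90077e+11, 1.45288e+12, 4.302210034, 2.344255815),
  `1998` = c(6.19215e+11, 1.50311e+12, 4.150411966, 2.362459834),
  check.names = FALSE  # keep the column names exactly as written
)

toy_mean <- toy_df |>
  pivot_longer(-c(`Series Name`, Country), names_to = "year") |>
  mutate(year = as.integer(year)) |>                 # optional: numeric x axis
  group_by(`Series Name`, year) |>
  summarise(value = mean(value), .groups = "drop")   # mean() instead of sum()

toy_mean
```

For the percentage series an average across countries is probably easier to interpret than a sum, but which aggregate is appropriate depends on what the plot is meant to show.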
ggplot(econ_for_plot,
  aes(
    x = year,
    y = value,
    color = `Series Name`,
    group = `Series Name`
  )
) +
  geom_point() +
  geom_line() +
  scale_y_log10() +
  labs(
    title = "Sum of spending",
    y = "Sum of category (log scale)"
  ) +
  theme_bw()
Input data
economics_df <- structure(list(`Series Name` = c(
"GDP", "GDP", "GDP", "GDP",
"GDP", "Agriculture", "Agriculture", "Agriculture", "Agriculture",
"Agriculture", "Services", "Services", "Services", "Services",
"Services"
), Country = c(
"Spain", "France", "Monaco", "Italy",
"Croatia", "Spain", "France", "Monaco", "Italy", "Croatia", "Syria",
"Lebanon", "Israel", "Egypt", "Libya"
), `1997` = c(
5.90077e+11,
1.45288e+12, 2840175545, 1.24188e+12, 24091170703, 4.302210034,
2.344255815, 2.544255815, 2.861911574, 5.228986538, 45.65197856,
60.61030928, 62.02333939, 48.15193682, 44.15193682
), `1998` = c(
6.19215e+11,
1.50311e+12, 2934498443, 1.27005e+12, 25792876644, 4.150411966,
2.362459834, 2.342459834, 2.768857277, 5.306173593, 44.15290647,
58.32727829, 63.02788655, 48.28789144, 44.28789144
), `1999` = c(
6.34908e+11,
1.49315e+12, 2906093757, 1.25245e+12, 23677307509, 3.817378211,
2.236261411, 2.234261411, 2.722232363, 5.393085168, 45.68986146,
59.05884148, 63.92563162, 47.55581925, 45.55581925
), `2000` = c(
5.98363e+11,
1.36564e+12, 2647885849, 1.14668e+12, 21839780971, 3.745305634,
2.098357551, 2.108357551, 2.56361412, 4.961600952, 41.94697681,
61.52190623, 64.72521236, 46.52599236, 45.55581445
)), class = "data.frame", row.names = c(
NA,
-15L
))
Edit: I made the y-axis log-scale because the range of values was large. Having now read the comments and looked at the data more closely, I realise that this plots absolute dollars and relative percentages on the same scale. So this post shows how to construct such a plot, although it does not really make sense to do so in this case.
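One possible way around the mixed-units problem, offered only as a sketch, is to give each series its own panel with an independent y axis via facet_wrap(scales = "free_y"). The data frame toy_long and its column series are hypothetical stand-ins for econ_for_plot, filled with values rounded from the summarised output above.

```r
library(ggplot2)

# Toy long-format data in the shape of econ_for_plot (hypothetical names;
# values rounded from the summarised output shown earlier)
toy_long <- data.frame(
  series = rep(c("Agriculture", "GDP", "Services"), each = 4),
  year = rep(1997:2000, times = 3),
  value = c(
    17.3, 16.9, 16.4, 15.5,              # Agriculture (% of GDP, summed)
    3.31e12, 3.42e12, 3.41e12, 3.14e12,  # GDP (current US$, summed)
    261, 258, 262, 260                   # Services (% of GDP, summed)
  )
)

# One panel per series, each with its own y scale, so dollars and
# percentages are never drawn against the same axis
p <- ggplot(toy_long, aes(x = year, y = value)) +
  geom_point() +
  geom_line() +
  facet_wrap(~series, ncol = 1, scales = "free_y") +
  labs(x = "Year", y = "Sum across countries") +
  theme_bw()

p
```

With separate panels the log scale is no longer needed, because each series is drawn against its own range.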