I want to make a scatterplot of PM2.5 against life expectancy. Within it, I want five subcategories based on the GDP data (five differently coloured sets of points and fitted lines, from high to low GDP). How would I modify my current code to do this (or something similar)? Code and data are below; any help is much appreciated.
plot = ggplot(dat6, aes(x=log(PM2.5), y= log(Lifeex))) +
  geom_point(colour = 'blue') +
  stat_smooth(method = "lm", col = "red") +
  xlab("Life Expectancy") +
  ylab("Concentration of PM2.5") +
  ggtitle("Relationship between Life expectancy and PM2.5")
dat6
Country Life_Expectancy GDP PM2.5
1 Afghanistan 60.38333 1788.3152 53.933333
2 Albania 77.03333 10642.3801 20.408333
3 Algeria 75.16667 13674.2199 31.521667
4 Angola 51.96667 6770.9149 37.346667
5 Antigua and Barbuda 75.98333 20893.5925 20.415000
6 Argentina 75.93333 19838.7166 11.893333
7 Armenia 74.26667 7728.3425 33.143333
8 Australia 82.36667 43862.4894 7.338333
9 Austria 84.00000 46586.1927 14.303333
10 Azerbaijan 72.00000 16804.9607 20.308333
CodePudding user response:
Here is an example of what the question asks for. cut() is used to create a new column, GDP_Level, based on a vector of break points, brks. The levels are assigned names ranging from "Very Low" to "Very High".
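To see how cut() assigns values to levels, here is a minimal sketch using three of the GDP values from the question. The vector name gdp and the helper name labs are just for illustration. Note that the intervals are closed on the right by default, so a value sitting exactly on a break point goes into the lower bin.

```r
# Break points and level names, the same ones used in this answer
brks <- c(0, 5000, 10000, 20000, 40000, Inf)
labs <- c("Very Low", "Low", "Medium", "High", "Very High")

# Three GDP values taken from the question's data (Afghanistan, Albania, Australia)
gdp <- c(1788.3152, 10642.3801, 43862.4894)

# cut() returns a factor with one level per interval (0,5000], (5000,10000], ...
lvl <- cut(gdp, breaks = brks, labels = labs)
print(data.frame(gdp, lvl))

# A value exactly on a break point falls into the lower interval (right = TRUE is the default)
edge <- cut(5000, breaks = brks, labels = labs)
print(edge)
```

If you would rather have the intervals closed on the left, cut() accepts right = FALSE.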
As for the plot, I have removed the log transformations from the aes() call and included them as transformations in scale_x_continuous and scale_y_continuous instead.
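ggplot2 applies scale transformations before it computes statistics, so a straight line drawn by stat_smooth(method = "lm") on log scales corresponds to a linear model on the logged variables. If you want to inspect that model yourself, a rough sketch is below. The data frame d is a hand-typed copy of the two relevant columns from the question, and the fit uses all rows together rather than one fit per GDP level.

```r
# Hand-typed copy of the PM2.5 and life expectancy columns from the question
d <- data.frame(
  PM2.5 = c(53.933333, 20.408333, 31.521667, 37.346667, 20.415000,
            11.893333, 33.143333, 7.338333, 14.303333, 20.308333),
  Life_Expectancy = c(60.38333, 77.03333, 75.16667, 51.96667, 75.98333,
                      75.93333, 74.26667, 82.36667, 84.00000, 72.00000)
)

# Linear model on the log-log scale, i.e. the kind of fit drawn on log-transformed axes
fit <- lm(log(Life_Expectancy) ~ log(PM2.5), data = d)
print(coef(fit))
```

The advantage of putting the transformation in the scales rather than in aes() is that the axis labels stay in the original units.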
dat6 <- read.table(text = "
Country Life_Expectancy GDP PM2.5
1 Afghanistan 60.38333 1788.3152 53.933333
2 Albania 77.03333 10642.3801 20.408333
3 Algeria 75.16667 13674.2199 31.521667
4 Angola 51.96667 6770.9149 37.346667
5 'Antigua and Barbuda' 75.98333 20893.5925 20.415000
6 Argentina 75.93333 19838.7166 11.893333
7 Armenia 74.26667 7728.3425 33.143333
8 Australia 82.36667 43862.4894 7.338333
9 Austria 84.00000 46586.1927 14.303333
10 Azerbaijan 72.00000 16804.9607 20.308333
", header = TRUE)
library(ggplot2)
brks <- c(0, 5000, 10000, 20000, 40000, Inf)
dat6$GDP_Level <- cut(dat6$GDP, breaks = brks, labels = c("Very Low", "Low", "Medium", "High", "Very High"))
ggplot(dat6, aes(x = PM2.5, y = Life_Expectancy, color = GDP_Level)) +
  # no fixed colour here, so the points take their colour from GDP_Level
  geom_point() +
  # one regression line per GDP level, coloured to match the points
  stat_smooth(formula = y ~ x, method = "lm") +
  xlab("Concentration of PM2.5") +
  ylab("Life Expectancy") +
  scale_x_continuous(trans = "log") +
  scale_y_continuous(trans = "log") +
  ggtitle("Relationship between Life expectancy and PM2.5")
Created on 2022-02-21 by the reprex package (v2.0.1)
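The break points used here are arbitrary round numbers. If you would prefer five groups of roughly equal size, one alternative (my suggestion, not something the question requires) is to take the break points from the quintiles of GDP. A minimal sketch, with the GDP column typed in by hand as the vector gdp:

```r
# GDP values from the question's data, in the original row order
gdp <- c(1788.3152, 10642.3801, 13674.2199, 6770.9149, 20893.5925,
         19838.7166, 7728.3425, 43862.4894, 46586.1927, 16804.9607)

# Quintile break points: minimum, 20%, 40%, 60%, 80%, maximum
q <- quantile(gdp, probs = seq(0, 1, by = 0.2))

# include.lowest = TRUE keeps the minimum value inside the first interval
GDP_Level <- cut(gdp, breaks = q, include.lowest = TRUE,
                 labels = c("Very Low", "Low", "Medium", "High", "Very High"))

# Number of countries in each level
print(table(GDP_Level))
```

With the full data set you would use dat6$GDP in place of gdp and assign the result to dat6$GDP_Level. Be aware that cut() stops with an error if two quantiles coincide, since the break points must be unique.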