I am working on some data visualisations of satellite populations. For each month and altitude, I would like to display how many satellites were launched into each orbit.
I have a list (snippet below, which is fake data). I am trying to create a bubble plot with the launch date (grouped by month) on the X axis, the altitude (grouped by 100 km) on the Y axis, and the size of each bubble given by the count (shown in the second table).
COSPAR_ID | LAUNCH_DATE | ALTITUDE |
---|---|---|
IRIDIUM 180 | 2019-01-01 | 1150 |
IRIDIUM 176 | 2019-02-01 | 1250 |
RISESAT | 2019-04-06 | 1150 |
RAPIS-1 | 2019-03-01 | 1375 |
MICRODRAGON | 2019-05-01 | 400 |
NEXUS (FO-99) | 2019-04-01 | 459 |
ALE-1 | 2019-05-01 | 1000 |
IRIDIUM 167 | 2019-04-01 | 900 |
IRIDIUM GSAT-31 | 2019-05-01 | 666 |
IRIDIUM 188 | 2019-06-01 | 1000 |
IRIDIUM 111 | 2019-02-01 | 1250 |
IRIDIUM 123 | 2019-01-01 | 1150 |
LAUNCH_DATE | ALTITUDE | COUNT |
---|---|---|
Jan-19 | 0-500 | 10 |
Jan-19 | 500-1000 | 100 |
Jan-19 | 1000-1500 | 150 |
Feb-19 | 0-500 | 20 |
Feb-19 | 500-1000 | 90 |
Feb-19 | 1000-1500 | 150 |
So far, I am getting quite lost. I am using dplyr to be able to summarise first by the month, and then starting to count altitudes.
df <- df %>%
group_by(month = lubridate::floor_date(LAUNCH_DATE, 'month')) %>%
summarize(sum = sum(count),
sumA = n(ALTITUDE < 100))
I think my next step would be to group the altitudes first and then summarise by date, but I am hitting a brick wall and am not sure where to go next. Can anyone point me in the right direction?
Happy to add the original dataset of satellites, just quite a large file.
CodePudding user response:
You can use cut() to make your bins, defining your breaks and labels beforehand:
library(dplyr)
library(lubridate)
alti_breaks <- seq(0, by = 500, length.out = ceiling(max(df$ALTITUDE) / 500) + 1)
alti_labs <- paste(head(alti_breaks, -1), tail(alti_breaks, -1), sep = "-")
df <- df %>%
  count(
    LAUNCH_DATE = floor_date(LAUNCH_DATE, 'month'),
    ALTITUDE = cut(ALTITUDE, alti_breaks, alti_labs),
    name = "COUNT"
  )
df
# A tibble: 9 × 3
LAUNCH_DATE ALTITUDE COUNT
<date> <fct> <int>
1 2019-01-01 1000-1500 2
2 2019-02-01 1000-1500 2
3 2019-03-01 1000-1500 1
4 2019-04-01 0-500 1
5 2019-04-01 500-1000 1
6 2019-04-01 1000-1500 1
7 2019-05-01 0-500 1
8 2019-05-01 500-1000 2
9 2019-06-01 500-1000 1
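One thing worth knowing about the output above: cut() closes its intervals on the right by default, which is why the satellites at exactly 1000 km land in the 500-1000 bin rather than 1000-1500. Here is a small base R sketch of that behaviour; the altitude values are made up for illustration and are not from the real dataset.

```r
# Toy altitudes (km), illustrative only
alt <- c(400, 459, 500, 666, 1000, 1375)

breaks <- seq(0, 1500, by = 500)
labs <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# Default (right = TRUE): bins are (lo, hi], so 500 and 1000 go to the lower bin
right_closed <- cut(alt, breaks, labs)

# right = FALSE: bins are [lo, hi), so 500 and 1000 move up to the next bin
left_closed <- cut(alt, breaks, labs, right = FALSE)

data.frame(alt, right_closed, left_closed)
```

If you want boundary values to go in the upper bin, passing right = FALSE should do it. Note that a value sitting exactly on the outermost open edge (0 with the default, 1500 with right = FALSE) becomes NA unless you also set include.lowest = TRUE.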
And the bubble plot:
library(ggplot2)
ggplot(df, aes(LAUNCH_DATE, ALTITUDE)) +
  geom_point(aes(size = COUNT), color = blues9[[6]], show.legend = FALSE) +
  theme_minimal() +
  theme(panel.grid.minor.x = element_blank())
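If you want the axis labels in the Jan-19 style from your second table, and bubbles whose area (rather than radius) tracks the count, something along these lines might work. This is only a sketch: the counts data frame is a few rows hand-copied from the summarised output above so the example runs on its own, and the colour, max_size and axis titles are arbitrary choices of mine.

```r
library(ggplot2)

# A few rows copied from the summarised output, so the example is self-contained
counts <- data.frame(
  LAUNCH_DATE = as.Date(c("2019-01-01", "2019-02-01", "2019-05-01", "2019-05-01")),
  ALTITUDE = factor(
    c("1000-1500", "1000-1500", "0-500", "500-1000"),
    levels = c("0-500", "500-1000", "1000-1500")
  ),
  COUNT = c(2L, 2L, 1L, 2L)
)

p <- ggplot(counts, aes(LAUNCH_DATE, ALTITUDE)) +
  geom_point(aes(size = COUNT), color = "steelblue") +
  # Make bubble area proportional to COUNT (a count of 0 would map to size 0)
  scale_size_area(max_size = 12) +
  # One tick per month, labelled like "Jan-19"
  scale_x_date(date_breaks = "1 month", date_labels = "%b-%y") +
  labs(x = "Launch month", y = "Altitude (km)", size = "Satellites") +
  theme_minimal()

p
```

Unlike the plot above, this keeps the size legend so readers can translate bubble size back into a count. The month abbreviations produced by %b depend on your locale.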