I have a large dataframe where the first two columns are longitude and latitude values (making each row represent a point of a specific longitude and latitude), and the succeeding columns are daily data for a long period of time (couple of years). As such, the dataframe is very large.
It looks like this: large dataframe
Is there a way to transform this dataframe to a monthly one?
- First I want to get the total of each month per year,
- Then I want to get the average of the totals of each month, eventually getting a dataframe with 14 columns: lon, lat, january, february, ..., december.
Is this possible?
I was able to do this when my data only spanned one year using the following code:
mymonthlydata$`1`=apply(mydailydata[3:33],1,sum)
where I did this for each month (columns 3 to 33 are the days of january for example), but now that I have multiple years, I am not quite sure what to do. Any help would be appreciated, even just pointing me to the right direction. Thank you.
Some sample data that mimics OP's screenshot that can be used
dt <- data.table::data.table(lon = (0:9) abs(rnorm(10)), lat = -89.5)[, as.character(seq.Date(as.Date("2015-01-01"), as.Date("2021-12-31"), by = "1 days")) := abs(rnorm(10)/10^8)]
str(dt)
# Classes ‘data.table’ and 'data.frame': 10 obs. of 2559 variables:
# $ lon : num 0.702 1.77 3.55 4.294 4.712 ...
# $ lat : num -89.5 -89.5 -89.5 -89.5 -89.5 -89.5 -89.5 -89.5 -89.5 -89.5
# $ 2015-01-01: num 0.00000000485 0.00000000474 0.0000000101 0.00000001026 0.00000000282 ...
# $ 2015-01-02: num 0.00000000485 0.00000000474 0.0000000101 0.00000001026 0.00000000282 ...
# $ 2015-01-03: num 0.00000000485 0.00000000474 0.0000000101 0.00000001026 0.00000000282 ...
CodePudding user response:
sample data that mimics OP's screenshot
dt <- data.table(lon = (0:9) abs(rnorm(10)), lat = -89.5)[, as.character(seq.Date(as.Date("2015-01-01"), as.Date("2021-12-31"), by = "1 days")) := abs(rnorm(10)/10^8)]
data.table solution
library(data.table)
library(lubridate)
# setDT(dt) needed only if you have a data.frame (my sample is already a data.table)
dt_long <- melt(dt, id.vars = c("lon", "lat"), variable.name = "date")
dt_long[, `:=` (year = year(date), month = month(date, label = T, abbr = F)), by = date]
dt_long_sum <- dt_long[, .(sum = sum(value)), by = .(lon, lat, year, month)]
dt_long_mean <- dt_long_sum[, .(mean = mean(sum)), by = .(lon, lat, month)]
dcast(dt_long_mean, lon lat ~ month, value.var = "mean")
results
# lon lat January February March April May June July August September October
# 1: 0.7016 -89.5 0.000000150444 0.000000137272 0.000000150444 0.000000145591 0.000000150444 0.000000145591 0.000000150444 0.000000150444 0.000000145591 0.000000150444
# 2: 1.7702 -89.5 0.000000147013 0.000000134141 0.000000147013 0.000000142270 0.000000147013 0.000000142270 0.000000147013 0.000000147013 0.000000142270 0.000000147013
# 3: 3.5504 -89.5 0.000000313048 0.000000285639 0.000000313048 0.000000302950 0.000000313048 0.000000302950 0.000000313048 0.000000313048 0.000000302950 0.000000313048
# 4: 4.2941 -89.5 0.000000318180 0.000000290321 0.000000318180 0.000000307916 0.000000318180 0.000000307916 0.000000318180 0.000000318180 0.000000307916 0.000000318180
# 5: 4.7121 -89.5 0.000000087354 0.000000079705 0.000000087354 0.000000084536 0.000000087354 0.000000084536 0.000000087354 0.000000087354 0.000000084536 0.000000087354
# 6: 5.6789 -89.5 0.000000202693 0.000000184945 0.000000202693 0.000000196154 0.000000202693 0.000000196154 0.000000202693 0.000000202693 0.000000196154 0.000000202693
# 7: 6.4409 -89.5 0.000000168332 0.000000153593 0.000000168332 0.000000162902 0.000000168332 0.000000162902 0.000000168332 0.000000168332 0.000000162902 0.000000168332
# 8: 8.3095 -89.5 0.000000037445 0.000000034166 0.000000037445 0.000000036237 0.000000037445 0.000000036237 0.000000037445 0.000000037445 0.000000036237 0.000000037445
# 9: 8.8061 -89.5 0.000000051242 0.000000046756 0.000000051242 0.000000049589 0.000000051242 0.000000049589 0.000000051242 0.000000051242 0.000000049589 0.000000051242
# 10: 9.2667 -89.5 0.000000338410 0.000000308779 0.000000338410 0.000000327493 0.000000338410 0.000000327493 0.000000338410 0.000000338410 0.000000327493 0.000000338410
# November December
# 1: 0.000000145591 0.000000150444
# 2: 0.000000142270 0.000000147013
# 3: 0.000000302950 0.000000313048
# 4: 0.000000307916 0.000000318180
# 5: 0.000000084536 0.000000087354
# 6: 0.000000196154 0.000000202693
# 7: 0.000000162902 0.000000168332
# 8: 0.000000036237 0.000000037445
# 9: 0.000000049589 0.000000051242
# 10: 0.000000327493 0.000000338410