I have a dataframe (df1) the contains a column of values. I also have a matrix (ma1) that contains two columns which refer to row indices in df1 (i.e., row 1, col 1 of ma1 is 4 and col 2 is 6 - these refer to row 4 and 6 of df1)
I need to be able to use the row indices in ma1 to sum the values of rows in df1 for each window. So i need the output to be sum of row 4-6, row 9 to 12, row 15 to 19 etc. - which correspond with the indices values in ma1.
I have read about rowSums but unsure how to full implement this in this example
Example: ma1 row 1 is 4 and 6, so I need output to be sum of rows 4,5,6 of df1. i.e., 3.37 0 1.19. Row 2 of ma1 is 9 and 12, so I need output to be sum of rows 9,10,11,12 of df1 etc etc
df1 <- structure(list(CO2 = c(3.37, 0, 1.19, 0.889999999999986, 5.88999999999999,
0.169999999999959, 3.92000000000002, 1.46000000000004, 1.23000000000002,
2.60000000000002, 1.39999999999998, 0, 4.35999999999996, 0.649999999999977,
0.149999999999977, 2.08999999999997, 4.23999999999995, 5.69,
0, 3.38, 1.95000000000005, 3.16999999999996, 2.82999999999998,
0, 1.69, 1.36000000000001, 0.669999999999959, 0.54000000000002,
0.529999999999973, 0.95999999999998, 0.600000000000023, 0.850000000000023,
0, 0.00999999999999091, 1.77999999999997, 1.98000000000002, 1.63,
2.74000000000001, 2.56, 3.50999999999999, 0, 0, 3.37, 0, 0.630000000000052,
0, 0.270000000000039, 0.769999999999982, 0.75, 1.25999999999999,
0, 0.689999999999998, 1.12, 0.210000000000036, 2.66000000000003,
3.14000000000004, 2.24000000000001, 0.620000000000005, 0.0900000000000318,
0)), row.names = c(NA, -60L), class = c("tbl_df", "tbl", "data.frame"
))
ma1 <- structure(c(4, 6, 9, 12, 15, 19, 33, 37, 41, 54, 6, 9, 12, 15,
19, 24, 37, 41, 44, 60), .Dim = c(10L, 2L), .Dimnames = list(
NULL, c("co2_start", "co2_end")))
CodePudding user response:
You can use apply
to go over the rows of ma1
, each time, using between(row_number(),....)
to filter, and wrap in colSums()
. Here, x
, represents (each) row-vector of ma1
cbind(
ma1,
apply(ma1,1, function(x) colSums(filter(df1,between(row_number(),x[1],x[2]))))
)
Output:
co2_start co2_end
[1,] 4 6 6.95
[2,] 6 9 6.78
[3,] 9 12 5.23
[4,] 12 15 5.16
[5,] 15 19 12.17
[6,] 19 24 11.33
[7,] 33 37 5.40
[8,] 37 41 10.44
[9,] 41 44 3.37
[10,] 54 60 8.96
This data.table
approach will be faster
library(data.table)
ma1 = data.table(ma1)
ma1[, colSums(df1[co2_start:co2_end,]), by=1:nrow(ma1)]
as would this base R approach.. It is the filter/between from tidyverse that slows things down
cbind(
ma1,
apply(ma1,1, function(x) colSums(df1[x[1]:x[2],]))
)
CodePudding user response:
This approach returns only the sum of the indexes you actually want rather than every sequential pairwise index.
ma1_sum = list()
ma1_indexes = list()
df1_values_summed = list()
for(i in 1:length(ma1)){
j = i
if(i < length(ma1)){
if(i %% 2 == 0){i = i 1}
ma1_indexes[[j]] = toString(rep(ma1[i]:ma1[i 1]))
df1_values_summed[[j]] = toString(df1$CO2[rep(ma1[i]:ma1[i 1])])
ma1_sum[[j]] = sum(df1$CO2[rep(ma1[i]:ma1[i 1])])
} else return(NULL)
}
data.frame(result = unlist(ma1_sum[seq(1, length(ma1_sum), 2)]),
indexes = unlist(ma1_indexes[seq(1, length(ma1_sum), 2)]),
values_summed = unlist(df1_values_summed[seq(1, length(df1_values_summed), 2)])
)
Output:
result indexes values_summed
<dbl> <chr> <chr>
1 6.95 4, 5, 6 0.889999999999986, 5.88999999999999, 0.169999999999959
2 5.23 9, 10, 11, 12 1.23000000000002, 2.60000000000002, 1.39999999999998, 0
3 12.2 15, 16, 17, 18, 19 0.149999999999977, 2.08999999999997, 4.23999999999995, 5.69, 0
4 5.40 33, 34, 35, 36, 37 0, 0.00999999999999091, 1.77999999999997, 1.98000000000002, 1.63
5 9.07 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54 0, 0, 3.37, 0, 0.630000000000052, 0, 0.270000000000039, 0.769999999999982, 0.75, 1.259999999999…
6 6.78 6, 7, 8, 9 0.169999999999959, 3.92000000000002, 1.46000000000004, 1.23000000000002
7 5.16 12, 13, 14, 15 0, 4.35999999999996, 0.649999999999977, 0.149999999999977
8 11.3 19, 20, 21, 22, 23, 24 0, 3.38, 1.95000000000005, 3.16999999999996, 2.82999999999998, 0
9 10.4 37, 38, 39, 40, 41 1.63, 2.74000000000001, 2.56, 3.50999999999999, 0
10 14.5 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60 0, 0.630000000000052, 0, 0.270000000000039, 0.769999999999982, 0.75, 1.25999999999999, 0, 0.689…