How to calculate the sum of periods over each column for each row in R-CodePudding

I would like to calculate the sum of each flower in each year in R. Below is an example of how the table looks (Table 1) and what I want the outcome to be (Table 2). I know how to do the code calculation in a long table format but I am not sure how to do it in a wide table format. Note: I am using package: dplyr

(Table 1)

flower	1902	1950	2010	2012	2021
lily	23	0	0	8	5
rose	50	60	5	16	0
daisy	30	7	10	2	0

I need to calculate the sum for each flower in each year. The end result should give me:

(Table 2)

flower	1902	1950	2010	2012	2021
lily	23	23	23	31	36
rose	50	110	115	131	131
daisy	30	37	47	49	49

CodePudding user response：

One option involving dplyr and purrr might be:

dat %>%
    mutate(pmap_dfr(across(-1), ~ cumsum(c(...))))

  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response：

Using rowCumsums from matrixStats

library(matrixStats)
df1[-1] <- rowCumsums(as.matrix(df1[-1]))

-output

df1
  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response：

Here is one way of getting your expected result:

Your data frame :

dat <- structure(list(flower = c("lily", "rose", "daisy"), X1902 = c(23L, 
50L, 30L), X1950 = c(0L, 60L, 7L), X2010 = c(0L, 5L, 10L), X2012 = c(8L, 
16L, 2L), X2021 = c(5L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-3L))

Apply a function that calculate the cumulative sums and apply to each row of the data at column 2 to 6:

dat[1:nrow(dat), 2:6] <- t(apply(dat[1:nrow(dat), 2:6], 1, function(x) cumsum(c(x))))

# The result
dat
  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

@benson23 has kindly suggested the following simpler code to get the same result:

dat[, 2:6] <- t(apply(dat[,2:6], 1, cumsum))

 flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response：

You can use apply with cumsum, plus a little bit of re-formatting.

setNames(as.data.frame(cbind(df[, 1], t(apply(df[, -1], 1, cumsum)))), colnames(df))

  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

Data

df <- structure(list(flower = c("lily", "rose", "daisy"), X1902 = c(23L, 
50L, 30L), X1950 = c(0L, 60L, 7L), X2010 = c(0L, 5L, 10L), X2012 = c(8L, 
16L, 2L), X2021 = c(5L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response：

Here is an alternative using pivoting:

library(dplyr)
library(tidyr)

dat %>% 
  pivot_longer(-flower) %>% 
  group_by(flower) %>% 
  mutate(value = cumsum(value)) %>% 
  pivot_wider() %>% 
  ungroup()

  flower X1902 X1950 X2010 X2012 X2021
  <chr>  <int> <int> <int> <int> <int>
1 lily      23    23    23    31    36
2 rose      50   110   115   131   131
3 daisy     30    37    47    49    49