Home > Net >  How to calculate the sum of periods over each column for each row in R
How to calculate the sum of periods over each column for each row in R

Time:05-09

I would like to calculate the sum of each flower in each year in R. Below is an example of how the table looks (Table 1) and what I want the outcome to be (Table 2). I know how to do the code calculation in a long table format but I am not sure how to do it in a wide table format. Note: I am using package: dplyr

(Table 1)

flower 1902 1950 2010 2012 2021
lily 23 0 0 8 5
rose 50 60 5 16 0
daisy 30 7 10 2 0

I need to calculate the sum for each flower in each year. The end result should give me:

(Table 2)

flower 1902 1950 2010 2012 2021
lily 23 23 23 31 36
rose 50 110 115 131 131
daisy 30 37 47 49 49

CodePudding user response:

One option involving dplyr and purrr might be:

dat %>%
    mutate(pmap_dfr(across(-1), ~ cumsum(c(...))))

  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response:

Using rowCumsums from matrixStats

library(matrixStats)
df1[-1] <- rowCumsums(as.matrix(df1[-1]))

-output

df1
  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response:

Here is one way of getting your expected result:

Your data frame :

dat <- structure(list(flower = c("lily", "rose", "daisy"), X1902 = c(23L, 
50L, 30L), X1950 = c(0L, 60L, 7L), X2010 = c(0L, 5L, 10L), X2012 = c(8L, 
16L, 2L), X2021 = c(5L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-3L))

Apply a function that calculate the cumulative sums and apply to each row of the data at column 2 to 6:

dat[1:nrow(dat), 2:6] <- t(apply(dat[1:nrow(dat), 2:6], 1, function(x) cumsum(c(x))))

# The result
dat
  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

@benson23 has kindly suggested the following simpler code to get the same result:

dat[, 2:6] <- t(apply(dat[,2:6], 1, cumsum))

 flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

CodePudding user response:

You can use apply with cumsum, plus a little bit of re-formatting.

setNames(as.data.frame(cbind(df[, 1], t(apply(df[, -1], 1, cumsum)))), colnames(df))

  flower X1902 X1950 X2010 X2012 X2021
1   lily    23    23    23    31    36
2   rose    50   110   115   131   131
3  daisy    30    37    47    49    49

Data

df <- structure(list(flower = c("lily", "rose", "daisy"), X1902 = c(23L, 
50L, 30L), X1950 = c(0L, 60L, 7L), X2010 = c(0L, 5L, 10L), X2012 = c(8L, 
16L, 2L), X2021 = c(5L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

Here is an alternative using pivoting:

library(dplyr)
library(tidyr)

dat %>% 
  pivot_longer(-flower) %>% 
  group_by(flower) %>% 
  mutate(value = cumsum(value)) %>% 
  pivot_wider() %>% 
  ungroup()
  flower X1902 X1950 X2010 X2012 X2021
  <chr>  <int> <int> <int> <int> <int>
1 lily      23    23    23    31    36
2 rose      50   110   115   131   131
3 daisy     30    37    47    49    49
  •  Tags:  
  • r
  • Related