Suppose we have the following df:
df = structure(list(fruit = c("melon", "mango", "orange", "blueberry"
), pct = c(5, 4, 3, 2)), class = "data.frame", row.names = c(NA,
-4L))
That looks like this:
fruit pct
1 melon 5
2 mango 4
3 orange 3
4 blueberry 2
I want to start with a given number, say 30, and go down the column pct, computing differences in this way:
fruit pct desired_output
1 melon 5 30
2 mango 4 30-5=25
3 orange 3 25-4=21
4 blueberry 2 21-3=18
Note that each difference subtracts the previous element of pct from the previous result, and there is a starting point, which in this case I defined as 30.
I have tried functions like diff and cumsum, but I'm not getting the desired output.
CodePudding user response:
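Happy New Year! Here is a slight variation using dplyr:
library(dplyr)
starting_point <- 30
df_new <- df %>%
  mutate(interim = lag(cumsum(pct))) %>%  # running total of pct, shifted down one row
  mutate(desired_output = starting_point - interim)
df_new$desired_output[1] <- starting_point  # lag() leaves NA in the first row, so set it explicitly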
CodePudding user response:
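Please find below a slightly simpler solution that uses only base R.
- Code
x <- 30
# Shift the running total of pct down one row (0 for the first row), then subtract it from x.
# Note: base R's stats::lag() does not shift a plain vector, so head() is used instead.
df$desired_output <- x - c(0, head(cumsum(df$pct), -1))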
- Output
df
#> fruit pct desired_output
#>1 melon 5 30
#>2 mango 4 25
#>3 orange 3 21
#>4 blueberry 2 18
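As a cross-check, the recurrence described in the question (start at 30, then subtract the previous pct value from the previous result) can also be written out directly. This is only a sketch in base R; the names start, pct and running are illustrative and not taken from the answers above.

```r
start <- 30
pct <- c(5, 4, 3, 2)

# Fold over all but the last pct value, keeping every intermediate result.
# The last pct value is dropped because no row follows it.
running <- Reduce(function(acc, p) acc - p, head(pct, -1),
                  accumulate = TRUE, init = start)

print(running)
```

Because each step only depends on the previous result, this gives the same values as subtracting the shifted cumulative sum, which is usually the more compact choice.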
CodePudding user response:
Using data.table:
Data:
library(data.table)
df = data.table(fruit = c("melon", "mango", "orange", "blueberry"),
                pct = c(5, 4, 3, 2))
Code:
starting_point = 30
# Prepend 0 and drop the last pct value so each row subtracts the total of the rows above it
df[, cmsum := starting_point - cumsum(c(0, pct[-length(pct)]))]
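A possible alternative within data.table is to shift the cumulative sum with shift() instead of building the offset vector by hand. The sketch below assumes the data.table package is installed; the table name dt and the column name desired_output are chosen here for illustration.

```r
library(data.table)

dt <- data.table(fruit = c("melon", "mango", "orange", "blueberry"),
                 pct = c(5, 4, 3, 2))
starting_point <- 30

# shift() lags the running total by one row; fill = 0 means nothing
# has been subtracted yet in the first row.
dt[, desired_output := starting_point - shift(cumsum(pct), fill = 0)]

print(dt)
```

Both forms compute the same column; shift() just makes the one-row lag explicit.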