I'm having trouble using the row number as index. For example I want a new column that will give me the sales taking into account the next 4 days. I want to create column name:sale_next 4
The issue with my code is that I don't know how to make the index of the row_number()
variable, since what I'm doing is fetching the actual value of the column.
#heres to create the data
df <- read.table(text = "day price price_change sales High_sales_ind
1 5 0 12 1
2 5 0 6 0
3 5 0 5 0
4 5 0 4 0
5 5 0 10 1
6 5 0 10 1
7 5 0 10 1
8 5 0 12 1
9 5 0 14 1
10 7 2 3 0
11 7 0 2 0", header = TRUE)
#my code
df<- df %>% mutate(sales_next4 = sales[row_number():sales_rownumber() 4)
What I need:
day | price | price_change | sales | High_sales_ind | sales_next4 |
---|---|---|---|---|---|
1 | 5 | 0 | 12 | 1 | 27 |
2 | 5 | 0 | 6 | 0 | 25 |
3 | 5 | 0 | 5 | 0 | 29 |
4 | 5 | 0 | 4 | 0 | 34 |
5 | 5 | 0 | 10 | 1 | 42 |
6 | 5 | 0 | 10 | 1 | 46 |
7 | 5 | 0 | 10 | 1 | 39 |
8 | 5 | 0 | 12 | 1 | 31 |
9 | 5 | 0 | 14 | 1 | 19 |
10 | 7 | 2 | 3 | 0 | 5 |
11 | 7 | 0 | 2 | 0 | 2 |
Any help would be appreciated.
CodePudding user response:
You can use rollapply
from the zoo
package for cases like this, assuming that the days are consecutive as in the example data provided.
You'll need to use the partial =
and align =
arguments to fill the column correctly, see ?rollapply
for the details.
library(dplyr)
library(zoo)
df <- df %>%
mutate(sales_next4 = rollapply(sales, 4, sum, partial = TRUE, align = "left"))
Result:
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2
CodePudding user response:
You can use map()
from purrr
to do rolling sum depending on the day
column.
library(dplyr)
library(purrr)
df %>%
mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x 3)])))
# day price price_change sales High_sales_ind sales_next4
# 1 1 5 0 12 1 27
# 2 2 5 0 6 0 25
# 3 3 5 0 5 0 29
# 4 4 5 0 4 0 34
# 5 5 5 0 10 1 42
# 6 6 5 0 10 1 46
# 7 7 5 0 10 1 39
# 8 8 5 0 12 1 31
# 9 9 5 0 14 1 19
# 10 10 7 2 3 0 5
# 11 11 7 0 2 0 2
CodePudding user response:
Using slider
library(dplyr)
library(slider)
df %>%
mutate(sales_next4 = slide_dbl(day, ~ sum(sales[.x]), .after = 3))
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2
CodePudding user response:
You can use Reduce()
and data.table::shift()
mutate(df, sales_next4 = Reduce(` `, data.table::shift(c(sales,0,0,0),0:-3))[1:nrow(df)])
Output:
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2