Using row index number to calculate values

Time:05-09

I'm having trouble using the row number as an index. For example, I want a new column, sales_next4, that gives the total sales over a 4-day window: the current day plus the following 3 days.

The issue with my code is that I don't know how to use row_number() as an index into the column; what I'm doing instead fetches the actual value of the column.

# here's the code to create the data
df <- read.table(text = "day    price   price_change    sales   High_sales_ind
1 5 0   12  1
2   5   0   6   0
3   5   0   5   0
4   5   0   4   0
5   5   0   10  1
6   5   0   10  1
7   5   0   10  1
8   5   0   12  1
9   5   0   14  1
10  7   2   3   0
11  7   0   2   0", header = TRUE)


# my code (does not work)

df <- df %>% mutate(sales_next4 = sales[row_number():sales_rownumber() + 4)

What I need:

day price price_change sales High_sales_ind sales_next4
1 5 0 12 1 27
2 5 0 6 0 25
3 5 0 5 0 29
4 5 0 4 0 34
5 5 0 10 1 42
6 5 0 10 1 46
7 5 0 10 1 39
8 5 0 12 1 31
9 5 0 14 1 19
10 7 2 3 0 5
11 7 0 2 0 2

Any help would be appreciated.

CodePudding user response:

You can use rollapply from the zoo package for cases like this, assuming that the days are consecutive as in the example data provided.

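You'll need the partial = and align = arguments to fill the column correctly: align = "left" starts each window at the current row, and partial = TRUE lets the windows at the end of the data shrink instead of returning NA. See ?rollapply for the details.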

library(dplyr)
library(zoo)

df <- df %>% 
  mutate(sales_next4 = rollapply(sales, 4, sum, partial = TRUE, align = "left"))

Result:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2
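
For anyone wondering what partial = TRUE and align = "left" are doing here, a rough base R equivalent is sketched below. It assumes the rows are already sorted by day with no gaps; the names window and n are only illustrative and are not part of zoo.

```r
# Sales column from the question's example data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

n <- length(sales)
window <- 4  # current row plus the next 3 rows

# For each row i, sum rows i..(i + window - 1), clipped at the last row.
# The clipping plays the role of partial = TRUE; starting the window at i
# plays the role of align = "left".
sales_next4 <- vapply(seq_len(n), function(i) {
  sum(sales[i:min(i + window - 1, n)])
}, numeric(1))

print(sales_next4)
```

The last three values come from shortened windows (3, 2 and 1 rows), which is why they are smaller than their neighbours.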

CodePudding user response:

You can use map_dbl() from purrr to compute a rolling sum based on the values in the day column.

library(dplyr)
library(purrr)

df %>%
  mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x + 3)])))

#    day price price_change sales High_sales_ind sales_next4
# 1    1     5            0    12              1          27
# 2    2     5            0     6              0          25
# 3    3     5            0     5              0          29
# 4    4     5            0     4              0          34
# 5    5     5            0    10              1          42
# 6    6     5            0    10              1          46
# 7    7     5            0    10              1          39
# 8    8     5            0    12              1          31
# 9    9     5            0    14              1          19
# 10  10     7            2     3              0           5
# 11  11     7            0     2              0           2
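
Because this approach filters on the values in day rather than on row position, it should also cope with missing days. Here is a small sketch with made-up data; the gaps data frame is hypothetical, with days 3 and 4 absent.

```r
library(dplyr)
library(purrr)

# Hypothetical data: days 3 and 4 are missing
gaps <- data.frame(day = c(1, 2, 5, 6), sales = c(12, 6, 10, 10))

# For each day d, sum the sales of every row whose day lies in [d, d + 3]
gaps <- gaps %>%
  mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x + 3)])))

print(gaps)
```

Day 1 only picks up days 1 and 2 (12 + 6), whereas a purely row-based window would also have pulled in days 5 and 6.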

CodePudding user response:

Using slide_dbl() from the slider package:

library(dplyr)
library(slider)
df %>% 
  mutate(sales_next4 = slide_dbl(day, ~ sum(sales[.x]), .after = 3))
   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2
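
Two variations may be worth a look, offered as a sketch. Passing sales directly to slide_dbl() avoids indexing through day, and slide_index_dbl() defines the window by the values of day instead of by row position, which matters if days can be missing. The gaps data frame is made up for illustration.

```r
library(slider)

# Sales column from the question's example data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

# Row-based window: current element plus the 3 after it
by_row <- slide_dbl(sales, sum, .after = 3)

# Hypothetical data with days 3 and 4 missing
gaps <- data.frame(day = c(1, 2, 5, 6), sales = c(12, 6, 10, 10))

# Index-based window: all rows whose day falls in [day, day + 3]
by_day <- slide_index_dbl(gaps$sales, gaps$day, sum, .after = 3)

print(by_row)
print(by_day)
```

Note that slide_index_dbl() expects the index (day here) to be sorted in ascending order.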

CodePudding user response:

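You can use Reduce() and data.table::shift(). A negative n in shift() produces a lead, so 0:-3 yields the column itself plus its next three leads; the padding zeros keep the trailing windows from becoming NA.

library(dplyr)

mutate(df, sales_next4 = Reduce(`+`, data.table::shift(c(sales, 0, 0, 0), 0:-3))[1:nrow(df)])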

Output:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2
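
A possible simplification, assuming a data.table version that accepts negative n in shift() (which the answer above already relies on): the fill argument can supply the zeros, so the manual padding and the [1:nrow(df)] subsetting should not be needed.

```r
library(data.table)

# Sales column from the question's example data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

# shift() with n = 0:-3 returns a list of four vectors: the original and
# its leads by 1, 2 and 3 positions. fill = 0 pads the end of each lead
# with zeros instead of NA.
leads <- shift(sales, 0:-3, fill = 0)

# Add the four vectors element-wise to get the 4-row forward sum
sales_next4 <- Reduce(`+`, leads)

print(sales_next4)
```

Inside mutate() the same expression could be used directly on the sales column.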