I have the following data:
week <- c(1,2,3,4,1,2,3,4,1,2,3,4)
product <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C")
price <- c(5,5,6,6,12,12,11,11, 23, 23, 23, 23)
mydf <- cbind.data.frame(week, product, price)
I want to create a new dummy variable that takes a value of 0 for all initial price levels and 1 if the price increased. The new variable would look as follows:
mydf$price_changed <- c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0 ,0)
mydf
CodePudding user response:
We can check with the previous 'price' values (lag
) after grouping by 'product' and get the cumulative sum (cumsum
)
library(dplyr)
mydf %>%
group_by(product) %>%
mutate(price_changed = cumsum(price != lag(price, default = first(price)))) %>%
ungroup
-output
# A tibble: 12 × 4
week product price price_changed
<dbl> <chr> <dbl> <int>
1 1 A 5 0
2 2 A 5 0
3 3 A 6 1
4 4 A 6 1
5 1 B 12 0
6 2 B 12 0
7 3 B 11 1
8 4 B 11 1
9 1 C 23 0
10 2 C 23 0
11 3 C 23 0
12 4 C 23 0
CodePudding user response:
You can use cumany
and first
:
library(dplyr)
mydf %>%
group_by(product) %>%
mutate(price_changed = (cumany(price != first(price))))
# A tibble: 12 × 4
# Groups: product [3]
week product price price_changed
<dbl> <chr> <dbl> <int>
1 1 A 5 0
2 2 A 5 0
3 3 A 6 1
4 4 A 6 1
5 1 B 12 0
6 2 B 12 0
7 3 B 11 1
8 4 B 11 1
9 1 C 23 0
10 2 C 23 0
11 3 C 23 0
12 4 C 23 0
or in base R:
with(mydf, ave(price, product, FUN = \(x) (x != x[1])))
CodePudding user response:
Here is a base R way with ave
and cumsum
.
week <- c(1,2,3,4,1,2,3,4,1,2,3,4)
product <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C")
price <- c(5,5,6,6,12,12,11,11, 23, 23, 23, 23)
mydf <- data.frame(week, product, price)
mydf$price_changed <- with(mydf, ave(price, product, FUN = \(x) cumsum(c(0, x[-length(x)] != x[-1]))))
Created on 2022-05-13 by the reprex package (v2.0.1)
CodePudding user response:
Here is an option with data.table
:
library(data.table)
setDT(mydf)[, price_changed := cumsum(price != shift(price, type = "lag", fill = first(price))), by = product]
Output
week product price price_changed
1: 1 A 5 0
2: 2 A 5 0
3: 3 A 6 1
4: 4 A 6 1
5: 1 B 12 0
6: 2 B 12 0
7: 3 B 11 1
8: 4 B 11 1
9: 1 C 23 0
10: 2 C 23 0
11: 3 C 23 0
12: 4 C 23 0
CodePudding user response:
Option using rleid
from data.table
:
library(dplyr)
library(data.table)
mydf %>%
group_by(product) %>%
mutate(price_changed = rleid(price) - 1)
Output:
# A tibble: 12 × 4
# Groups: product [3]
week product price price_changed
<dbl> <chr> <dbl> <dbl>
1 1 A 5 0
2 2 A 5 0
3 3 A 6 1
4 4 A 6 1
5 1 B 12 0
6 2 B 12 0
7 3 B 11 1
8 4 B 11 1
9 1 C 23 0
10 2 C 23 0
11 3 C 23 0
12 4 C 23 0