Hi I would like to create a lag by one whole group on a dataframe in R. So lets say the value for the group A is 11 I would like to make a lag where all values of group B are 11 and so on. Below is an example of what I would like to do.
Letter = c('A', 'A', 'A', 'B', 'C', 'C', 'D')
Value = c(11, 11, 11, 4, 8, 8, 10)
data.frame(Letter, Value)
Letter Value
1 A 11
2 A 11
3 A 11
4 B 4
5 C 8
6 C 8
7 D 10
And then have it become after the lag:
Lag = c(NA, NA, NA, 11, 4, 4, 8)
data.frame(Letter, Value, Lag)
Letter Value Lag
1 A 11 NA
2 A 11 NA
3 A 11 NA
4 B 4 11
5 C 8 4
6 C 8 4
7 D 10 8
(One thing to note is all values of the group will be the same)
CodePudding user response:
Get the unique rows of the data, lag Value and then left join the original data frame with that.
library(dplyr)
DF %>%
left_join(mutate(distinct(.), Lag = lag(Value), Value = NULL), by = "Letter")
giving:
Letter Value Lag
1 A 11 NA
2 A 11 NA
3 A 11 NA
4 B 4 11
5 C 8 4
6 C 8 4
7 D 10 8
CodePudding user response:
You can do the following (see below)
- We first group by LETTER and assign an id to each group member.
- Next we ungroup and assign the lag value if something is the first group member.
- And the final step is to fill the missings.
NOTE: all of this assumes your data set is sorted to your needs so that it would be correct to take the lag value from the last group.
library(tidyverse)
data.frame(Letter, Value) |>
group_by(Letter) |>
mutate(id = 1:n()) |>
ungroup() |>
mutate(Lag = ifelse(id == 1, lag(Value), NA)) |>
fill(Lag) |>
select(-id)
# A tibble: 7 × 3
Letter Value Lag
<chr> <dbl> <dbl>
1 A 11 NA
2 A 11 NA
3 A 11 NA
4 B 4 11
5 C 8 4
6 C 8 4
7 D 10 8