Home > front end >  Put last month's status label in a new column next to current label
Put last month's status label in a new column next to current label

Time:05-11

I’m attempting to write a script that compares labels between periods for a group of customers by creating a new column, “Last_month” that shows last period’s label. If it’s the first time a customer shows up in the dataset then it should either say NA or “New” in the “Last_month” column. I have tried a variety of things based on replies I received to a related question (Comparing rows to see if a customer has switched products) a couple of years ago but I’m not able to set the label to NA or “New” in the first month.

If I only had to compare the last two periods I’d just use a left join comparing the two periods but here I have a large dataset covering many periods. Is there an easy way to do this? My best attempt plus desired outcome:

library(tidyverse)

# Example data
df <- tibble(Period = c(202201, 202202, 202202, 202202, 202201, 202203,202203),
             CustomerID = c(1, 1, 2, 3, 2, 2, 1),
             Tag = c("A", "A", "B", "C", "D", "C", "D"))

# My best result
df %>% 
  arrange(CustomerID, Period) %>% 
  group_by(CustomerID) %>% 
  mutate(Last_month = lag(Tag, default = Tag[1]))

# Desired outcome 1
df2 <- tibble(Period = c(202201, 202202, 202203, 202201, 202202, 202203,202202),
              Customer = c(1, 1, 1, 2, 2, 2, 3),
              Tag = c("A", "A", "D", "D", "B", "C", "C"),
              Last_month = c("New", "A", "A", "New", "D", "B", "New"))

# Desired outcome 2
df3 <- tibble(Period = c(202201, 202202, 202203, 202201, 202202, 202203,202202),
             Customer = c(1, 1, 1, 2, 2, 2, 3),
             Tag = c("A", "A", "D", "D", "B", "C", "C"),
             Last_month = c(NA, "A", "A", NA, "D", "B", NA))

CodePudding user response:

like this?

library(dplyr)

df %>%
    arrange(CustomerID, Period) %>%
    group_by(CustomerID) %>%
    mutate(last_month = lag(Tag, 1))
  • Related