Marking the X previous occurrences in R

Time:09-26

I have a dataset with many different social media creators (creator_id). Each creator posted many times (posting_count), and a post is classified as an ad if ad = 1. For every post with ad = 1, I want to mark the 3 postings before it (by the same creator) with 1. The "goal_variable" below is the result I want to get. A solution without a loop would be great!

creator_id <-c("aaa","aaa","aaa","aaa","aaa","aaa","aaa","aaa","bbb","bbb","bbb","bbb","bbb","bbb","bbb","bbb","bbb") 
posting_count <- c(143,144,145,146,147,148,149,150,90,91,92,93,94,95,96,97,98) 
ad <- c(0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1) 
goal_variable <- c(0,0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,0)
df <- cbind(creator_id, posting_count, ad, goal_variable)

CodePudding user response:

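First, a cleaner way to make df, without the intermediate variables. Building it with data.frame() also keeps the numeric columns numeric, whereas cbind() on mixed vectors returns a character matrix.

Then we can use ifelse with multiple | (or) conditions. Each call has the form lead(x, n, default): it returns the value of x found n rows ahead within the group, and default (here 0) where no such row exists.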

df <- data.frame(creator_id =c("aaa","aaa","aaa","aaa","aaa","aaa","aaa","aaa","bbb","bbb","bbb","bbb","bbb","bbb","bbb","bbb","bbb"), 
                 posting_count = c(143,144,145,146,147,148,149,150,90,91,92,93,94,95,96,97,98),
                 ad = c(0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1),
                 goal_variable = c(0,0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,0))

library(dplyr)
df %>%
  group_by(creator_id) %>%
  # 1 if any of the next three postings by the same creator is an ad
  mutate(new_goal = ifelse(lead(ad, 1, 0) == 1 | lead(ad, 2, 0) == 1 | lead(ad, 3, 0) == 1, 1, 0))
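
Since the title asks about the X previous postings, the chain of lead() calls above could be generalized to any window size. The following is only a sketch under that assumption: mark_before() and its argument n are hypothetical names introduced here, not part of dplyr, and the data frame is rebuilt so the snippet runs on its own.

```r
library(dplyr)

# Same example data as in the question (without goal_variable).
df <- data.frame(
  creator_id = rep(c("aaa", "bbb"), times = c(8, 9)),
  posting_count = c(143:150, 90:98),
  ad = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)
)

# Hypothetical helper: returns 1 where any of the next `n` values of `x` is 1.
mark_before <- function(x, n = 3) {
  # One logical vector per look-ahead distance k = 1, ..., n
  hits <- lapply(seq_len(n), function(k) lead(x, k, default = 0) == 1)
  # Combine them with element-wise OR
  as.integer(Reduce(`|`, hits))
}

result <- df %>%
  group_by(creator_id) %>%
  mutate(new_goal = mark_before(ad, n = 3)) %>%
  ungroup()
```

With n = 3 this should reproduce goal_variable from the question; changing n changes how many previous postings get marked.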

CodePudding user response:

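An option with slider. slide_int() with .after = 2 checks whether the current row or one of the next two rows has ad equal to 1; wrapping it in lead() shifts that result up by one row, so each row ends up looking at the three rows that follow it.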

library(slider)
library(dplyr)
df %>%
  group_by(creator_id) %>%
  mutate(goal_variable2 = lead(slide_int(ad, \(x) 1 %in% x, .after = 2), default = 0)) %>%
  ungroup()

-output

# A tibble: 17 × 5
   creator_id posting_count    ad goal_variable goal_variable2
   <chr>              <dbl> <dbl>         <dbl>          <dbl>
 1 aaa                  143     0             0              0
 2 aaa                  144     0             0              0
 3 aaa                  145     0             0              0
 4 aaa                  146     0             1              1
 5 aaa                  147     0             1              1
 6 aaa                  148     0             1              1
 7 aaa                  149     1             0              0
 8 aaa                  150     0             0              0
 9 bbb                   90     0             0              0
10 bbb                   91     0             0              0
11 bbb                   92     0             0              0
12 bbb                   93     0             1              1
13 bbb                   94     0             1              1
14 bbb                   95     0             1              1
15 bbb                   96     1             1              1
16 bbb                   97     0             1              1
17 bbb                   98     1             0              0

CodePudding user response:

Here's a programmatic way using map_int. For each row, it checks whether the row lies 1 to 3 positions before any row with ad == 1 in the same group.

library(purrr)
library(dplyr)
df %>% 
  group_by(creator_id) %>% 
  mutate(goal_variable = map_int(row_number(), ~ any((.x - which(ad == 1)) %in% -3:-1)))

output

# A tibble: 17 × 4
# Groups:   creator_id [2]
   creator_id posting_count    ad goal_variable
   <chr>              <dbl> <dbl>         <int>
 1 aaa                  143     0             0
 2 aaa                  144     0             0
 3 aaa                  145     0             0
 4 aaa                  146     0             1
 5 aaa                  147     0             1
 6 aaa                  148     0             1
 7 aaa                  149     1             0
 8 aaa                  150     0             0
 9 bbb                   90     0             0
10 bbb                   91     0             0
11 bbb                   92     0             0
12 bbb                   93     0             1
13 bbb                   94     0             1
14 bbb                   95     0             1
15 bbb                   96     1             1
16 bbb                   97     0             1
17 bbb                   98     1             0
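
The same index arithmetic can also be sketched in base R, without extra packages and without an explicit loop. This is a minimal sketch, not the answerer's code: mark_previous() and its argument n are hypothetical names, and ave() is used to apply the helper per creator.

```r
# Same example data as in the question (without goal_variable).
df <- data.frame(
  creator_id = rep(c("aaa", "bbb"), times = c(8, 9)),
  posting_count = c(143:150, 90:98),
  ad = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)
)

# Hypothetical helper: marks the `n` positions before each ad == 1.
mark_previous <- function(ad, n = 3) {
  idx <- which(ad == 1)                  # positions of the ads
  targets <- outer(idx, seq_len(n), "-") # positions 1..n before each ad
  targets <- targets[targets >= 1]       # drop positions before the first row
  out <- integer(length(ad))
  out[targets] <- 1L
  out
}

# ave() applies the helper separately within each creator_id.
df$new_goal <- ave(df$ad, df$creator_id, FUN = mark_previous)
```

Because ave() writes the result back into a copy of df$ad, new_goal comes out as a double column here; wrap it in as.integer() if an integer column is preferred.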