I am trying to use a for loop with a nested ifelse statement to generate an indicator variable in a dataframe. I'm fairly new to using for-loops however. Other questions I've found seem to be more complex than my dataset, so the answers haven't been ideal for my situation.
Essentially, I have survey recipients and names of their bosses, and I need to identify which recipients are also listed as bosses.
I have a vector of boss names that I know are also survey recipients. For example (names have been changed):
bossrecip<-c("Tamira Hughes", "John Legend", "Robert Collins")
The recipient's full name, which I cleaned to match the format of the boss names, is in the column "RecipientFullName" of my SurveyData.
RecipientFullName<-c("Gosha Jennings", "Robert Stew", "John Legend")
both_recip_boss<-0
SurveyData<-data.frame(RecipientFullName, both_recip_boss)
"both_recip_boss" is where I would like to put a 1 if the recipient is also a boss, and keep a 0 if they are just a recipient.
The for-loop I have tried that I think comes closest is:
for (b in bossrecip) {
ifelse(b==SurveyData$RecipientFullName | SurveyData$both_recip_boss==1,
SurveyData$both_recip_boss<-1,
SurveyData$both_recip_boss<-0)
}
I included the OR statement because I don't want the following names in b to overwrite the previous loop work. However, this just gives me one row with a 1, when I know there should be at least 91 ones in my full dataset. I'm sure I'm messing up something with the logic of for-loops, but I'm uncertain what it is.
I'd be very grateful for any advice and insight into what I am doing incorrectly. Thank you!
CodePudding user response:
No need for a loop. Using %in% you could do:
SurveyData$both_recip_boss <- as.integer(SurveyData$RecipientFullName %in% bossrecip)
SurveyData
#> RecipientFullName both_recip_boss
#> 1 Gosha Jennings 0
#> 2 Robert Stew 0
#> 3 John Legend 1
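For anyone unsure what the one-liner is doing, here is a minimal sketch that breaks it into steps, using the example data from the question. The intermediate names is_boss and n_flagged are made up for illustration, and as.integer() is just one way to get 0/1 instead of TRUE/FALSE.

```r
# Example data copied from the question
bossrecip <- c("Tamira Hughes", "John Legend", "Robert Collins")
SurveyData <- data.frame(
  RecipientFullName = c("Gosha Jennings", "Robert Stew", "John Legend")
)

# %in% checks every recipient against the whole boss vector at once and
# returns one TRUE/FALSE per row (is_boss is a hypothetical helper name)
is_boss <- SurveyData$RecipientFullName %in% bossrecip

# as.integer() converts TRUE/FALSE to 1/0
SurveyData$both_recip_boss <- as.integer(is_boss)

# Count the flagged rows (n_flagged is a hypothetical helper name)
n_flagged <- sum(SurveyData$both_recip_boss)
print(SurveyData)
print(n_flagged)
```

On the full dataset, the sum() at the end is a quick way to check whether the number of flagged recipients matches what you expect.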
CodePudding user response:
I rearranged the loop logic to show an approach.
However, R is a vectorized language, and many of its benefits in computation speed and code readability come from vectorizing code or using the built-in loop-replacement functions (such as the apply family).
bossrecip<-c("Tamira Hughes", "John Legend", "Robert Collins")
SurveyData<-data.frame(RecipientFullName=c("Gosha Jennings", "Robert Stew", "John Legend"),
both_boss_recip=0)
for (i in seq_len(nrow(SurveyData))) {
  # Flag row i with 1 if this recipient is also listed as a boss
  SurveyData$both_boss_recip[i] <- ifelse(SurveyData$RecipientFullName[i] %in% bossrecip,
                                          1,
                                          0)
}
SurveyData
  RecipientFullName both_boss_recip
1    Gosha Jennings               0
2       Robert Stew               0
3       John Legend               1
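As a rough sketch of the loop-replacement idea mentioned above, the same column can be filled with vapply() or with a single vectorized ifelse() call. This assumes the same example data; the name vectorized is made up for illustration, and the as.character() call is only a precaution in case the column was read in as a factor.

```r
bossrecip <- c("Tamira Hughes", "John Legend", "Robert Collins")
SurveyData <- data.frame(
  RecipientFullName = c("Gosha Jennings", "Robert Stew", "John Legend"),
  both_boss_recip = 0
)

# apply-family version: vapply() calls the function once per name and
# guarantees a numeric result of length 1 for each element
SurveyData$both_boss_recip <- vapply(
  as.character(SurveyData$RecipientFullName),
  function(name) if (name %in% bossrecip) 1 else 0,
  numeric(1),
  USE.NAMES = FALSE
)

# Fully vectorized version: ifelse() receives the whole logical vector
# and returns a vector of the same length, assigned once
vectorized <- ifelse(SurveyData$RecipientFullName %in% bossrecip, 1, 0)

print(SurveyData)
print(identical(SurveyData$both_boss_recip, vectorized))
```

Both versions assign the result of the call to the column once, rather than assigning inside the arguments of ifelse() as the loop in the question does.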