I have the following data frame:
Xnumber Number
X17339 EWY
X17339 LW2Y
X17401 EWC
X17401 LWY
X17466 EWC
X17466 LWY
X17466 EWY
X17466 LWC
I want to create a new column, Number2, using the following code:
library(dplyr)
df3<-df3 %>% group_by(Xnumber) %>% mutate(Number2=if_else(lead(Number)=="LWC","Unknown",Number))
This is what the resulting data frame should look like:
Xnumber Number Number2
X17339 EWY EWY
X17339 LW2Y LW2Y
X17401 EWC EWC
X17401 LWY LWY
X17466 EWC EWC
X17466 LWY LWY
X17466 EWY Unknown
X17466 LWC LWC
But instead, I also get NAs in my new column, like this:
Xnumber Number Number2
X17339 EWY EWY
X17339 LW2Y NA
X17401 EWC EWC
X17401 LWY NA
X17466 EWC EWC
X17466 LWY LWY
X17466 EWY Unknown
X17466 LWC NA
I'm not sure why this is happening. Any thoughts?
CodePudding user response:
Use the default argument of lead(), so that the last row of each group is compared against "" instead of NA:
library(dplyr)
df3<-df3 %>%
group_by(Xnumber) %>%
mutate(Number2=if_else(lead(Number, default = "") == "LWC","Unknown",Number))
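For reference, here is a minimal reproducible sketch of that fix. It rebuilds the data frame from the question by hand, assuming both columns are character vectors (if Number were a factor, it would likely need converting first). The name result is just for illustration.

```r
library(dplyr)

# Rebuild the question's data frame (column types assumed to be character)
df3 <- data.frame(
  Xnumber = c("X17339", "X17339", "X17401", "X17401",
              "X17466", "X17466", "X17466", "X17466"),
  Number  = c("EWY", "LW2Y", "EWC", "LWY", "EWC", "LWY", "EWY", "LWC")
)

result <- df3 %>%
  group_by(Xnumber) %>%
  # default = "" replaces the NA that lead() would return on each group's
  # last row, so the comparison there is FALSE instead of NA
  mutate(Number2 = if_else(lead(Number, default = "") == "LWC", "Unknown", Number)) %>%
  ungroup()

print(result)
```

With this input, Number2 should match the desired output from the question, with "Unknown" only on the row that precedes LWC within its group.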
CodePudding user response:
Since you grouped your data, lead will return an NA at each group's end (there is no further in-group value ahead). If you want to replace these with, say, the most recent non-NA value, {tidyr}'s fill comes in handy. Example:
data.frame(x = c(1:3, NA, 5)) |>
tidyr::fill(x, .direction = 'down')
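To see where the NA comes from in the first place, the small sketch below walks through the steps on a two-element vector. The names x, shifted, cond and out are made up for illustration; the point is that the NA produced by lead() survives the comparison and is then passed through by if_else().

```r
library(dplyr)

x <- c("EWY", "LWC")

shifted <- lead(x)            # the last element has no successor, so it becomes NA
cond    <- shifted == "LWC"   # comparing NA with a string yields NA, not FALSE
out     <- if_else(cond, "Unknown", x)  # an NA condition gives an NA result

print(shifted)
print(cond)
print(out)
```

The same thing happens once per group after group_by(), which is why every group's last row ended up as NA in the question.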