Replace values of rows with missing values by values of another row

Time:06-15

I’m trying to do this with a conditional but can’t find an easy way. I have a dataset with missing values in column A. I want to create a new column C that takes the original value from A for every row where A is not missing and, for rows where A is missing, takes the value from another column (column B). All columns are character variables.

A       B
13 A 1  15 A 2
15 A 2  15 A 2
NA      15 A 8
10 B 3  15 A 2
NA      15 A 5

What I want is:

A       B       C
13 A 1  15 A 2  13 A 1
15 A 2  15 A 2  15 A 2
NA      15 A 8  15 A 8
10 B 3  15 A 2  10 B 3
NA      15 A 5  15 A 5

I tried a loop, but the result is not what I want:

for(i in 1:length(df$A)) {
  if(is.na(df$A[i])) {
    df$C <- df$B 
  }
  else {
    df$C<- df$A
  }
}

If anyone can help, thanks in advance.

CodePudding user response:

In general, if you find yourself looping over the rows of a data frame, there is probably a more efficient solution: either use a vectorised function, as Jonathan does in his answer, or use dplyr as follows.

We can check whether A is NA: if so, we set c equal to B; otherwise we keep the value of A.

library(dplyr)
df %>% mutate(c = if_else(is.na(A), B, A))
       A      B      c
1 13 A 1 15 A 2 13 A 1
2 15 A 2 15 A 2 15 A 2
3   <NA> 15 A 8 15 A 8
4 10 B 3 15 A 2 10 B 3
5   <NA> 15 A 5 15 A 5

CodePudding user response:

df$C <- ifelse(is.na(df$A), df$B, df$A)
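To make the one-liner above easier to try, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the table in the question, and `C_loop` is a hypothetical name used only for comparison. It also shows what a working version of the loop would look like: the loop in the question assigns the whole column (`df$C <- df$B` or `df$C <- df$A`) on every pass, so only the test on the last row decides the final result.

```r
# Rebuild the example data from the question (character columns, NA for missing).
df <- data.frame(
  A = c("13 A 1", "15 A 2", NA, "10 B 3", NA),
  B = c("15 A 2", "15 A 2", "15 A 8", "15 A 2", "15 A 5"),
  stringsAsFactors = FALSE
)

# Vectorised: test every row at once and take B wherever A is missing.
df$C <- ifelse(is.na(df$A), df$B, df$A)

# Loop equivalent for comparison: index the single element [i] on both sides
# instead of assigning the whole column on each iteration.
C_loop <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  C_loop[i] <- if (is.na(df$A[i])) df$B[i] else df$A[i]
}

# Both approaches should give the same character vector.
identical(df$C, C_loop)
```

This relies on the columns being character, as stated in the question. If they were factors, `ifelse()` would return the underlying integer codes, so converting with `as.character()` first would probably be needed.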

CodePudding user response:

We could use coalesce:

library(dplyr)

df %>% 
  mutate(c = coalesce(A, B))
       A      B      c
1 13 A 1 15 A 2 13 A 1
2 15 A 2 15 A 2 15 A 2
3   <NA> 15 A 8 15 A 8
4 10 B 3 15 A 2 10 B 3
5   <NA> 15 A 5 15 A 5
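
For anyone who wants to run this, here is a small self-contained sketch, assuming dplyr is installed and with the data frame rebuilt by hand from the question. `res` is just an illustrative name. `coalesce()` goes position by position and keeps the first non-missing value among its arguments, which is why putting A first gives the desired column.

```r
library(dplyr)

# Example data from the question, as character columns.
df <- data.frame(
  A = c("13 A 1", "15 A 2", NA, "10 B 3", NA),
  B = c("15 A 2", "15 A 2", "15 A 8", "15 A 2", "15 A 5"),
  stringsAsFactors = FALSE
)

# A is used where present; B fills in only where A is NA.
res <- df %>%
  mutate(c = coalesce(A, B))

# coalesce() also accepts more than two vectors; later ones act as
# further fallbacks when all earlier ones are NA at that position.
fallback <- coalesce(c(NA, "x", NA), c("y", NA, NA), c("z", "z", "z"))
```

Compared with the `ifelse()` approach, this reads a little more directly when the only condition is "use the next column if this one is missing".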