I would like to update one column based on 2 columns
My example dataframe contains 3 columns
df <- data.frame(n1 = c(1,2,1,2,5,6),
n2 = c("a", "a", "a", NA, "b", "c"),
n3 = c("red", "red", NA, NA, NA, NA))
df
n1 n2 n3
1 1 a red
2 2 a red
3 1 a <NA>
4 2 <NA> <NA>
5 5 b <NA>
6 6 c <NA>
I would like to add red
name to row number 3
and 4
with the condition is that if values of n1
(i.e. 1,2) match with n2
(i.e. a
), even though the fourth row (n1
not match n2
).
The main point is if n2
== a
, and values of n1
associated with a
, then values of n3
that are the same row with values of n1
should be added with red
.
My desired output
n1 n2 n3
1 1 a red
2 2 a red
3 1 a red
4 2 <NA> red
5 5 b <NA>
6 6 c <NA>
Any sugesstions for this case? I hope my explanation is clear enough. Since my data is very long, I am trying to find a good to handle it.
CodePudding user response:
In base R
, create a logical vector to subset the rows of 'df' based on the unique
values of 'n1' where 'n2' is "a", then do the assignment of 'n3' corresponding to that elements with the first non-NA element from 'n3'
i1 <- with(df, n1 %in% unique(n1[n2 %in% 'a']))
df$n3[i1] <- na.omit(df$n3[i1])[1]
-output
> df
n1 n2 n3
1 1 a red
2 2 a red
3 1 a red
4 2 <NA> red
5 5 b <NA>
6 6 c <NA>
CodePudding user response:
Update:
df %>%
mutate(group = rep(row_number(), each=2, length.out = n())) %>%
group_by(group) %>%
mutate(n3 = ifelse(n1 %in% c(1,2) & any(n2 %in% "a", na.rm = TRUE), "red", n3)) %>%
ungroup() %>%
select(-group)
We could use an ifelse
statement with conditions defined using any
.
library(dplyr)
df %>%
mutate(n3 = ifelse(n1==1 | n1==2 & any(n2[3:4] %in% "a"), "red", n3))
n1 n2 n3
1 1 a red
2 2 a red
3 1 a red
4 2 <NA> red
5 5 b <NA>
6 6 c <NA>
CodePudding user response:
library(dplyr)
library(tidyr)
df %>%
group_by(n1) %>%
fill(n3) %>%
group_by(n2) %>%
fill(n3)
# # A tibble: 6 × 3
# # Groups: n2 [4]
# n1 n2 n3
# <dbl> <chr> <chr>
# 1 1 a red
# 2 2 a red
# 3 1 a red
# 4 2 NA red
# 5 5 b NA
# 6 6 c NA