I am trying to figure out how to update various records in df1
column A
with new values from df2
column A
. df1
has 2141 column A
observations with a unique ID in theIndex
column. df2
has 268 updated column A
values with their associated unique ID in the Index
column. I tried using merge()
, or even a simple for loop with an if statement with no luck, such as:
for (i in 1:nrow(df1)){
if (df1$Index[i] == df2$Index[i]){
df1$A[i] <- df2$A[1]
}
}
A simplified example of my two data frames and what result I'm trying to achieve:
df1:
Index A
1 1 NA
2 2 NA
3 3 NA
4 4 NA
5 5 NA
6 6 NA
7 7 NA
8 8 NA
9 9 NA
10 10 NA
df2:
Index A
1 2 85
2 3 46
3 6 79
4 7 64
5 10 40
Updated df1:
Index A
1 1 NA
2 2 85
3 3 46
4 4 NA
5 5 NA
6 6 79
7 7 64
8 8 NA
9 9 NA
10 10 40
I have to believe this is simple, but I can't figure out how to update my main data frame. My search for ideas online keeps going back to the merge()
function or similar join functions. Thank you for any help or guidance.
CodePudding user response:
Try this
df <- data.frame(Index = 1:10 , A = NA)
for(i in 1:nrow(df)){
x <- which(i == df2$Index)
y <- which(i == df1$Index)
if(length(x) > 0) df$A[i] <- df2$A[x]
else if(length(y) > 0) df$A[i] <- df1$A[y]
else df$A[i] <- NA
}
- output
Index A
1 1 NA
2 2 85
3 3 46
4 4 NA
5 5 NA
6 6 79
7 7 64
8 8 NA
9 9 NA
10 10 40
CodePudding user response:
In dplyr you can do:
df2 %>%
full_join(df1) %>%
group_by(Index) %>%
filter(ifelse((length(row_number()) > 1) & is.na(A), 1, 2) == 2) %>%
ungroup() %>%
arrange(-desc(Index))
Assuming that there is only a maximum of one unique Index value per df
# A tibble: 10 × 2
Index A
<int> <int>
1 1 NA
2 2 85
3 3 46
4 4 NA
5 5 NA
6 6 79
7 7 64
8 8 NA
9 9 NA
10 10 40