Home > Mobile >  Populate a data frame based on another one
Populate a data frame based on another one

Time:10-05

I have a data frame (DF1) that I want to fill with values from another data frame (DF2) if the columns names match. If there's NA, or 0 I don't want to change DF1 columns but if theres 1 or 2, I'd like to modify with DF2 corresponding value.

Example: df1

sampleID ID1 ID2 ID3
A 0 NA 0
B 1 0 1
C 1 2 0

DF2

V1 V2
ID1 0.5
ID2 0.7
ID3 0.9

Desired result:

sampleID ID1 ID2 ID3
A 0 NA 0
B 0.5 0 0.9
C 0.5 0.7 0

If there's a NA value, I don't want any change. I've worked on an apply() custom function based solution, but it's not working too well. Interested by any idea you might have. Also dplyr solutions are welcome!

CodePudding user response:

In base R, you can use sapply replace. Note that this works because IDs in df1 and df2 are in the same order.

df1[-1] <- sapply(2:ncol(df1), 
       function(x) replace(df1[x], df1[x] != 0 & !is.na(df1[x]), df2[x-1, 2]))

#   sampleID ID1 ID2 ID3
# 1        A 0.0  NA 0.0
# 2        B 0.5 0.0 0.9
# 3        C 0.5 0.7 0.0

Or an equivalent solution with match (this one does not rely on ordering):

sapply(2:ncol(df1), 
       function(x) replace(df1[x], 
                           df1[x] != 0 & !is.na(df1[x]), 
                           df2$V2[match(names(df1[x]), df2$V1)]))

CodePudding user response:

# Create a named vector
lut = setNames(df2$V2, df2$V1)
cols = intersect(names(lut), names(df1))
# [1] "ID1" "ID2" "ID3"

df1[cols] <- 
  lapply(cols, \(x) replace(df1[[x]], !(is.na(df1[[x]]) | df1[[x]] == 0), lut[x]))

#   sampleID ID1 ID2 ID3
# 1        A 0.0  NA 0.0
# 2        B 0.5 0.0 0.9
# 3        C 0.5 0.7 0.0

Data

df1 <- data.frame(sampleID = c("A", "B", "C"), ID1 = c(0L, 1L, 1L ), ID2 = c(NA, 0L, 2L), ID3 = c(0L, 1L, 0L))
df2 <- data.frame(V1 = c("ID1", "ID2", "ID3"), V2 = c(0.5, 0.7, 0.9))

CodePudding user response:

One way to solve your problem:

df3 = df1
df3[df2$V1] = with(df2, Map(\(x,y) ifelse(x %in% c(0,NA), x, V2[V1==y]), df1[V1], V1))

  sampleID ID1 ID2 ID3
1        A 0.5  NA 0.9
2        B 1.0 0.7 1.0
3        C 1.0 0.7 0.9
  • Related