Home > Software design >  Reference column name from another table to insert value from shared ID
Reference column name from another table to insert value from shared ID

Time:10-28

I am attempting to reference a column in another dataframe (df2) based on a column of column names in my primary dataframe (df1$reference) to return the extact value where the ID and column name match in a new column (df1$new_col). Here is a simplified example of my data and the desired outcome:

enter image description here

I have tried rbind but ran into errors due to differences in the number of rows/columns. I also tried bind_rows/bind_cols, but am having a hard time joining only the referenced data (I would like to avoid a large join because my real data has many more columns). I feel like indexing would be able to accomplish this but I am not as familiar with indexing outside of simple tasks. I'm open to any and all suggestions / approaches!

CodePudding user response:

We may use row/column indexing

DF1$new_col <- DF2[-1][cbind(seq_len(nrow(DF1)), 
        match(DF1$reference, names(DF2)[-1]))]

-output

> DF1
  ID value reference new_col
1  1     4      colD      no
2  2     5      colD      no
3  3     6      colE      no

data


DF1 <- structure(list(ID = 1:3, value = 4:6, reference = c("colD", "colD", 
"colE")), class = "data.frame", row.names = c(NA, -3L))

DF2 <- structure(list(ID = 1:3, colD = c("no", "no", "yes"), colE = c("yes", 
"no", "no"), colF = c("no", "yes", "no")), 
class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

Maybe something like this?

library(dplyr)
left_join(DF1, DF2, by="ID") %>% 
  mutate(New_col = case_when(reference=="colD" ~ colD,
                             reference=="colE" ~ colE,
                             reference=="colF" ~ colF)) %>% 
  select(ID, value, reference, New_col)
  ID value reference New_col
1  1     4      colD      no
2  2     5      colD      no
3  3     6      colE      no
  • Related