Home > Back-end >  Create a data frame with values column names matching another data frame's name
Create a data frame with values column names matching another data frame's name

Time:07-17

I have some data frames, as follows

jack<-data.frame(c("M","F"),c(12,22))
colnames(jack)[1]<-"sex"
colnames(jack)[2]<-"age"

ethan<-data.frame(c("M","F"),c(34,99))
colnames(ethan)[1]<-"sex"
colnames(ethan)[2]<-"age"

ellie<-data.frame(c("M","F"),c(9,35))
colnames(ellie)[1]<-"sex"
colnames(ellie)[2]<-"age"

And a final data frame

mydf<-as.data.frame(c("martha","jack","luis","ethan","ellie"))
colnames(mydf)[1]<-"name"

I want to create a data frame with the value from the column sex of each data frame (jack, ethan, ellie), matching the name column of mydf (if there's a match - otherwise, put a 0). The final result should look like this

name sexM sexF
1 martha    0    0
2   jack   12   22
3   luis    0    0
4  ethan   34   99
5  ellie    9   35

I cannot understand how to combine these data frames. I used merge in the past, but this time I don't want to match row or column names, but the data frame (jack) with its matching row in mydf and insert its age value in the column.

CodePudding user response:

We can use mget on the intersecting objects created in the global environment with the names from 'mydf' to return a list of key/values, then bind the list elements with bind_rows, reshape to 'wide' format with pivot_wider and later join with the original 'mydf'

library(dplyr)
library(tidyr)
mget(intersect(ls(), mydf$name)) %>% 
  bind_rows(.id = 'name') %>% 
  pivot_wider(names_from = 'sex', values_from = 'age') %>% 
  right_join(mydf) %>%
  mutate(across(-name, replace_na, 0)) %>%
  arrange(match(name, mydf$name))

-output

# A tibble: 5 × 3
  name       M     F
  <chr>  <dbl> <dbl>
1 martha     0     0
2 jack      12    22
3 luis       0     0
4 ethan     34    99
5 ellie      9    35

Or may also use a for loop in base R by looping over the sequence of 'name', check if the name element exists as object in the global env, then assign the corresponding 'age' values for 'M', 'F' for already created 'sexM', 'sexF' (as 0 values)

mydf[c("sexM", "sexF")] <- 0

for(i in seq_along(mydf$name)) if(exists(mydf$name[i])) {
   tmp <- get(mydf$name[i])
   mydf$sexM[i] <- tmp$age[tmp$sex == "M"]
   mydf$sexF[i] <- tmp$age[tmp$sex == "F"]
}

-output

> mydf
    name sexM sexF
1 martha    0    0
2   jack   12   22
3   luis    0    0
4  ethan   34   99
5  ellie    9   35
  • Related