I have some data frames, as follows
jack<-data.frame(c("M","F"),c(12,22))
colnames(jack)[1]<-"sex"
colnames(jack)[2]<-"age"
ethan<-data.frame(c("M","F"),c(34,99))
colnames(ethan)[1]<-"sex"
colnames(ethan)[2]<-"age"
ellie<-data.frame(c("M","F"),c(9,35))
colnames(ellie)[1]<-"sex"
colnames(ellie)[2]<-"age"
And a final data frame
mydf<-as.data.frame(c("martha","jack","luis","ethan","ellie"))
colnames(mydf)[1]<-"name"
I want to create a data frame with the value from the column sex
of each data frame (jack
, ethan
, ellie
), matching the name
column of mydf
(if there's a match - otherwise, put a 0). The final result should look like this
name sexM sexF
1 martha 0 0
2 jack 12 22
3 luis 0 0
4 ethan 34 99
5 ellie 9 35
I cannot understand how to combine these data frames. I used merge
in the past, but this time I don't want to match row or column names, but the data frame (jack
) with its matching row in mydf
and insert its age
value in the column.
CodePudding user response:
We can use mget
on the intersect
ing objects created in the global environment with the names
from 'mydf' to return a list
of key/values, then bind the list elements with bind_rows
, reshape to 'wide' format with pivot_wider
and later join with the original 'mydf'
library(dplyr)
library(tidyr)
mget(intersect(ls(), mydf$name)) %>%
bind_rows(.id = 'name') %>%
pivot_wider(names_from = 'sex', values_from = 'age') %>%
right_join(mydf) %>%
mutate(across(-name, replace_na, 0)) %>%
arrange(match(name, mydf$name))
-output
# A tibble: 5 × 3
name M F
<chr> <dbl> <dbl>
1 martha 0 0
2 jack 12 22
3 luis 0 0
4 ethan 34 99
5 ellie 9 35
Or may also use a for
loop in base R
by looping over the sequence of 'name', check if the name
element exists
as object in the global env, then assign the corresponding 'age' values for 'M', 'F' for already created 'sexM', 'sexF' (as 0 values)
mydf[c("sexM", "sexF")] <- 0
for(i in seq_along(mydf$name)) if(exists(mydf$name[i])) {
tmp <- get(mydf$name[i])
mydf$sexM[i] <- tmp$age[tmp$sex == "M"]
mydf$sexF[i] <- tmp$age[tmp$sex == "F"]
}
-output
> mydf
name sexM sexF
1 martha 0 0
2 jack 12 22
3 luis 0 0
4 ethan 34 99
5 ellie 9 35