In R, how do I produce a data frame with rownames of every unique rowname of multiple input data frames?

I have a series of data frames containing plant species names and associated percent covers from multiple quadrats and would like to streamline the task of generating a single data frame in which all species recorded in any quadrat are represented as columns and in which each row corresponds to a single quadrat. Currently, quadrat-specific data frames contain columns only for the species which were found to be present. I would like to integrate all quadrat-specific data frames into a single data frame in which absence of a species at a quadrat is denoted by FALSE or zero.

Objects a and b are dummy objects showing the format of my data, where a and b represent distinct quadrats. Object df shows the format of my objective, although it is here generated manually. "abund" represents abundance, and associated numbers are percent cover values. Objects spA through spI represent nine dummy plant species names, of which five are present in quadrat a and seven are present in quadrat b.

> a.names <- c("spA","spB","spC","spD","spE")
> a <- t(data.frame(c(40,20,10,10,10), row.names = a.names))
> row.names(a) <- "a.abund" 
> a
        spA spB spC spD spE
a.abund  40  20  10  10  10
> 
> b.names <- c("spC","spD","spE","spF","spG","spH","spI")
> b <- t(data.frame(c(40,10,10,10,10,10,10), row.names = b.names))
> row.names(b) <- "b.abund"
> b
        spC spD spE spF spG spH spI
b.abund  40  10  10  10  10  10  10
> 
> df.names <- c("spA","spB","spC","spD","spE","spF","spG","spH","spI")
> a.abund <- c(40,20,10,10,10,0,0,0,0)
> b.abund <- c(0,0,40,10,10,10,10,10,10)
> ( df <- t(data.frame(a.abund, b.abund, row.names = df.names)) )
        spA spB spC spD spE spF spG spH spI
a.abund  40  20  10  10  10   0   0   0   0
b.abund   0   0  40  10  10  10  10  10  10

CodePudding user response：

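If we convert the matrices to data frames first, we can use rbindlist from the data.table package. With fill = TRUE, species missing from a quadrat come back as NA rather than 0, and the row names are dropped: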

a.names <- c("spA","spB","spC","spD","spE")
a <- t(data.frame(c(40,20,10,10,10), row.names = a.names))
row.names(a) <- "a.abund"
a

b.names <- c("spC","spD","spE","spF","spG","spH","spI")
b <- t(data.frame(c(40,10,10,10,10,10,10), row.names = b.names))
row.names(b) <- "b.abund"
b

a <- as.data.frame(a)
b <- as.data.frame(b)

library(data.table)

# fill = TRUE pads columns that are missing from an input with NA
df <- rbindlist(list(a, b), fill = TRUE)

df

> df
   spA spB spC spD spE spF spG spH spI
1:  40  20  10  10  10  NA  NA  NA  NA
2:  NA  NA  40  10  10  10  10  10  10
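
Since the question asks for absences to be coded as 0 and for each row to stay tied to its quadrat, here is a minimal sketch of one way to extend the rbindlist approach. It assumes data.table 1.12.4 or later (for setnafill); the column name "quadrat" and the helper variable sp.cols are made up for illustration, and the inputs are rebuilt directly as one-row data frames rather than via t().

```r
library(data.table)

# Same dummy quadrats as in the question, built directly as one-row data frames
a <- data.frame(spA = 40, spB = 20, spC = 10, spD = 10, spE = 10)
b <- data.frame(spC = 40, spD = 10, spE = 10, spF = 10,
                spG = 10, spH = 10, spI = 10)

# Naming the list elements lets idcol record which quadrat each row came from
# ("quadrat" is an arbitrary column name chosen for this sketch)
df <- rbindlist(list(a.abund = a, b.abund = b), fill = TRUE, idcol = "quadrat")

# Replace NA with 0 in the numeric species columns only (modifies df in place)
sp.cols <- setdiff(names(df), "quadrat")
setnafill(df, fill = 0, cols = sp.cols)

df
```

The quadrat labels end up in an ordinary column rather than in row names, because a data.table does not keep row names.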

CodePudding user response：

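Using base R:

# All species recorded in either quadrat
nms <- union(colnames(a), colnames(b))
d <- list(a, b)

# Start from all zeros, so absent species need no further handling
ans <- matrix(0, nrow = length(d), ncol = length(nms))
colnames(ans) <- nms

for (i in seq_along(d)) {
    # Write each quadrat's covers into the columns matching its species
    ans[i, match(colnames(d[[i]]), colnames(ans))] <- d[[i]]
}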

Output:

> ans
     spA spB spC spD spE spF spG spH spI
[1,]  40  20  10  10  10   0   0   0   0
[2,]   0   0  40  10  10  10  10  10  10
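
The same idea can be written for any number of quadrats while keeping the quadrat names as row names. This is only a sketch under the assumption that every input is a one-row matrix with species as column names, as in the question; the list name quadrats is made up for illustration.

```r
# Dummy quadrats from the question, as one-row matrices
a <- matrix(c(40, 20, 10, 10, 10), nrow = 1,
            dimnames = list("a.abund", c("spA", "spB", "spC", "spD", "spE")))
b <- matrix(c(40, 10, 10, 10, 10, 10, 10), nrow = 1,
            dimnames = list("b.abund",
                            c("spC", "spD", "spE", "spF", "spG", "spH", "spI")))

quadrats <- list(a, b)  # add further quadrats here

# Union of species names across all quadrats, in order of first appearance
nms <- Reduce(union, lapply(quadrats, colnames))

# Zero-filled result with one row per quadrat, named after each input's row name
ans <- matrix(0, nrow = length(quadrats), ncol = length(nms),
              dimnames = list(vapply(quadrats, rownames, character(1)), nms))

for (i in seq_along(quadrats)) {
  # Index columns by species name, so column order in the inputs does not matter
  ans[i, colnames(quadrats[[i]])] <- quadrats[[i]]
}

ans
```

If a data frame is needed rather than a matrix, as.data.frame(ans) keeps both the row and column names.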