I have the 2 following separate data frames:
itens | sp1 | sp2 |
---|---|---|
item1 | 20 | 10 |
item2 | 30 | 15 |
item3 | 30 | 15 |
item4 | 30 | 15 |
and
itens | sp5 | sp6 |
---|---|---|
item7 | 20 | 10 |
item8 | 30 | 15 |
item9 | 30 | 15 |
How could I create, in R, a for function that produce the following result?
sps | sum | N |
---|---|---|
sp1 | 110 | 4 |
sp2 | 55 | 4 |
sp5 | 80 | 3 |
sp6 | 40 | 3 |
Where each N and sum columns , in this new data frame, are number and sum of itens of each species (sps).
Thank you
CodePudding user response:
df1 <- structure(list(itens = c("item1", "item2", "item3", "item4"),
sp1 = c(20L, 30L, 30L, 30L),
sp2 = c(10L, 15L, 15L, 15L)),
class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(itens = c("item7", "item8", "item9"),
sp5 = c(20L, 30L, 30L),
sp6 = c(10L, 15L, 15L)),
class = "data.frame", row.names = c(NA, -3L))
Solution not using a loop but tidyr::pivot_longer
and dplyr::full_join
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(-itens) %>%
full_join(df2 %>% pivot_longer(-itens)) %>%
group_by(sps = name) %>%
summarise(N = n(),
sum = sum(value))
Returns:
sps N sum
<chr> <int> <int>
1 sp1 4 110
2 sp2 4 55
3 sp5 3 80
4 sp6 3 40
CodePudding user response:
Another solution, based on and dplyr::bind_rows
and tidyr::pivot_longer
:
library(tidyverse)
df1 <- data.frame(
stringsAsFactors = FALSE,
itens = c("item1", "item2", "item3", "item4"),
sp1 = c(20L, 30L, 30L, 30L),
sp2 = c(10L, 15L, 15L, 15L)
)
df2 <- data.frame(
stringsAsFactors = FALSE,
itens = c("item7", "item8", "item9"),
sp5 = c(20L, 30L, 30L),
sp6 = c(10L, 15L, 15L)
)
df1 %>%
bind_rows(df2) %>%
pivot_longer(-itens, names_to = "sps") %>%
group_by(sps) %>%
summarise(sum = sum(value,na.rm = T), N = sum(!is.na(value)))
#> # A tibble: 4 × 3
#> sps sum N
#> <chr> <int> <int>
#> 1 sp1 110 4
#> 2 sp2 55 4
#> 3 sp5 80 3
#> 4 sp6 40 3