Add rows to data frame based on count

I have a data frame that looks like this:

Species_ID	Location_ID	Altitude	Female	Male
mon	WH	1700	3	10
jon	IF	1850	5	2
sylv	WS	2100	7	3
ter	MB	1700	20	15

1. I would like to have the total number of individuals (Female + Male) as an extra column.
2. I would like to add rows to the data frame based on that total, with each row containing all the information from the other columns. For example, Species_ID "mon" has a total of 13 individuals, so I want 13 rows, each containing the "Species_ID", "Location_ID" and "Altitude" values.

I'm pretty sure I can handle the first part with mutate(), but I have absolutely no idea how to solve the second.

CodePudding user response：

You can use uncount from tidyr. The optional argument .id creates a new variable which gives a unique identifier for each created row.

library(tidyr)

df %>%
  uncount(Female + Male, .id = "ID")

#    Species_ID Location_ID Altitude Female Male ID
# 1         mon          WH     1700      3   10  1
# 2         mon          WH     1700      3   10  2
# 3         mon          WH     1700      3   10  3
# 4         mon          WH     1700      3   10  4
# 5         mon          WH     1700      3   10  5
# 6         mon          WH     1700      3   10  6
# 7         mon          WH     1700      3   10  7
# 8         mon          WH     1700      3   10  8
# 9         mon          WH     1700      3   10  9
# 10        mon          WH     1700      3   10 10
# 11        mon          WH     1700      3   10 11
# 12        mon          WH     1700      3   10 12
# 13        mon          WH     1700      3   10 13
# ...

Data

df <- structure(
  list(Species_ID = c("mon", "jon", "sylv", "ter"),
       Location_ID = c("WH", "IF", "WS", "MB"),
       Altitude = c(1700L, 1850L, 2100L, 1700L), 
       Female = c(3L, 5L, 7L, 20L),
       Male = c(10L, 2L, 3L, 15L)),
  class = "data.frame", row.names = c(NA, -4L))
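
If you also want the total kept as its own column (the first part of the question), one possible sketch is to compute it with mutate() first and then pass it to uncount() with .remove = FALSE. The column name Total and the object names df_total and df_long are only illustrative assumptions, not something from the original post.

```r
library(dplyr)
library(tidyr)

# Same example data as above, rebuilt here so the snippet runs on its own.
df <- data.frame(
  Species_ID  = c("mon", "jon", "sylv", "ter"),
  Location_ID = c("WH", "IF", "WS", "MB"),
  Altitude    = c(1700L, 1850L, 2100L, 1700L),
  Female      = c(3L, 5L, 7L, 20L),
  Male        = c(10L, 2L, 3L, 15L)
)

# Step 1: add the total as an extra column ("Total" is an assumed name).
df_total <- df %>%
  mutate(Total = Female + Male)

# Step 2: repeat each row Total times.
# .remove = FALSE keeps the Total column in the result;
# .id = "ID" numbers the copies made from each original row.
df_long <- df_total %>%
  uncount(Total, .remove = FALSE, .id = "ID")

nrow(df_long)
# [1] 65
```

With the example data this gives 13 + 7 + 10 + 35 = 65 rows. If you do not need Female and Male afterwards, you could drop them with select() before or after the uncount() step.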

CodePudding user response：

Is this what you're looking for?

library(dplyr)
d <- tibble::tribble(
  ~Species_ID,  ~Location_ID,   ~Altitude,  ~Female,    ~Male, 
"mon",  "WH",   1700,   3,  10,
"jon",  "IF",   1850,   5,  2,
"sylv", "WS",   2100,   7,  3,
"ter",  "MB",   1700,   20, 15)


d <- d %>% 
  mutate(all_obs = Female + Male)

d[rep(1:nrow(d), d$all_obs), 1:3]
#> # A tibble: 65 × 3
#>    Species_ID Location_ID Altitude
#>    <chr>      <chr>          <dbl>
#>  1 mon        WH              1700
#>  2 mon        WH              1700
#>  3 mon        WH              1700
#>  4 mon        WH              1700
#>  5 mon        WH              1700
#>  6 mon        WH              1700
#>  7 mon        WH              1700
#>  8 mon        WH              1700
#>  9 mon        WH              1700
#> 10 mon        WH              1700
#> # … with 55 more rows

Created on 2023-01-17 by the reprex package (v2.0.1)

CodePudding user response：

Ok, so I solved it like this:

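# Column names are German: Weibchen = females, Arbeiterinnen = workers,
# Männchen = males, Standort = location, Höhenstufe = altitude level.
b2 <- b1 %>%
  rowwise() %>%
  mutate(all_obs = sum(Weibchen, Arbeiterinnen, Männchen, na.rm = TRUE)) %>%
  ungroup() %>%  # drop the rowwise grouping before uncount()
  dplyr::select(Taxon_ID, Standort, Höhenstufe, all_obs)

b3 <- uncount(b2, all_obs, .remove = TRUE, .id = "ID")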