I have a data frame that looks like this:
Species_ID | Location_ID | Altitude | Female | Male |
---|---|---|---|---|
mon | WH | 1700 | 3 | 10 |
jon | IF | 1850 | 5 | 2 |
sylv | WS | 2100 | 7 | 3 |
ter | MB | 1700 | 20 | 15 |
I would like to do two things:
1. Add an extra column with the total number of individuals (Female + Male).
2. Add rows to the data frame based on that total, with each row containing the information from the other columns. For example, Species_ID "mon" has a total of 13 individuals, so I want 13 rows containing the "Species_ID", "Location_ID" and "Altitude" of "mon".
I'm pretty sure I can handle the first step with mutate(), but I have no idea how to solve the second.
CodePudding user response:
You can use uncount() from tidyr, which repeats each row according to a weight (here Female + Male). The optional argument .id creates a new variable that numbers the copies created from each original row, starting at 1.
library(tidyr)
df %>%
  uncount(Female + Male, .id = "ID")
# Species_ID Location_ID Altitude Female Male ID
# 1 mon WH 1700 3 10 1
# 2 mon WH 1700 3 10 2
# 3 mon WH 1700 3 10 3
# 4 mon WH 1700 3 10 4
# 5 mon WH 1700 3 10 5
# 6 mon WH 1700 3 10 6
# 7 mon WH 1700 3 10 7
# 8 mon WH 1700 3 10 8
# 9 mon WH 1700 3 10 9
# 10 mon WH 1700 3 10 10
# 11 mon WH 1700 3 10 11
# 12 mon WH 1700 3 10 12
# 13 mon WH 1700 3 10 13
# ...
Data
df <- structure(
list(Species_ID = c("mon", "jon", "sylv", "ter"),
Location_ID = c("WH", "IF", "WS", "MB"),
Altitude = c(1700L, 1850L, 2100L, 1700L),
Female = c(3L, 5L, 7L, 20L),
Male = c(10L, 2L, 3L, 15L)),
class = "data.frame", row.names = c(NA, -4L))
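The question also asks for the total as its own column. A minimal sketch of one way to combine both steps, assuming dplyr is available alongside tidyr: compute the total with mutate() and pass .remove = FALSE to uncount() so the weight column is kept. The column name Total and the object name expanded are illustrative choices, not from the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the answer above
df <- data.frame(
  Species_ID  = c("mon", "jon", "sylv", "ter"),
  Location_ID = c("WH", "IF", "WS", "MB"),
  Altitude    = c(1700L, 1850L, 2100L, 1700L),
  Female      = c(3L, 5L, 7L, 20L),
  Male        = c(10L, 2L, 3L, 15L)
)

expanded <- df %>%
  mutate(Total = Female + Male) %>%            # step 1: total per row (name is an assumption)
  uncount(Total, .remove = FALSE, .id = "ID")  # step 2: one row per individual, keep Total

nrow(expanded)  # 65, i.e. 13 + 7 + 10 + 35
```

With the default .remove = TRUE, the Total column would be dropped after the rows are expanded.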
CodePudding user response:
Is this what you're looking for?
library(dplyr)
d <- tibble::tribble(
~Species_ID, ~Location_ID, ~Altitude, ~Female, ~Male,
"mon", "WH", 1700, 3, 10,
"jon", "IF", 1850, 5, 2,
"sylv", "WS", 2100, 7, 3,
"ter", "MB", 1700, 20, 15)
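# Total number of individuals per row
d <- d %>%
  mutate(all_obs = Female + Male)
# Repeat each row index all_obs times and keep the first three columns
d[rep(1:nrow(d), d$all_obs), 1:3]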
#> # A tibble: 65 × 3
#> Species_ID Location_ID Altitude
#> <chr> <chr> <dbl>
#> 1 mon WH 1700
#> 2 mon WH 1700
#> 3 mon WH 1700
#> 4 mon WH 1700
#> 5 mon WH 1700
#> 6 mon WH 1700
#> 7 mon WH 1700
#> 8 mon WH 1700
#> 9 mon WH 1700
#> 10 mon WH 1700
#> # … with 55 more rows
Created on 2023-01-17 by the reprex package (v2.0.1)
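The indexing step relies on rep() accepting one repeat count per element. A small base-R sketch of that mechanism on hypothetical toy data (the names toy, n and idx are made up for illustration):

```r
# Hypothetical toy data: n is the number of copies wanted per row
toy <- data.frame(id = c("a", "b", "c"), n = c(2, 0, 3))

# Each row index is repeated n times; a count of 0 drops the row
idx <- rep(seq_len(nrow(toy)), times = toy$n)
idx  # 1 1 3 3 3

# Indexing with the repeated row numbers duplicates the rows
expanded_toy <- toy[idx, ]
nrow(expanded_toy)  # 5
```

Note that rows with a total of 0 disappear from the result, which is usually what is wanted when expanding counts to one row per individual.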
CodePudding user response:
Ok, so I solved it like this:
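b2 <- b1 %>%
  rowwise() %>%
  # Row-wise total over the three count columns, ignoring NA
  mutate(all_obs = sum(Weibchen, Arbeiterinnen, Männchen, na.rm = TRUE)) %>%
  dplyr::select(Taxon_ID, Standort, Höhenstufe, all_obs)
# One row per individual; ID numbers the copies of each original row
b3 <- uncount(b2, all_obs, .remove = TRUE, .id = "ID")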
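As a possible alternative to rowwise(), the row totals can be computed in one vectorized call with base R's rowSums(), which also accepts na.rm = TRUE. A sketch on hypothetical data using the column names from the question (the NA value is invented to show the missing-value handling):

```r
library(tidyr)

# Hypothetical counts with a missing value
counts <- data.frame(
  Species_ID = c("mon", "jon"),
  Female     = c(3, NA),
  Male       = c(10, 2)
)

# Vectorized row totals; NA is ignored rather than propagated
counts$all_obs <- rowSums(counts[c("Female", "Male")], na.rm = TRUE)

# One row per individual; all_obs is dropped by default (.remove = TRUE)
per_individual <- uncount(counts, all_obs, .id = "ID")
nrow(per_individual)  # 15, i.e. 13 + 2
```

This avoids carrying the rowwise grouping into later steps; whether it is preferable depends on how many count columns there are and how they are selected.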