I have the following data:
df <- structure(list(automatic = c("organismo", "bolha", "organismo",
"organismo", "cosc_multiplo", "cosc_multiplo", "coscinodiscus",
"detritos", "mult_organismos", "multiplos", "organismo", "sombra",
"detritos", "mult_organismos", "detritos", "mult_organismos",
"detritos", "org_partes", "detritos", "organismo", "organismo",
"detritos", "organismo", "organismo", "organismo", "bolha", "coral_falso",
"coscinodiscus", "detritos", "LRaw", "multiplos", "organismo",
"sombra"), validated = c("appendicularia", "bolha", "cnidaria",
"copepodo", "cosc_multiplo", "coscinodiscus", "coscinodiscus",
"coscinodiscus", "coscinodiscus", "coscinodiscus", "coscinodiscus",
"coscinodiscus", "detritos", "detritos", "langanho", "mult_organismos",
"multiplos", "org_partes", "organismo", "organismo", "palmeria",
"pelotas_mix", "phyto", "phyto_cadeia", "phyto_espiral", "sombra",
"sombra", "sombra", "sombra", "sombra", "sombra", "sombra", "sombra"
), N = c(2L, 1L, 2L, 1L, 2L, 1L, 1229L, 3L, 2L, 4L, 5L, 57L,
1569L, 1L, 87L, 31L, 1L, 7L, 1L, 75L, 2L, 11L, 4L, 1L, 1L, 1L,
10L, 25L, 536L, 25L, 30L, 562L, 3678L)), row.names = c(NA, -33L
), class = c("tbl_df", "tbl", "data.frame"))
I would like to show all combinations of the values in the automatic and validated columns.
For example, my data has no row combining bolha (in the automatic column) with appendicularia (in the validated column). I would like this combination, and every other missing one, to appear with a value of 0 in column N.
Combinations that already exist must keep their value in N. For example, bolha (in automatic) with bolha (in validated) has N = 1, and that must not change.
Thanks, all.
CodePudding user response:
If you want to get all unique combinations and keep the original values for N, you can first use crossing from tidyr to build every unique combination. Then do a left join to add the N values from the original data frame, and finally change NA to 0 for N.
library(tidyverse)
left_join(crossing(automatic = df$automatic, validated = df$validated),
df,
by = c("automatic", "validated")) %>%
replace_na(list(N = 0))
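A shorter option is to seed N with 0 in crossing and use rows_update instead of a join (N then comes back as a double rather than an integer, because the grid is seeded with 0):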
crossing(automatic = df$automatic, validated = df$validated, N = 0) %>%
rows_update(df, by = c("automatic", "validated"))
Output
# A tibble: 198 × 3
automatic validated N
<chr> <chr> <int>
1 bolha appendicularia 0
2 bolha bolha 1
3 bolha cnidaria 0
4 bolha copepodo 0
5 bolha cosc_multiplo 0
6 bolha coscinodiscus 0
7 bolha detritos 0
8 bolha langanho 0
9 bolha mult_organismos 0
10 bolha multiplos 0
# … with 188 more rows
CodePudding user response:
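Here is an approach using expand_grid from tidyr (the tidyverse counterpart of base R's expand.grid), similar to @AndrewGB's solution:
library(dplyr)
library(tidyr) # expand_grid() lives in tidyr, not dplyr
# expand_grid() does not deduplicate its inputs, so pass unique values
# to get each automatic/validated pair exactly once
expand_grid(automatic = unique(df$automatic), validated = unique(df$validated), N = 0) %>%
rows_update(df, by = c("automatic", "validated")) %>%
arrange(automatic)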
automatic validated N
<chr> <chr> <dbl>
1 bolha appendicularia 0
2 bolha bolha 1
3 bolha cnidaria 0
4 bolha copepodo 0
5 bolha cosc_multiplo 0
6 bolha coscinodiscus 0
7 bolha detritos 0
8 bolha langanho 0
9 bolha mult_organismos 0
10 bolha multiplos 0
# … with 188 more rows
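For comparison, the same idea can be sketched without any packages, using base R's expand.grid and merge. The toy data frame below is a made-up stand-in for the question's df (its values are illustrative only), and the names toy, grid and full are placeholders rather than anything from the answers above.

```r
# Toy data standing in for the question's `df` (illustrative values only)
toy <- data.frame(
  automatic = c("bolha", "organismo", "organismo"),
  validated = c("bolha", "bolha", "copepodo"),
  N = c(1L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Every pairing of the distinct values found in each column
grid <- expand.grid(
  automatic = unique(toy$automatic),
  validated = unique(toy$validated),
  stringsAsFactors = FALSE
)

# Left join: keep every grid row and pull in N where the pair was observed
full <- merge(grid, toy, by = c("automatic", "validated"), all.x = TRUE)

# Pairs absent from the original data come back as NA; set them to 0
full$N[is.na(full$N)] <- 0L

full
```

A quick sanity check on any of these approaches is that the result has length(unique(automatic)) * length(unique(validated)) rows and that sum(N) is unchanged from the original data.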
CodePudding user response:
There is also complete from tidyr, which is a wrapper around expand, a join, and replace_na:
library(tidyr)
df |>
complete(automatic, validated, fill = list(N = 0))
automatic validated N
<chr> <chr> <int>
1 bolha appendicularia 0
2 bolha bolha 1
3 bolha cnidaria 0
4 bolha copepodo 0
5 bolha cosc_multiplo 0
6 bolha coscinodiscus 0
7 bolha detritos 0
8 bolha langanho 0
9 bolha mult_organismos 0
10 bolha multiplos 0
# … with 188 more rows
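To make the "wrapper" description concrete, here is a rough sketch of the steps complete performs, spelled out by hand on a small made-up tibble (toy, manual and via_complete are illustrative names, not part of the original data). It assumes tidyr and dplyr are installed.

```r
library(dplyr)
library(tidyr)

# Toy data standing in for the question's `df` (illustrative values only)
toy <- tibble(
  automatic = c("bolha", "organismo", "organismo"),
  validated = c("bolha", "bolha", "copepodo"),
  N = c(1L, 4L, 2L)
)

# Step by step: build the grid, join the observed counts, fill the gaps
manual <- toy %>%
  expand(automatic, validated) %>%                      # all combinations
  full_join(toy, by = c("automatic", "validated")) %>%  # bring back N
  replace_na(list(N = 0L))                              # missing pairs get 0

# The one-liner that wraps the same steps
via_complete <- toy %>%
  complete(automatic, validated, fill = list(N = 0L))

all.equal(manual, via_complete)
```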
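If you want each unordered pair only once, that is, automatic = a with validated = b counts as the same combination as automatic = b with validated = a, then with dplyr you can build a sorted key for each row and keep the row with the largest N per key:
library(dplyr)
library(tidyr)
df |>
complete(automatic, validated, fill = list(N = 0)) |>
rowwise() |>
# order-independent key: the two labels sorted and pasted together
mutate(m = paste(sort(c(validated, automatic)), collapse = ", ")) |>
group_by(m) |>
# keep the row with the largest N per key; slice(1) breaks ties
filter(N == max(N)) |>
slice(1) |>
ungroup() |>
mutate(m = NULL)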
# A tibble: 162 × 3
automatic validated N
<chr> <chr> <int>
1 bolha appendicularia 0
2 coral_falso appendicularia 0
3 cosc_multiplo appendicularia 0
4 coscinodiscus appendicularia 0
5 detritos appendicularia 0
6 LRaw appendicularia 0
7 mult_organismos appendicularia 0
8 multiplos appendicularia 0
9 org_partes appendicularia 0
10 organismo appendicularia 2
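The rowwise() step may be slow on large tables. As a possible alternative, the order-independent key can be built in a vectorized way with pmin and pmax, which also work on character vectors. This is a base R sketch on made-up data (toy, key and dedup are illustrative names), not the answer's own method.

```r
# Toy completed table (illustrative values only): the pair bolha/sombra
# appears in both orders, once with N = 0 and once with N = 10
toy <- data.frame(
  automatic = c("bolha", "sombra", "bolha", "sombra"),
  validated = c("sombra", "bolha", "bolha", "sombra"),
  N = c(0L, 10L, 1L, 3678L),
  stringsAsFactors = FALSE
)

# Order-independent key: smaller label first, larger label second
key <- paste(pmin(toy$automatic, toy$validated),
             pmax(toy$automatic, toy$validated),
             sep = ", ")

# Sort so that, within each key, the row with the largest N comes first
ord <- order(key, -toy$N)
sorted <- toy[ord, ]

# Keep only the first row of each key
dedup <- sorted[!duplicated(key[ord]), ]

dedup
```

As in the dplyr version, ties on N within a key are resolved by keeping whichever row comes first.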