I have a dataframe called data_total that looks like that:
The dataframe is this Data Base.
Cuenca Temporada Alcalinidad Carb_ Bicarb_ Fluoruros Cloruros Nitratos Nitratos_N Sulfatos Nitritos Nitritos_N Fosfatos
<fct> <fct> <dbl> <dbl> <dbl> <chr> <dbl> <chr> <chr> <dbl> <chr> <chr> <chr>
1 Tumbes Avenida 283. 0.3 346. 0.495 572. <0.031 <0.007 901. <0.003 <0.001 <0.019
2 Zarumilla Avenida 146. 0.3 179. 0.2439999… 279. <0.031 <0.007 274. <0.003 <0.001 <0.019
3 Zarumilla Avenida 108. 0.3 132. 0.314 62.8 2.22900… 0.504 61.0 <0.003 <0.001 <0.019
4 Zarumilla Avenida 138. 0.3 168. 0.184 584. <0.031 <0.007 151. <0.003 <0.001 <0.019
5 Tumbes Avenida 156. 0.3 191. 0.2020000… 68.6 0.151 3.4000000… 124. <0.003 <0.001 <0.019
6 Tumbes Avenida 211 0.3 257. 0.157 64.9 1.04499… 0.2359999… 146. <0.003 <0.001 <0.019 ````
So I need to summarise how many values by column contain "<" symbol in the dataframe group by Cuenca, Temporada (in the future more factors) in all the others columns in the dataframe from Alcalinidad until the last column.
I know how to do it in one column, by example in Cd_dis column like this:
data_total %>% filter(str_detect(Cd_dis, "<")) %>%
group_by(Cuenca,Temporada) %>%
summarise(non_detect = n(), completed= nrow(conteo)-n())
The output is:
Cuenca Temporada non_detect completed
<fct> <fct> <int> <int>
1 Intercuenca 13951 Avenida 4 95
2 Tumbes Avenida 23 76
3 Zarumilla Avenida 13 86
But I need to create a code that runs in all the others columns and join the dataframes obtained for each variable.
Finally, I am creating a package to automatically works with dataframes that contain non-detected values in geology and complete by using Hesel methodology from the United State Geological Survey, so who help with these I am going to mentioned in the package for acknowledgment. Thank so much!!!
CodePudding user response:
A subset of the data.
data_total <- structure(list(Cuenca = c("Tumbes", "Zarumilla", "Zarumilla",
"Zarumilla", "Tumbes", "Tumbes", "Tumbes", "Tumbes", "Tumbes",
"Tumbes", "Tumbes", "Tumbes", "Tumbes", "Tumbes", "Tumbes", "Tumbes",
"Tumbes", "Tumbes", "Tumbes", "Zarumilla"), Temporada = c("Avenida",
"Avenida", "Avenida", "Avenida", "Avenida", "Avenida", "Avenida",
"Avenida", "Avenida", "Avenida", "Avenida", "Avenida", "Avenida",
"Avenida", "Avenida", "Avenida", "Avenida", "Avenida", "Avenida",
"Avenida"), Alcalinidad = c(283.2, 146.5, 108.3, 137.5, 156.3,
211, 277.9, 311.4, 44.9, 44.7, 59.1, 41.9, 39, 243.8, 421.9,
172.7, 93.2, 64.8, 325.8, 80.6), Carb_ = c(0.3, 0.3, 0.3, 0.3,
0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,
0.3, 0.3, 0.3), Bicarb_ = c(345.6, 178.7, 132.1, 167.7, 190.7,
257.4, 339, 380, 54.7, 54.5, 72.1, 51.1, 47.6, 297.5, 514.7,
210.7, 113.7, 79.1, 397.5, 98.3), Fluoruros = c(0.495, 0.244,
0.314, 0.184, 0.202, 0.157, 0.17, 0.09, 0.046, 0.053, 0.047,
0.068, 0.064, 0.347, 2.72, 0.466, 0.096, 0.09, 0.43, 0.233),
Cloruros = c(571.775, 279.242, 62.816, 583.6, 68.641, 64.948,
28.799, 19.77, 1.454, 1.51, 15.736, 2.394, 2.51, 209.913,
2855.5, 131.031, 32.365, 57.707, 662.67, 15.396), Nitratos = c("<0.031",
"<0.031", "2.229", "<0.031", "0.151", "1.045", "5.578", "0.567",
"1.768", "2.323", "1.899", "1.76", "2.088", "8.10E-02", "<0.031",
"2.656", "<0.031", "1.018", "1.439", "0.133"), Nitratos_N = c("<0.007",
"<0.007", "0.504", "<0.007", "3.40E-02", "0.236", "1.26",
"0.128", "0.4", "0.525", "0.429", "0.398", "0.472", "1.80E-02",
"<0.007", "0.6", "<0.007", "0.23", "0.325", "0.03"), Sulfatos = c(900.82,
274.02, 61.05, 151.11, 124.29, 145.78, 126.76, 19.41, 20.72,
18.59, 43.25, 23.35, 21.51, 281.15, 5144.7, 142.98, 31.21,
54.1, 1211.96, 7.36)), class = "data.frame", row.names = c(NA,
-20L))
Tidyverse answer.
library("tidyr")
library("dplyr")
data_total |>
pivot_longer(!1:2, values_transform=list(value=as.character)) |>
group_by(across(1:3)) |>
summarize(
non_detected=sum(str_detect(value, "^<")),
completed=nrow(data_total) - non_detected,
.groups="drop")
Results.
# A tibble: 16 × 5
Cuenca Temporada name non_detected completed
<chr> <chr> <chr> <int> <int>
1 Tumbes Avenida Alcalinidad 0 20
2 Tumbes Avenida Bicarb_ 0 20
3 Tumbes Avenida Carb_ 0 20
4 Tumbes Avenida Cloruros 0 20
5 Tumbes Avenida Fluoruros 0 20
6 Tumbes Avenida Nitratos 3 17
7 Tumbes Avenida Nitratos_N 3 17
8 Tumbes Avenida Sulfatos 0 20
9 Zarumilla Avenida Alcalinidad 0 20
10 Zarumilla Avenida Bicarb_ 0 20
11 Zarumilla Avenida Carb_ 0 20
12 Zarumilla Avenida Cloruros 0 20
13 Zarumilla Avenida Fluoruros 0 20
14 Zarumilla Avenida Nitratos 2 18
15 Zarumilla Avenida Nitratos_N 2 18
16 Zarumilla Avenida Sulfatos 0 20