I have a classification problem I need to solve using R, but honestly I have no idea how to do it.
I have a table (see below) where different samples are classified by three ML models (one per column), and I need to choose the "most voted" category for each case and write it to a new column.
Current table
Desired Output
I have been reading about categorical variables in R, but nothing seems to fit my specific needs.
Any help would be highly appreciated.
Thanks in advance.
JL
CodePudding user response:
This is not how you ask a question. Please see the relevant thread, and in the future provide the data in the form shown below (run dput() on your data frame,
then copy and paste the result from the console). At any rate, here is a base R solution:
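# Calculate the modal value per row: mode => character vector
# Only the three prediction columns vote (the data below already contains a mode column)
pred_cols <- c("RFpred", "SVMpred", "KNNpred")
df1$mode <- apply(
  df1[, pred_cols],
  1,
  function(x) {
    # Tabulate the labels in this row, sort by frequency, keep the top label
    head(
      names(
        sort(
          table(x),
          decreasing = TRUE
        )
      ),
      1
    )
  }
)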
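With three models and four possible labels, all three predictions in a row can differ. In that case there is no majority, and head(..., 1) simply returns whichever tied label ends up first after sorting. A minimal sketch, assuming you would rather flag such rows as NA; majority_vote is a hypothetical helper name and the small data frame is made up for illustration, not taken from the question:

```r
# Hypothetical helper: most frequent label in x, or NA when the top count is tied
majority_vote <- function(x) {
  counts <- table(x)                             # frequency of each label
  top <- names(counts)[counts == max(counts)]    # all labels sharing the top count
  if (length(top) == 1) top else NA_character_
}

# Made-up example rows: clear majority, three-way tie, clear majority
preds <- data.frame(
  RFpred  = c("Carrier", "Absent", "Helper"),
  SVMpred = c("Absent", "Helper", "Helper"),
  KNNpred = c("Carrier", "Resistant", "Carrier"),
  stringsAsFactors = FALSE
)

preds$vote <- apply(
  preds[, c("RFpred", "SVMpred", "KNNpred")],
  1,
  majority_vote
)
preds$vote
```

Rows flagged NA can then be inspected by hand or resolved with whatever rule suits you, for example trusting the model you consider most reliable.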
Data:
df1 <- structure(list(samples = c("S1", "D4", "S2", "D1", "D2", "S3",
"D3", "S4"), RFpred = c("Carrier", "Absent", "Helper", "Helper",
"Carrier", "Absent", "Resistant", "Carrier"), SVMpred = c("Absent",
"Absent", "Helper", "Helper", "Carrier", "Helper", "Helper",
"Resistant"), KNNpred = c("Carrier", "Absent", "Carrier", "Helper",
"Carrier", "Absent", "Helper", "Resistant"), mode = c("Carrier",
"Absent", "Helper", "Helper", "Carrier", "Absent", "Helper",
"Resistant")), row.names = c(NA, -8L), class = "data.frame")
CodePudding user response:
Tidyverse Approach:
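library(dplyr)
library(tibble)

# Most frequent non-missing value of a vector
mode_char <- function(x) {
  ux <- unique(na.omit(x))
  ux[which.max(tabulate(match(x, ux)))]
}

# df holds the question's data: samples plus the three prediction columns
df %>%
  as_tibble() %>%
  rowwise() %>%
  mutate(
    Vote = mode_char(c_across(RFpred:KNNpred))
  )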
#> # A tibble: 8 × 5
#> # Rowwise:
#>   samples RFpred    SVMpred   KNNpred   Vote
#>   <chr>   <chr>     <chr>     <chr>     <chr>
#> 1 S1      Carrier   Absent    Carrier   Carrier
#> 2 D4      Absent    Absent    Absent    Absent
#> 3 S2      Helper    Helper    Carrier   Helper
#> 4 D1      Helper    Helper    Helper    Helper
#> 5 D2      Carrier   Carrier   Carrier   Carrier
#> 6 S3      Absent    Helper    Absent    Absent
#> 7 D3      Resistant Helper    Helper    Helper
#> 8 S4      Carrier   Resistant Resistant Resistant
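It is worth checking how mode_char behaves on rows without a clear winner. The sketch below reuses the function from this answer on a few made-up vectors (the inputs are illustrative, not the poster's data). Because which.max picks the first maximum, the label that appears first in the row wins a tie, and NA predictions are ignored:

```r
# Same helper as in the answer above, with comments added
mode_char <- function(x) {
  ux <- unique(na.omit(x))                 # distinct non-missing labels, in order of appearance
  ux[which.max(tabulate(match(x, ux)))]    # label with the highest count; first one on ties
}

mode_char(c("Carrier", "Absent", "Carrier"))    # clear majority: "Carrier"
mode_char(c("Helper", "Absent", "Resistant"))   # three-way tie: "Helper" (first seen)
mode_char(c(NA, "Absent", "Absent"))            # NA is ignored: "Absent"
```

So with this approach the column order decides ties: if RFpred comes first, its prediction wins whenever all three models disagree.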