I have a dataframe A with the following columns
SN Sample1 Sample2
Sample1 and Sample2 each contain either numeric values or some text to denote that no sampling was possible.
I need to keep any row that has at least one numeric value.
My idea was to filter out rows that have no numeric values.
I normally use this: A[!is.na(as.numeric(A$Sample1)), ]
but this only looks at one of the columns.
I need help writing this so that it checks both Sample1 and Sample2.
Basically, what I need done is
Sample 1 text Sample 2 text #then remove
Sample 1 numeric Sample 2 numeric #then keep
Sample 1 numeric Sample 2 text #then keep
Sample 1 text Sample 2 numeric #then keep
CodePudding user response:
In base R, you can use grepl to search for digits, then create two logicals and index with the "or" operator, |:
df[grepl("^-?\\d+\\.?\\d*$", df$a) | grepl("^-?\\d+\\.?\\d*$", df$b), ] #thanks @zephyryl
# a b
# 1 1 A
# 2 2 B
Original Sample Data:
df <- data.frame(a = c(1, 2, "Abc"),
                 b = c(LETTERS[1:3]))
# a b
# 1 1 A
# 2 2 B
# 3 Abc C
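One caveat with this pattern: it only accepts plain decimals, so strings such as "1e5" or ".5" are treated as text even though as.numeric() can parse them. A minimal sketch comparing the two checks, using a made-up vector x that is not from the question:

```r
# Hypothetical test values, not taken from the question
x <- c("1", "-2.5", "1e5", ".5", "Abc", NA)

# Regex check: optional minus sign, digits, optional decimal part
by_regex <- grepl("^-?\\d+\\.?\\d*$", x)
# TRUE TRUE FALSE FALSE FALSE FALSE

# Coercion check: anything as.numeric() can parse counts as numeric
by_coercion <- !is.na(suppressWarnings(as.numeric(x)))
# TRUE TRUE TRUE TRUE FALSE FALSE

data.frame(x, by_regex, by_coercion)
```

If the data can contain values in scientific notation or with a leading decimal point, the coercion check may be the safer of the two.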
CodePudding user response:
You could make your existing code into a function (also suppressing the "NAs introduced by coercion" warning and handling NAs in the original vector), then apply it rowwise using apply():
is_coercible_numeric <- function(x) {
  is.na(x) | !is.na(suppressWarnings(as.numeric(x)))
}
A[apply(A, 1, \(col) any(is_coercible_numeric(col))), ]
# Sample1 Sample2
# 2 2 2
# 3 3 text3
# 4 text4 4
Or using dplyr:
library(dplyr)
A %>%
  filter(if_any(Sample1:Sample2, is_coercible_numeric))
# Sample1 Sample2
# 1 2 2
# 2 3 text3
# 3 text4 4
Example data:
A <- data.frame(
  Sample1 = c("text1", 2, 3, "text4"),
  Sample2 = c("text1", 2, "text3", 4)
)
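For just a few columns, the row-wise apply() can be avoided by checking each column as a whole vector and combining the results with |. A minimal sketch, assuming the same example data and assuming that an original NA should count as non-numeric (unlike is_coercible_numeric); the names is_numeric_like, cols, keep, and result are made up for illustration:

```r
A <- data.frame(
  Sample1 = c("text1", 2, 3, "text4"),
  Sample2 = c("text1", 2, "text3", 4),
  stringsAsFactors = FALSE  # keep strings as character in R < 4.0
)

# TRUE where a value can be coerced to a number; NA and text give FALSE
is_numeric_like <- function(x) {
  !is.na(suppressWarnings(as.numeric(as.character(x))))
}

cols <- c("Sample1", "Sample2")

# One logical vector per column, combined element-wise with "or"
keep <- Reduce(`|`, lapply(A[cols], is_numeric_like))

result <- A[keep, ]
result
```

With two columns this is equivalent to writing out the two !is.na(as.numeric(...)) checks and joining them with |; the Reduce() form just scales to more columns by extending cols.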