I have a dataset with ~2500 columns in R, and I am trying to find the minimum value greater than zero from the entire data frame. Once I have found this number, I want to try adding it to all other data in the data frame.
Here is an example dataset.
df <- as.data.frame(rbind(c(0.01,0.4,0.5,5,27,0.22,25,0,0,0.3),c(0.25,0,0,2.3,3.6,6,0,0.001,0.021,0),c(22,23,0,40,0.53,0.2,0.4,0.44,0,0),c(0.1,0,0.12,0.56,0.7,13,0,0,3,4)))
As you can see, the minimum value greater than zero in this data frame is 0.001.
I have tried the following:
min<- min(df=!0)
Which gives me 1.
I then want to add this number (0.001) to all other values in the data frame. However, I have no idea how to do this.
Thank you for your time considering my question. Any help would be greatly appreciated.
CodePudding user response:
Your syntax df=!0 is incorrect. R reads it as an argument named df with the value !0, which is TRUE, so min returns 1. If you want to test whether df is not equal to zero, the correct way is df!=0. However, that comparison returns a logical matrix, so the result of min(df != 0) will be 0, because at least one entry is FALSE.
Therefore, you should use the logical output to index your original df.
min(df[df != 0])
[1] 0.001
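To see why the two attempts discussed above return 1 and 0 rather than 0.001, here is a small sketch run on the example data from the question. It only uses base R; the comments describe what each call is actually evaluating.

```r
df <- as.data.frame(rbind(c(0.01,0.4,0.5,5,27,0.22,25,0,0,0.3),
                          c(0.25,0,0,2.3,3.6,6,0,0.001,0.021,0),
                          c(22,23,0,40,0.53,0.2,0.4,0.44,0,0),
                          c(0.1,0,0.12,0.56,0.7,13,0,0,3,4)))

# `df = !0` is parsed as a named argument whose value is !0, i.e. TRUE.
# The data frame itself is never looked at, so this is min(TRUE).
attempt_named_arg <- min(df = !0)

# `df != 0` is a logical matrix; min() treats FALSE as 0 and TRUE as 1.
attempt_logical <- min(df != 0)

# Indexing with the logical matrix keeps only the nonzero values.
nonzero_min <- min(df[df != 0])
```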
Since your question asks for the minimum value greater than zero, using df > 0 is more accurate.
min(df[df > 0])
[1] 0.001
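The question also asks how to add this number to every value in the data frame. A minimal sketch, assuming every column is numeric (as in the example data); the names min_pos and df_shifted are only illustrative and do not come from the original post.

```r
df <- as.data.frame(rbind(c(0.01,0.4,0.5,5,27,0.22,25,0,0,0.3),
                          c(0.25,0,0,2.3,3.6,6,0,0.001,0.021,0),
                          c(22,23,0,40,0.53,0.2,0.4,0.44,0,0),
                          c(0.1,0,0.12,0.56,0.7,13,0,0,3,4)))

# Smallest strictly positive value in the whole data frame.
# Stored under a name other than `min` so the base function is not shadowed.
min_pos <- min(df[df > 0])

# Arithmetic on a numeric data frame is element-wise,
# so this adds min_pos to every cell and returns a data frame.
df_shifted <- df + min_pos
```

After this, the former zeros in df_shifted are equal to min_pos, and the dimensions are unchanged.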
CodePudding user response:
Another way is to unlist the data frame, and then find the minimum of its nonzero elements:
nonzero <- which(df != 0)
min(unlist(df)[nonzero])
[1] 0.001
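A slightly shorter variant of the same idea, sketched under the assumption that all columns are numeric: unlist once, then filter the resulting vector directly, so the index and the values come from the same object. The names values and min_pos are illustrative.

```r
df <- as.data.frame(rbind(c(0.01,0.4,0.5,5,27,0.22,25,0,0,0.3),
                          c(0.25,0,0,2.3,3.6,6,0,0.001,0.021,0),
                          c(22,23,0,40,0.53,0.2,0.4,0.44,0,0),
                          c(0.1,0,0.12,0.56,0.7,13,0,0,3,4)))

# Flatten the data frame into one numeric vector (column by column).
values <- unlist(df, use.names = FALSE)

# Keep only strictly positive entries and take their minimum.
min_pos <- min(values[values > 0])
```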
CodePudding user response:
Here is another option using tidyverse with min. We can convert any zero values to NA using map, then flatten the data to a vector, then take the minimum value, excluding NA.
library(tidyverse)
map(df, na_if, 0) %>%
  flatten_dbl() %>%
  min(na.rm = TRUE)
# [1] 0.001