How to log(x + 1) a data frame or matrix in R

I am new to coding and am doing some gene expression analysis. I have a very naïve question. I have a few gene expression data frames with gene names as rows and cell names as columns. I want to log2 transform the data, but I am confused between log and log + 1. How do I perform a log2 + 1 (log(x + 1)) transformation of a data frame in R? Is it the same as a log2 transformation? Should I do t = log(v + 1)? Any help will be appreciated.

CodePudding user response：

For example, take this dummy data:

dummy <- data.frame(
  x = c(1,2,3,4,5),
  y = c(2,3,4,5,6)
)
dummy 
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6

If you just want to log2 transform the data, use log(., base = 2), like

log(dummy, base = 2)
         x        y
1 0.000000 1.000000
2 1.000000 1.584963
3 1.584963 2.000000
4 2.000000 2.321928
5 2.321928 2.584963

If you want log2(x + 1), then use log(dummy + 1, base = 2); if you want log2(x) + 1, use log(dummy, base = 2) + 1.
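To connect this to the gene expression setting in the question, here is a minimal sketch on a made-up table. The object names (expr, expr_log) and the values are hypothetical, and it assumes every column is numeric, with gene names stored as row names rather than in a column.

```r
# Hypothetical toy expression table: genes as row names, cells as columns.
expr <- data.frame(
  cell1 = c(0, 1, 3),
  cell2 = c(7, 15, 0),
  row.names = c("geneA", "geneB", "geneC")
)

# Adding 1 before taking the log keeps zero counts finite:
# log2(0 + 1) is 0, whereas log2(0) is -Inf.
expr_log <- log2(expr + 1)
expr_log

# The result is still a data frame with the same row and column names.
dim(expr_log)
```

This is why log2(x + 1) is not the same as log2(x): the two differ for every value, and only the former is defined (as 0) when a count is 0.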

CodePudding user response：

Park's answer gives the simplest way to log transform a numeric-only data.frame, but log(x + 1, base = b) is a different problem.

`log(x + 1)`

If the transformation is y <- log(x + 1) (could be base 2), then beware of floating-point issues. For very small values of abs(x), the results of log(x + 1, base = b) are unreliable.

x <- seq(.Machine$double.eps, .Machine$double.eps^0.5, length.out = 10)
eq <- log(x + 1) == log1p(x)
eq
#[1]  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE
which(eq)
#[1]  1  4  7 10
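A more extreme input makes the problem easier to see. The sketch below uses a value far smaller than .Machine$double.eps; the variable name tiny is just an illustration and is not part of the example above.

```r
# A value much smaller than .Machine$double.eps (about 2.2e-16).
tiny <- 1e-20

# tiny + 1 rounds to exactly 1 in double precision, so the information
# in tiny is lost before the logarithm is even taken.
(tiny + 1) == 1
# TRUE

naive    <- log(tiny + 1)   # 0
accurate <- log1p(tiny)     # 1e-20, the correct value to double precision

c(naive = naive, accurate = accurate)
```

For typical expression counts (0, 1, 2, ...) this loss does not occur; it matters when values very close to 0 are present.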

This is why base R has a function log1p. To compute log(x + 1, base = 2) or, equivalently, log2(x + 1), use

log2p1 <- function(x) log1p(x)/log(2)

eq2 <- log2(x + 1) == log2p1(x)

eq2
# [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE
which(eq2)
#[1]  7 10

In both cases, the difference between log(x + 1) and the numerically more accurate version is smaller in absolute value than .Machine$double.eps.

abs(log(x + 1) - log1p(x)) < .Machine$double.eps
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
abs(log2(x + 1) - log2p1(x)) < .Machine$double.eps
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
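Since the question is about data frames and matrices, here is a sketch of applying the same helper to both. It redefines log2p1 so the snippet runs on its own; the counts table is hypothetical, and all columns are assumed to be numeric.

```r
# Same helper as above, repeated so this snippet is self-contained.
log2p1 <- function(x) log1p(x) / log(2)

# Hypothetical count table: genes as row names, cells as columns.
counts <- data.frame(
  cell1 = c(0, 1, 3),
  cell2 = c(7, 15, 0),
  row.names = c("geneA", "geneB", "geneC")
)

# log1p() works column-wise on an all-numeric data frame,
# so the helper returns a data frame of the same shape.
counts_log_df <- log2p1(counts)

# It also works element-wise on a numeric matrix.
counts_log_mat <- log2p1(as.matrix(counts))

counts_log_df
counts_log_mat
```

If the data frame contains a non-numeric column (for example, gene names stored as a column instead of row names), that column would need to be moved to the row names or dropped first.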