I need to format the numeric columns of a data frame so that scientific notation is shown only when a number is less than 0.0001. I wrote the following code using the format function. The problem is that it transforms all the numbers, not just the small ones.
Any suggestions?
col1 <- c(0.00002, 0.0001, 0.5689785541122558)
col2 <- c(3.5, 45.6546548788, 12585.5663)
tab <- cbind(col1, col2)
tab <- as.data.frame(tab)
format(tab[1], digit = 1, nsmall = 3)
CodePudding user response:
1) dplyr Define a vectorized format and use that in mutate/across:
formatv <- function(x, ...) {
mapply(format, x, scientific = abs(x) < 0.0001, ...)
}
library(dplyr)
tab %>% mutate(across(everything(), \(x) formatv(x, digits = 1, nsmall = 3)))
2) Base R Or, using only base R (formatv is defined above):
replace(tab, TRUE, lapply(tab, formatv, digits = 1, nsmall = 3))
or
replace(tab, TRUE, formatv(as.matrix(tab), digits = 1, nsmall = 3))
or if you have a small number of columns do each individually
transform(tab,
col1 = formatv(col1, digits = 1, nsmall = 3),
col2 = formatv(col2, digits = 1, nsmall = 3))
3) collapse formatv is from above.
library(collapse)
ftransformv(tab, names(tab), formatv, digit = 1, nsmall = 3)
4) purrr map_dfc in purrr can be used. formatv is from above.
library(purrr)
tab %>% map_dfc(formatv, digit = 1, nsmall = 3)
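All four variants rely on the same formatv helper, so it can be useful to check it on a single column first. The following is a minimal sketch in base R, assuming the col1 values from the question; the variable name res is only illustrative.

```r
# Values from the question
col1 <- c(0.00002, 0.0001, 0.5689785541122558)

# formatv as defined above: mapply() calls format() once per element,
# switching scientific notation on only where abs(x) < 0.0001
formatv <- function(x, ...) {
  mapply(format, x, scientific = abs(x) < 0.0001, ...)
}

res <- formatv(col1, digits = 1, nsmall = 3)

# Only the first value is below the threshold, so only it
# should carry an exponent
grepl("e", res, fixed = TRUE)
# [1]  TRUE FALSE FALSE
```

Because the comparison is strict, 0.0001 itself stays in fixed notation; use <= in formatv if the boundary value should be switched as well.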
CodePudding user response:
You could use apply on both margins, 1:2.
as.data.frame(apply(tab, 1:2, \(x) format(x, digits=1, nsmall=3)))
# col1 col2
# 1 2e-05 3.500
# 2 1e-04 45.655
# 3 0.569 12585.566
Or if you want to format just one specific column:
transform(tab, col1=sapply(col1, format, digits=1, nsmall=3))
# col1 col2
# 1 2e-05 3.50000
# 2 1e-04 45.65465
# 3 0.569 12585.56630
The important point is that each element is formatted individually.
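To see why element-wise formatting matters, here is a small sketch in base R. It assumes two of the values from the question; the names whole and each are only illustrative. When format() receives a whole vector it chooses one common representation for all elements, whereas calling it once per element lets each value get its own.

```r
x <- c(0.00002, 0.5689785541122558)

# Whole vector at once: format() picks a single common representation,
# so the small value pulls the other one into scientific notation too
whole <- format(x, digits = 1, nsmall = 3)

# One call per element: each value is formatted on its own
each <- sapply(x, format, digits = 1, nsmall = 3)

all(grepl("e", whole, fixed = TRUE))
# [1] TRUE
grepl("e", each, fixed = TRUE)
# [1]  TRUE FALSE
```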
Here is another way, using replace.
tab |>
round(5) |>
(\(.) replace(., . < 1e-4, format(.[. < 1e-4], digit=1, nsmall=3)))()
# col1 col2
# 1 2e-05 3.50000
# 2 1e-04 45.65465
# 3 0.56898 12585.56630
CodePudding user response:
lapply(tab, \(x) ifelse(abs(x) < 0.0001, format(x, scientific=TRUE), format(x, scientific=FALSE)))
# $col1
# [1] "2.000000e-05" "0.0001000" "0.5689786"
# $col2
# [1] " 3.50000" " 45.65465" "12585.56630"
You can reassign back into the frame if you'd like with tab[] <- lapply(tab, ...). Note that all columns are now character, not numeric.
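Since the formatted columns are character, one option is to keep the numeric frame untouched and write the formatted values into a copy used only for display. This is a sketch under that assumption; the name shown is hypothetical and not part of the original answer.

```r
tab <- data.frame(col1 = c(0.00002, 0.0001, 0.5689785541122558),
                  col2 = c(3.5, 45.6546548788, 12585.5663))

# Copy for display; tab itself stays numeric for further computation
shown <- tab
shown[] <- lapply(tab, function(x) {
  ifelse(abs(x) < 0.0001,
         format(x, scientific = TRUE),
         format(x, scientific = FALSE))
})

sapply(tab, class)    # both columns are still "numeric"
sapply(shown, class)  # both columns are now "character"
```

Assigning with shown[] keeps the data frame structure and column names while replacing the contents.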
This can be done perhaps slightly more efficiently by working on the matrix, so there is no need for lapply:
tab <- cbind(col1, col2)
ifelse(tab < 0.0001,
format(tab, digit=1, nsmall=3),
format(tab, digit=1, nsmall=3, scientific=FALSE))
# col1 col2
# [1,] "2e-05" " 3.50000"
# [2,] " 0.00010" " 45.65465"
# [3,] " 0.56898" "12585.56630"
which can then be converted into a frame.
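The conversion back to a frame can be sketched as follows, assuming the same col1 and col2 as in the question. The names m, out, and df are illustrative, and the trimws step is an optional addition that removes the padding format() inserts to align the values.

```r
col1 <- c(0.00002, 0.0001, 0.5689785541122558)
col2 <- c(3.5, 45.6546548788, 12585.5663)
m <- cbind(col1, col2)

# ifelse() keeps the shape and dimnames of its test argument,
# so the result is still a 3 x 2 character matrix
out <- ifelse(m < 0.0001,
              format(m, digits = 1, nsmall = 3),
              format(m, digits = 1, nsmall = 3, scientific = FALSE))

# Convert to a data frame and strip the alignment padding
df <- as.data.frame(out)
df[] <- lapply(df, trimws)
```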