I have a data frame v that looks like this:
> head(v)
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
2 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
3 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1
4 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
5 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
6 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
I want to get rid of the trailing suffix (".m1", ".m2", ".p1", ...) in each element of the data frame.
When I do lapply, it gives me a list of C1, C2, ...
> lapply(v, function(x) as.numeric(gsub("\\..*", "", x))) %>% str
List of 10
$ C1 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C2 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C3 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C4 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C5 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C6 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C7 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C8 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C9 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C10: num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
However, I want a data frame with the same dimensions as before, so I do the following, and it works:
> v[]=lapply(v, function(x) as.numeric(gsub("\\..*", "", x)))
> head(v)
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0
Is [] needed to change each element of v? Is there a better way to code this? Thank you.
CodePudding user response:
v is a data.frame, which is essentially (but not perfectly) a list where all elements are named and have the same length. lapply always returns a list. Period. It doesn't care that its input was a data frame; preserving the input's class is not its intent.
v = lapply(..) replaces the object that the name "v" refers to with the new object, which is (as stated above) a list.
However, v[] = lapply(..) replaces the contents of v with the return value of lapply(..) without changing the class and attributes of v, so v remains a data frame holding the list contents returned by lapply.
Note that the same effect can be had with v = data.frame(lapply(..)).
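The difference between the three assignments can be seen on a small example. This is only a sketch: the three-row data frame below is a made-up stand-in for the real v, and strip_suffix is a hypothetical helper name, not something from the question.

```r
# Made-up stand-in for v (the real v has 40000 rows and 10 columns)
v <- data.frame(C1 = c("0.m2", "0.m1", "0.p1"),
                C2 = c("0.m2", "0.m1", "0.p1"))

# Hypothetical helper: drop everything from the first dot onwards, then convert
strip_suffix <- function(x) as.numeric(sub("\\..*", "", x))

# 1) Plain lapply: the result is a list, not a data frame
as_list <- lapply(v, strip_suffix)

# 2) Rebuild a new data frame from that list
rebuilt <- data.frame(lapply(v, strip_suffix))

# 3) Replace only the contents; class and attributes of the frame are kept
in_place <- v
in_place[] <- lapply(in_place, strip_suffix)

class(as_list)   # "list"
class(rebuilt)   # "data.frame"
class(in_place)  # "data.frame"
dim(in_place)    # same dimensions as v
```

So the [] is not needed to change the elements themselves; it is what keeps the result a data frame instead of a bare list.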
CodePudding user response:
Since you are asking whether there is another, possibly better, method, here's a tidyverse solution:
library(tidyverse)
df %>%
  mutate(across(everything(), ~as.numeric(str_extract(., "\\d+"))))
C1 C2 C3
1 0 0 0
2 0 0 0
3 0 0 0
Instead of removing the suffix with gsub as in your solution (sub would be the better choice there, since there is only a single match per string), this uses str_extract to pull out the digit(s) at the start of each string. Your approach works too; it is just slightly more verbose.
Data:
df <- data.frame(
C1 = c("0.m1", "0.p2", "0.p1"),
C2 = c("0.p1", "0.p0", "0.p1"),
C3 = c("0.m2", "0.p1", "0.p1")
)
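For comparison, the base R route with sub can be sketched on the same sample data. This is an illustration under the assumption that every value has the form digits, dot, suffix, as in the data above; it is not a claim about which approach is faster.

```r
df <- data.frame(
  C1 = c("0.m1", "0.p2", "0.p1"),
  C2 = c("0.p1", "0.p0", "0.p1"),
  C3 = c("0.m2", "0.p1", "0.p1")
)

# The pattern "\\..*" matches from the first dot to the end of the string,
# so there is at most one match per string and sub() and gsub() agree.
with_sub  <- sub("\\..*", "", df$C1)
with_gsub <- gsub("\\..*", "", df$C1)
identical(with_sub, with_gsub)  # TRUE

# Apply to every column while keeping df a data frame
df[] <- lapply(df, function(x) as.numeric(sub("\\..*", "", x)))
str(df)
```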