what does bracket [] do when assigning lapply output

Time:12-15

I have a variable v as follows:

> head(v)
    C1   C2   C3   C4   C5   C6   C7   C8   C9  C10
1 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
2 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
3 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1
4 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
5 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
6 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2

I want to get rid of the trailing suffix (e.g. ".m1") from each element of the data frame.

When I do lapply, it gives me a list of C1, C2, ...

> lapply(v, function(x) as.numeric(gsub("\\..*", "", x))) %>% str
List of 10
 $ C1 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C2 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C3 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C4 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C5 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C6 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C7 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C8 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C9 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
 $ C10: num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...

However, I want a data frame with the same dimensions, so I do the following, and it works:

> v[]=lapply(v, function(x) as.numeric(gsub("\\..*", "", x)))
> head(v)
  C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1  0  0  0  0  0  0  0  0  0   0
2  0  0  0  0  0  0  0  0  0   0
3  0  0  0  0  0  0  0  0  0   0
4  0  0  0  0  0  0  0  0  0   0
5  0  0  0  0  0  0  0  0  0   0
6  0  0  0  0  0  0  0  0  0   0

Is [] needed to change each element of v? Is there a better way to code this? Thank you.

CodePudding user response:

  • v is a data.frame, which is essentially (but not perfectly) a list where all elements are named and are the same lengths.
  • lapply always returns a list. Period. It doesn't care that its input came from a frame; preserving the input's class is not its job.
  • v = lapply(..) replaces the object reference named "v" with the new object, which is (as stated above) a list. However ...
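  • v[] = lapply(..) replaces the contents of v with the return from lapply(..) without changing the class and attributes of v, so it remains a frame whose columns are the list elements returned by lapply. A similar result can be had with v = data.frame(lapply(..)), though that builds a new frame rather than reusing v: the original row names and other attributes are not carried over, and column names may be altered by check.names.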
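
The difference between the two assignments can be checked on a small example. This is only a sketch: the toy frame and the helper name strip_suffix are made up for illustration and stand in for the asker's 40000-row v.

```r
# Hypothetical toy frame standing in for the asker's v
v <- data.frame(
  C1 = c("0.m2", "0.m1", "0.p1"),
  C2 = c("1.m2", "2.m1", "3.p1"),
  stringsAsFactors = FALSE
)

# Same transformation as in the question: drop everything from the first "."
strip_suffix <- function(x) as.numeric(gsub("\\..*", "", x))

# Plain assignment: the name is bound to whatever lapply returns, a list
as_list <- lapply(v, strip_suffix)
class(as_list)

# Empty-bracket assignment: the columns are replaced, the frame is kept
as_frame <- v
as_frame[] <- lapply(v, strip_suffix)
class(as_frame)
dim(as_frame)
```

Here class(as_list) reports "list", while as_frame is still a data.frame with the original 3 x 2 shape and numeric columns.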

CodePudding user response:

Since you are asking what other, possibly better, method there would be, here's a tidyverse solution:

library(tidyverse)  
df %>%
   mutate(across(everything(), ~as.numeric(str_extract(., "\\d+"))))
  C1 C2 C3
1  0  0  0
2  0  0  0
3  0  0  0

Instead of gsub (or, better, sub, since there is only a single match per string), as in your solution (which of course works too but is slightly more verbose), we use str_extract here to extract the first run of digits in each string.
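
For comparison, the sub variant mentioned above could look like this in base R. It is a sketch only, with a small made-up frame named toy so that it runs on its own.

```r
# Hypothetical stand-in data, defined here so the snippet is self-contained
toy <- data.frame(
  C1 = c("0.m1", "0.p2", "0.p1"),
  C2 = c("0.p1", "0.p0", "0.p1"),
  stringsAsFactors = FALSE
)

# sub() replaces only the first match, which is enough here because the
# pattern consumes everything from the first "." to the end of the string
toy[] <- lapply(toy, function(x) as.numeric(sub("\\..*", "", x)))
str(toy)
```

The empty brackets on the left keep toy a data frame, as in the question.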

Data:

df <- data.frame(
  C1 = c("0.m1", "0.p2", "0.p1"),
  C2 = c("0.p1", "0.p0", "0.p1"),
  C3 = c("0.m2", "0.p1", "0.p1")
)