I have a datatable in R with a column that contains multiple values separated by a comma and I would like to extract the last value from each row in that column. However, the number of values in the different rows vary. Some times there are 3 or 4 values separated by commas and sometimes there is only one. I always need the last value.
The rows look basically like this.
[1.848]
[0.207,0.403]
[0.3,0.29]
[0.948,3.064]
[0.431,0.431,0.214]
[0.305,0.305,0.30]
[0.213,0.42]
[0.31,0.30]
[1.484,2.433]
That would be my preferred outcome:
1.848
0.403
0.29
3.064
0.214
0.30
0.42
0.30
2.433
Maybe someone as an idea on how to solve that in R.
Thanks a lot!
CodePudding user response:
Here a solution that I use ,
as reference to get the last number.
library(dplyr)
library(stringr)
x <- c("1.848", "0.207,0.403", "0.3,0.29", "0.948,3.064", "0.431,0.431,0.214",
"0.305,0.305,0.30", "0.213,0.42", "0.31,0.30", "1.484,2.433")
df <- tibble(x = x)
last_number <- function(x){
as.numeric(word(x,str_count(x,",") 1,sep = ","))
}
df %>%
mutate(last_number = last_number(x))
# A tibble: 9 x 2
x last_number
<chr> <dbl>
1 1.848 1.85
2 0.207,0.403 0.403
3 0.3,0.29 0.29
4 0.948,3.064 3.06
5 0.431,0.431,0.214 0.214
6 0.305,0.305,0.30 0.3
7 0.213,0.42 0.42
8 0.31,0.30 0.3
9 1.484,2.433 2.43
CodePudding user response:
We may use trimws
from base R
df1$col1 <- as.numeric(trimws(df1$col1, whitespace = ".*,|[][]"))
-output
> df1
col1
1 1.848
2 0.403
3 0.290
4 3.064
5 0.214
6 0.300
7 0.420
8 0.300
9 2.433
data
df1 <- structure(list(col1 = c("[1.848]", "[0.207,0.403]", "[0.3,0.29]",
"[0.948,3.064]", "[0.431,0.431,0.214]", "[0.305,0.305,0.30]",
"[0.213,0.42]", "[0.31,0.30]", "[1.484,2.433]")),
class = "data.frame", row.names = c(NA,
-9L))