I have the following example:
structure(list(value = c("./LRoot_1/LClass_copepodo", "./LRoot_1/LClass_shadow",
"./LRoot_2/LClass_bolha", "./LRoot_2/LClass_cladocera")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L))
I would like to separate this into two columns: the first column with the name that sits between the slashes "/", and the second column with the name after "LClass_".
Thanks all
CodePudding user response:
We could use extract() from tidyr and capture, with parentheses ((...)), the substrings we need while dropping the rest of the characters. The regex below matches a literal dot (written \\. because an unescaped . is a metacharacter that matches any character), followed by the /, then captures one or more (+) characters that are not a / ([^/]+) as the first capture group, followed by the literal 'LClass_' substring, and finally captures the rest of the characters (.*) as the second capture group.
library(tidyr)
extract(df1, value, into = c("first", "second"),
        "\\./([^/]+)/LClass_(.*)", remove = FALSE)
-output
# A tibble: 4 × 3
  value                      first   second
  <chr>                      <chr>   <chr>
1 ./LRoot_1/LClass_copepodo  LRoot_1 copepodo
2 ./LRoot_1/LClass_shadow    LRoot_1 shadow
3 ./LRoot_2/LClass_bolha     LRoot_2 bolha
4 ./LRoot_2/LClass_cladocera LRoot_2 cladocera
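To see what the two capture groups pick up without tidyr, the same pattern can be run through base R. This is only a sketch: the vector name x and the result name out are made up for illustration, and the input is copied from the question.

```r
# Same strings as in the question
x <- c("./LRoot_1/LClass_copepodo", "./LRoot_1/LClass_shadow",
       "./LRoot_2/LClass_bolha", "./LRoot_2/LClass_cladocera")

# regexec() + regmatches() return, for each string, the full match
# followed by one element per capture group
m <- regmatches(x, regexec("\\./([^/]+)/LClass_(.*)", x))

# Drop the full match (element 1) and keep the two groups (elements 2 and 3)
parts <- do.call(rbind, lapply(m, `[`, 2:3))

out <- data.frame(value = x, first = parts[, 1], second = parts[, 2],
                  stringsAsFactors = FALSE)
out
```

Here first should hold the LRoot_* part and second the name after LClass_, mirroring what extract() puts into its new columns.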
CodePudding user response:
Here is an alternative option using separate() and word():
library(dplyr)
library(tidyr)
library(stringr)
df %>%
  separate(col = value, into = c("first", "second"), sep = "/(?=LClass)") %>%
  mutate(first = word(first, 2, sep = "/"),
         second = word(second, 2, sep = "_"))
# A tibble: 4 x 2
  first   second
  <chr>   <chr>
1 LRoot_1 copepodo
2 LRoot_1 shadow
3 LRoot_2 bolha
4 LRoot_2 cladocera
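The sep pattern "/(?=LClass)" uses a lookahead, so it only splits at a slash that is immediately followed by LClass and leaves the earlier slash alone. A rough base R sketch of the same idea is shown here; x, pieces, first and second are illustrative names, and strsplit() needs perl = TRUE for the lookahead to be understood.

```r
x <- c("./LRoot_1/LClass_copepodo", "./LRoot_1/LClass_shadow",
       "./LRoot_2/LClass_bolha", "./LRoot_2/LClass_cladocera")

# Split only at the "/" that directly precedes "LClass"
pieces <- strsplit(x, "/(?=LClass)", perl = TRUE)

# Second "/"-separated piece of the left part, e.g. "./LRoot_1" -> "LRoot_1"
first <- vapply(pieces,
                function(p) strsplit(p[1], "/", fixed = TRUE)[[1]][2],
                character(1))

# Second "_"-separated piece of the right part, e.g. "LClass_copepodo" -> "copepodo"
second <- vapply(pieces,
                 function(p) strsplit(p[2], "_", fixed = TRUE)[[1]][2],
                 character(1))

data.frame(first, second)
```

One caveat of taking only the second "_"-separated piece: a class name that itself contains an underscore would be cut at that underscore. That does not happen with the sample data, but it is worth checking on the real data.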
CodePudding user response:
You can separate() the column at / and then remove the pattern LClass_ via mutate():
library(dplyr)
library(tidyr)
DF %>%
  separate(col = value, into = c(NA, 'a', 'b'), sep = '/', remove = FALSE) %>%
  mutate(
    b = gsub(pattern = 'LClass_', replacement = '', x = b)
  )
#> # A tibble: 4 × 3
#>   value                      a       b
#>   <chr>                      <chr>   <chr>
#> 1 ./LRoot_1/LClass_copepodo  LRoot_1 copepodo
#> 2 ./LRoot_1/LClass_shadow    LRoot_1 shadow
#> 3 ./LRoot_2/LClass_bolha     LRoot_2 bolha
#> 4 ./LRoot_2/LClass_cladocera LRoot_2 cladocera
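If a recent tidyr is installed (1.3.0 or later, which is an assumption about the setup), the same split-then-clean idea can also be written with separate_wider_delim(), which the tidyr documentation recommends over separate() in newer versions. This is a sketch rather than a drop-in replacement: DF is rebuilt from the question's data, and res is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Data from the question
DF <- tibble(value = c("./LRoot_1/LClass_copepodo", "./LRoot_1/LClass_shadow",
                       "./LRoot_2/LClass_bolha", "./LRoot_2/LClass_cladocera"))

res <- DF %>%
  # NA in names drops the leading "." piece; cols_remove = FALSE keeps value
  separate_wider_delim(value, delim = "/", names = c(NA, "a", "b"),
                       cols_remove = FALSE) %>%
  # fixed = TRUE treats "LClass_" as a literal string, not a regex
  mutate(b = sub("LClass_", "", b, fixed = TRUE))

res
```

Using sub() with fixed = TRUE removes only the first literal occurrence of LClass_, which is enough here because the prefix appears once per value.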