I want to convert a data frame like this:
mre <- tibble::tribble(
~folder3, ~folder2, ~folder1,
"V3=4", "V2=1", "V1=0",
"V3=5", "V2=1", "V1=0",
"V3=4", "V2=2", "V1=0",
"V3=5", "V2=2", "V1=0",
"V3=4", "V2=1", "V1=1",
"V3=5", "V2=1", "V1=1",
"V3=4", "V2=2", "V1=1",
"V3=5", "V2=2", "V1=1"
)
to this:
folder3 folder2 folder1 V3 V2 V1
V3=4 V2=1 V1=0 4 1 0
V3=5 V2=1 V1=0 5 1 0
V3=4 V2=2 V1=0 4 2 0
V3=5 V2=2 V1=0 5 2 0
V3=4 V2=1 V1=1 4 1 1
V3=5 V2=1 V1=1 5 1 1
V3=4 V2=2 V1=1 4 2 1
V3=5 V2=2 V1=1 5 2 1
Basically, I want to extract the variable name from each folder? column ("V3", "V2", "V1" here, but they could be any valid names such as "a", "b", "c"), use it as the new column name, and keep the values in place.
I have the following for a single "folder" column by using the first row value:
mre %>%
  tidyr::extract(folder1, into = .$folder1[1] |> stringr::word(1, sep = "="), "\\S+=(\\d+)", remove = FALSE)
But I don't know how to extend this to multiple "folder" columns (their number is not fixed). I tried to use map() following the answers here, but could not figure out how to get the variable names from the first row.
Any suggestions?
CodePudding user response:
Instead of extract(), we can create the new columns within across() itself: mutate() across all the columns (everything()), use str_extract() to get the digits (\\d+) that follow the =, and modify the column names in .names with str_replace().
library(dplyr)
library(stringr)
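mre %>%
  mutate(across(everything(),
                # keep only the digits that come right after "="
                ~ as.numeric(str_extract(., "(?<=\\=)\\d+")),
                # name each new column by swapping "folder" for "V"
                .names = "{str_replace(.col, 'folder', 'V')}"))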
-output
# A tibble: 8 × 6
folder3 folder2 folder1 V3 V2 V1
<chr> <chr> <chr> <dbl> <dbl> <dbl>
1 V3=4 V2=1 V1=0 4 1 0
2 V3=5 V2=1 V1=0 5 1 0
3 V3=4 V2=2 V1=0 4 2 0
4 V3=5 V2=2 V1=0 5 2 0
5 V3=4 V2=1 V1=1 4 1 1
6 V3=5 V2=1 V1=1 5 1 1
7 V3=4 V2=2 V1=1 4 2 1
8 V3=5 V2=2 V1=1 5 2 1
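Note that the across() call above builds the new names from the column names (folder3 becomes V3), which matches this example but not the case where the variable names are arbitrary. A minimal sketch of one way to take the names from the first row of each column instead is below. The data frame mre_any and the names var_names, values and result are made up for illustration, and it assumes every cell has the form name=number with the same name throughout a column.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical example data with arbitrary variable names
mre_any <- tribble(
  ~folder3, ~folder2, ~folder1,
  "a=4",    "b=1",    "c=0",
  "a=5",    "b=2",    "c=1"
)

# Variable name = text before "=" in the first row of each column
var_names <- vapply(mre_any, function(x) str_extract(x[1], "^[^=]+"), character(1))

# Value = text after "=", converted to numeric (assumes numeric values)
values <- lapply(mre_any, function(x) as.numeric(str_remove(x, "^[^=]*=")))
names(values) <- unname(var_names)

# Append the new columns to the original ones
result <- bind_cols(mre_any, as_tibble(values))
names(result)
# "folder3" "folder2" "folder1" "a" "b" "c"
```

Because it loops over whatever columns are present, the number of folder columns does not need to be known in advance.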
CodePudding user response:
A base R option
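cbind(
  mre,
  unclass(
    xtabs(
      # rows = original row id, columns = variable name, cells = value
      V2 ~ id + factor(V1, levels = unique(V1)),
      do.call(
        rbind,
        # split each column on "=" and tag every row with its position
        Map(function(x) cbind(read.table(text = x, sep = "="), id = seq_along(x)), mre)
      )
    )
  )
)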
gives
folder3 folder2 folder1 V3 V2 V1
1 V3=4 V2=1 V1=0 4 1 0
2 V3=5 V2=1 V1=0 5 1 0
3 V3=4 V2=2 V1=0 4 2 0
4 V3=5 V2=2 V1=0 5 2 0
5 V3=4 V2=1 V1=1 4 1 1
6 V3=5 V2=1 V1=1 5 1 1
7 V3=4 V2=2 V1=1 4 2 1
8 V3=5 V2=2 V1=1 5 2 1
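Since the base R answer comes without an explanation, here is a sketch of the same steps unrolled on a small two-column example, so the intermediate results can be inspected. The objects mre_small, long and wide are names introduced here for illustration only. Because xtabs() sums the values, this approach assumes the part after the = is numeric.

```r
# Hypothetical two-row, two-column version of the input
mre_small <- data.frame(
  folder3 = c("V3=4", "V3=5"),
  folder2 = c("V2=1", "V2=1"),
  stringsAsFactors = FALSE
)

# Step 1: split every column on "=" and stack the pieces.
# read.table() names the two parts V1 (variable name) and V2 (value);
# id records the original row position.
long <- do.call(
  rbind,
  Map(function(x) cbind(read.table(text = x, sep = "="), id = seq_along(x)), mre_small)
)

# Step 2: cross-tabulate back to one row per id and one column per variable name.
# factor(..., levels = unique(V1)) keeps the columns in their original order.
wide <- unclass(xtabs(V2 ~ id + factor(V1, levels = unique(V1)), long))

colnames(wide)
# "V3" "V2"
```

The final cbind(mre, ...) in the answer then simply attaches this matrix to the original data frame.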