For each row, I want to fill every cell with "-3" from certain columns (B and E in this example) through the last column whenever one of those columns contains "-3" in that row. I figured out a solution, but it is extremely slow on my original dataset (2435 x 431 cells), which has 15 columns to check for "-3".
In this example, the rows to fill with "-3" are 4 and 10, starting from column "B", and 3, starting from column "E". Note that rows 4 and 10 also contain "-3" in column "E", but they were already filled when column "B" was processed.
library(tidyverse)
values <- as.character(-3:3)
set.seed(123)
data <- tibble(
  A = sample(values, 10, replace = TRUE),
  B = sample(values, 10, replace = TRUE),
  C = sample(values, 10, replace = TRUE),
  D = sample(values, 10, replace = TRUE),
  E = sample(values, 10, replace = TRUE),
  F = sample(values, 10, replace = TRUE)
)
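fill_minus_three <- function(x){
  # Walk the row left to right; at the first "-3" found in B or E,
  # overwrite everything from that column to the end of the row
  for (i in seq_along(x)){
    if ((names(x)[i] %in% c("B", "E")) && x[i] == "-3"){
      x[i:length(x)] <- "-3"
      break
    }
  }
  return(x)
}
t(apply(data, 1, fill_minus_three)) %>%
  as_tibble()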
#> # A tibble: 10 x 6
#> A B C D E F
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3 0 0 -1 1 1
#> 2 3 2 -3 0 3 -1
#> 3 -1 2 -3 2 -3 -3
#> 4 2 -3 -3 -3 -3 -3
#> 5 -1 -2 -1 -1 -2 -2
#> 6 -2 -1 -2 3 3 1
#> 7 -2 1 3 1 -1 1
#> 8 2 -1 -2 0 0 0
#> 9 -1 -1 -3 3 1 3
#> 10 1 -3 -3 -3 -3 -3
In addition, I would like to use the map_* family, since the rest of my scripts follow the tidyverse approach (however, this is optional).
CodePudding user response:
If I understand correctly, you're trying to change values in multiple columns according to the values in columns B and E.
There is no need for for loops or map/apply functions; you can just use mutate and pair it with across:
library(dplyr)
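data |>
  # If B is "-3", overwrite every column to its right (C through F);
  # if E is "-3", overwrite the one column to its right (F)
  mutate(across(C:F, ~ if_else(B == "-3", "-3", .x)),
         F = if_else(E == "-3", "-3", F))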
Output
#> # A tibble: 10 × 6
#> A B C D E F
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3 0 0 -1 1 1
#> 2 3 2 -3 0 3 -1
#> 3 -1 2 -3 2 -3 -3
#> 4 2 -3 -3 -3 -3 -3
#> 5 -1 -2 -1 -1 -2 -2
#> 6 -2 -1 -2 3 3 1
#> 7 -2 1 3 1 -1 1
#> 8 2 -1 -2 0 0 0
#> 9 -1 -1 -3 3 1 3
#> 10 1 -3 -3 -3 -3 -3
Created on 2022-06-02 by the reprex package (v2.0.1)
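With 15 columns to check, writing one mutate call per column by hand gets tedious. The sketch below loops the same mutate/across idea over a vector of column names. The helper name fill_right_of and its arguments are made up for illustration and are not part of dplyr. It assumes every column is character and that the data has no columns named is_hit or fill_value.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): for each trigger column, overwrite
# every column to its right in the rows where the trigger equals `fill_value`.
# Assumes character columns and no data columns named is_hit or fill_value.
fill_right_of <- function(df, trigger_cols, fill_value = "-3") {
  for (trigger in trigger_cols) {
    pos <- match(trigger, names(df))
    stopifnot(!is.na(pos))
    if (pos == ncol(df)) next  # nothing to the right of the last column
    right_cols <- names(df)[(pos + 1):ncol(df)]
    # %in% treats NA as "no hit" instead of propagating NA
    is_hit <- df[[trigger]] %in% fill_value
    df <- mutate(df, across(all_of(right_cols),
                            ~ if_else(is_hit, fill_value, .x)))
  }
  df
}
```

For the example data this would be called as fill_right_of(data, c("B", "E")). The order of the trigger columns should not matter, because a row filled from B already holds "-3" in E by the time E is checked.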
CodePudding user response:
I gave it some thought and realized that I do not need to iterate over all columns, only the ones I need to check for "-3". So I changed the function a bit, and it now runs much faster (0.3 s on the entire dataset, whereas the previous version took two minutes). Still, it is not tidy-friendly. :/
positions <- which(names(data) %in% c("B", "E"))
fill_minus_three <- function(x, positions){
  for (i in positions){
    if (x[i] == "-3"){
      x[i:length(x)] <- "-3"
      break
    }
  }
  return(x)
}
t(apply(data, 1, function(x) fill_minus_three(x, positions))) %>%
  as_tibble()