Test <- tribble(
  ~Date, ~HCl_Konz, ~HCl_Kenn, ~CO_Konz, ~CO_Kenn,
  1, 4, "",  4, "",
  2, 5, "",  1, "",
  3, 2, "X", 6, "BX",
  4, 5, "",  4, "",
  5, 6, "F", 4, "",
  6, 5, "",  9, "EXr"
)
Param <- c("HCl", "CO")
The real tibble is much bigger and has several more column pairs like the HCl and CO ones, but they all follow the same scheme. For each of them I want to set the value of HCl_Konz to NA if the column HCl_Kenn contains at least one of the characters "X" or "F", the same for CO_Konz (if CO_Kenn contains X or F), and likewise for all the other XXX_Konz columns.
I tried the following code:
Test %>%
  rowwise() %>%
  mutate(across(
    paste(Param, "_Konz", sep=""),
    ~ ifelse(str_detect(paste(str_sub(cur_column(),1,-6), "_Kenn", sep=""), "[XF]"), NA_real_, .x)
  ))
The code doesn't throw an error, but the values are not replaced by NA.
tia
CodePudding user response:
- You're missing the `~` to mark the `ifelse(..)` as a function of sorts.
- `cur_col()` not found (for me), should likely be `.` or `.x`.
- You are `str_detect`ing in the name of the `_Kenn`-equivalent column, not the values in that column; we need to add `cur_data()[[..]]` as well.
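To make the last point concrete, here is a minimal base-R sketch of the difference between matching against a column's name and matching against its values. The vector `kenn` is a made-up stand-in for the `CO_Kenn` column, and `name_hit` / `value_hit` are illustrative names only.

```r
# Hypothetical stand-in for the CO_Kenn column from the question
kenn <- c("", "", "BX", "", "", "EXr")

# Matching against the column *name*: a single result for the string itself.
# "CO_Kenn" contains neither an uppercase "X" nor an "F", so this is FALSE,
# which is why no value was ever replaced.
name_hit <- grepl("[XF]", "CO_Kenn")

# Matching against the column *values*: one result per row.
value_hit <- grepl("[XF]", kenn)

print(name_hit)
print(value_hit)
```

With the name-based test the condition is the same for every row, so the `_Konz` values pass through unchanged.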
I tend to not use `stringr` for straight-forward replacements like this, preferring base R:
library(dplyr)
Test %>%
  mutate(
    across(
      paste0(Param, "_Konz"),
      ~ if_else( grepl("[XF]", cur_data()[[ gsub("_Konz", "_Kenn", cur_column()) ]] ),
                 .[NA], . )
    )
  )
# # A tibble: 6 x 5
#    Date HCl_Konz HCl_Kenn CO_Konz CO_Kenn
#   <dbl>    <dbl> <chr>      <dbl> <chr>
# 1     1        4 ""             4 ""
# 2     2        5 ""             1 ""
# 3     3       NA "X"           NA "BX"
# 4     4        5 ""             4 ""
# 5     5       NA "F"            4 ""
# 6     6        5 ""            NA "EXr"
I recommend `dplyr::if_else` in place of `ifelse` for several reasons, but it comes with the strict (and safe!) requirement that the `true=` and `false=` arguments be precisely the same type. You recognize at least most of this by your use of `NA_real_`; my use of `.[NA]` is another way of ensuring that we get the correct `NA`-variant based on the actual data, allowing this method to work if some of your `Param`s are `integer` and some are `numeric`, for example.
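A small base-R sketch of why `.[NA]` works: indexing a vector with a logical `NA` returns missing values of that vector's own type, with the same length as the vector. The vectors `int_col` and `dbl_col` are made up for illustration.

```r
# Hypothetical columns of two different numeric types
int_col <- c(4L, 5L, 2L)
dbl_col <- c(4.5, 1, 6)

# Indexing with a logical NA keeps the type of the source vector;
# the NA index is recycled, so the result has the same length as the input.
int_na <- int_col[NA]
dbl_na <- dbl_col[NA]

print(typeof(int_na))  # "integer"
print(typeof(dbl_na))  # "double"
print(length(int_na))  # 3
```

So the same expression yields the equivalent of `NA_integer_` for an integer column and `NA_real_` for a double column, without having to name the type explicitly.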
An alternative approach (which may help later) is to pivot the data and work with just two columns at a time.
library(tidyr) # pivot_longer
Test %>%
  pivot_longer(
    matches("_(Konz|Kenn)$"),
    names_pattern = "(.*)_(.*)", names_to = c("elem", ".value")
  ) %>%
  mutate(
    Konz = if_else(grepl("[XF]", Kenn), Konz[NA], Konz)
  )
# # A tibble: 12 x 4
#     Date elem   Konz Kenn
#    <dbl> <chr> <dbl> <chr>
#  1     1 HCl       4 ""
#  2     1 CO        4 ""
#  3     2 HCl       5 ""
#  4     2 CO        1 ""
#  5     3 HCl      NA "X"
#  6     3 CO       NA "BX"
#  7     4 HCl       5 ""
#  8     4 CO        4 ""
#  9     5 HCl      NA "F"
# 10     5 CO        4 ""
# 11     6 HCl       5 ""
# 12     6 CO       NA "EXr"
This pivoted format has the advantage of allowing simpler calls to `mutate`, and (if you plan on plotting this) playing much better with `ggplot2`'s preference for long data.
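If you would rather stay in the wide format and avoid extra packages altogether, the same replacement can be sketched as a plain base-R loop over `Param`. This is only an illustration, not the approach used above: it rebuilds the sample data as a `data.frame` so that it runs without `dplyr`, and the loop variables `p`, `konz`, `kenn` and `flagged` are names chosen here.

```r
# Sample data from the question, as a plain data.frame (no packages needed)
Test <- data.frame(
  Date     = 1:6,
  HCl_Konz = c(4, 5, 2, 5, 6, 5),
  HCl_Kenn = c("", "", "X", "", "F", ""),
  CO_Konz  = c(4, 1, 6, 4, 4, 9),
  CO_Kenn  = c("", "", "BX", "", "", "EXr"),
  stringsAsFactors = FALSE
)
Param <- c("HCl", "CO")

for (p in Param) {
  konz <- paste0(p, "_Konz")  # value column, e.g. "HCl_Konz"
  kenn <- paste0(p, "_Kenn")  # flag column,  e.g. "HCl_Kenn"

  # Rows whose flag column contains an uppercase "X" or "F"
  flagged <- grepl("[XF]", Test[[kenn]])

  # Assigning NA into an existing vector keeps that vector's type
  Test[[konz]][flagged] <- NA
}

print(Test)
```

Because the `NA` is assigned into the existing column, each `_Konz` column keeps its original type here as well.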