In this SO post the accepted answer shows how to remove a prefix from a subset of column names. I will reproduce the toy data and solution and get to my issue. Note that I have altered the toy data by adding a suffix (_end) to two of the variables.
library(dplyr)    # rename_at(), rename_with(), ends_with(), %>%
library(stringr)  # str_remove()
df <- data.frame(ATH_V1 = rnorm(10), ATH_V2_end = rnorm(10), ATH_V3_end = rnorm(10), ATH_V4 = rnorm(10), ATH_V5 = rnorm(10), ATH_V6 = rnorm(10), ATH_V7 = rnorm(10))
df
# ATH_V1 ATH_V2_end ATH_V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 -1.5520380 1.16782520 -0.3628090 1.5238728 -1.1660806 -1.01416226 -0.95163564
# 2 0.6270134 1.63810443 0.2199733 -0.6175186 -1.8909463 -0.23913125 -0.70650296
# 3 -0.7462879 0.08504734 0.6506818 -0.5436457 1.3369322 1.69883194 -1.07623124
# 4 0.3196569 0.95782069 -0.3454795 -1.7485607 2.3896003 1.24958489 -0.73316675
# 5 -0.8820414 -2.01739089 -0.5881156 1.2725712 1.4251221 0.56213069 -0.47188011
# 6 -0.5534390 1.48974625 -0.2532402 -1.2333677 1.6690452 -0.48178503 0.30727117
# 7 -0.4637729 -1.13762829 1.3072153 1.0082090 -1.7958189 -1.37604307 -0.08900913
# 8 -0.3878013 -1.09693619 -0.9022672 0.1809460 -1.0303186 0.54576930 -0.64634653
# 9 -0.9553941 0.91495814 -0.2993733 -0.5860527 -0.5623538 -0.24521585 0.21297231
# 10 2.2891475 0.05568124 -0.1718192 0.4249103 2.6009601 0.06357305 0.47794076
I would like to remove the ATH_ prefix ONLY from the columns that end with _end.
Now the solution in the original post proposed the following code, where we specify the column names we want to operate on in a vector within rename_at and then remove the ATH_ prefix via the str_remove function, like so:
df %>% rename_at(c("ATH_V2_end", "ATH_V3_end"), ~ .x %>% str_remove("^ATH_"))
# ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 1.14822123 -0.6285561 0.52458507 -0.63906454 1.1401342 -1.6559726 0.41732258
# 2 0.07519307 2.0090135 0.13440368 1.24337727 -0.2906335 -0.1349698 1.45647898
# 3 -0.87465492 -1.8766134 -0.17119197 -1.22701678 -0.7603659 0.1015543 -1.06211069
# 4 1.01402581 -0.4744169 0.78326842 -0.02910686 0.1548202 1.0042147 -0.23739832
# 5 1.00613252 -1.5701097 1.64415870 0.86733910 0.1558727 0.3011537 0.05700506
# 6 -1.01416351 -1.7687648 -0.13999833 -1.01482747 -0.5732621 -0.2504362 2.20762232
# 7 1.00861721 0.7494679 0.08853307 1.46402775 -0.1153655 0.8427913 -1.16114455
# 8 0.28117809 -0.6669487 -0.50816389 -0.12875270 0.7798111 -0.3937148 -1.30894602
# 9 -0.23092640 2.8516271 -1.36959691 -0.39303227 1.9862182 1.2378769 -1.66039502
# 10 0.65034202 0.9009923 0.58264859 0.50931251 1.7284268 1.8420746 -0.71894637
However, the help for the newer versions of dplyr states that rename_at has been superseded by rename_with, and that you can use the powerful functionality of the select helpers to choose a subset of columns.
So I would like to remove the ATH_ prefix ONLY from the columns that end with _end, using the ends_with() function within rename_with(), in tidyverse grammar.
I tried
df %>%
select(ends_with("_end")) %>%
rename_with(str_remove(string = ~.x,
pattern = "^ATH_"))
and
df %>%
rename_with(cols = ends_with("_end"),
.fn = str_remove(string = ~.x,
pattern = "^ATH_"))
And got the same error
Error in `rename_with()`:
! Can't convert `.fn`, a character vector, to a function.
Any help much appreciated
CodePudding user response:
If you use select to filter the columns, the columns you did not select will no longer be part of the data frame. You're on the right track, though.
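To make the select point concrete, here is a minimal sketch, assuming dplyr is installed. The data frame demo and its values are made up for illustration and are not the question's df.

```r
library(dplyr)

# Hypothetical toy data with the same naming pattern as the question
demo <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9)

# select() returns only the matching columns, so ATH_V1 is dropped
only_end <- select(demo, ends_with("_end"))
names(only_end)
# "ATH_V2_end" "ATH_V3_end"

# The original object still has all three columns
names(demo)
# "ATH_V1" "ATH_V2_end" "ATH_V3_end"
```

Any renaming done after that select step can therefore only ever return the two _end columns.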
If you don't use the tilde with .x to represent the column names, you have to write out function() explicitly.
For example, you can use the tilde, like this:
rename_with(df, .cols = ends_with("_end"),
~ gsub("^ATH_", "", .x))
Or you can designate a variable name of your choice, instead of .x, and use function(), like this:
rename_with(df, .cols = ends_with("_end"),
.fn = function(frenchFries) {
gsub("^ATH_", "", frenchFries)
})
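As a quick sanity check that the two spellings are interchangeable, here is a small sketch, assuming dplyr is installed. The data frame demo and the argument name nm are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data with the same naming pattern as the question
demo <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9)

# Formula shorthand: .x stands for the selected column names
via_lambda <- rename_with(demo, ~ gsub("^ATH_", "", .x),
                          .cols = ends_with("_end"))

# Explicit anonymous function with an argument name of our own choosing
via_function <- rename_with(demo, function(nm) gsub("^ATH_", "", nm),
                            .cols = ends_with("_end"))

# Both forms should produce the same column names
identical(names(via_lambda), names(via_function))
```

Which form to use is mostly a matter of taste; the formula is shorter, while function() lets you give the argument a descriptive name.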
You can use names() to test your work before you change the object. Piping the result into {names(.)}, or simply into names(), prints only the column names.
rename_with(df, .cols = ends_with("_end"),
.fn = function(frenchFries) {
gsub("^ATH_", "", frenchFries)
}) %>% {names(.)}
# [1] "ATH_V1" "V2_end" "V3_end" "ATH_V4" "ATH_V5" "ATH_V6" "ATH_V7"
In R, very few functions modify an object in place; rename_with() returns a renamed copy, so you have to assign the result to an object to actually change it.
df <- rename_with(df, .cols = ends_with("_end"),
~ gsub("^ATH_", "", .x))
CodePudding user response:
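You put the ~ symbol in the wrong place: it goes in front of the whole str_remove() call, not in front of .x. Also note that the argument is named .cols, not cols; a misspelled cols is passed on to .fn through ... instead of selecting columns, so every column would be renamed. It should be
df %>%
  rename_with(.cols = ends_with("_end"),
              .fn = ~ str_remove(string = .x, pattern = "^ATH_"))
# ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7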
# 1 -0.7211939 -0.8369699 0.8317321 -0.05233632 0.05711023 -1.1028795 -0.44261881
# 2 -1.2497923 -0.9062427 1.6472891 -0.77403163 -0.37941031 -0.8270005 1.14721669
# 3 -0.1343481 -1.2049003 0.5347915 0.16202132 -0.38939422 -1.6720070 -1.55429956
# 4 0.1664160 1.9248057 -0.1133589 -0.48717961 0.89363994 1.0983927 0.82700398
# 5 -1.0916865 -0.8093323 -1.3128583 -0.68529918 -0.22614257 0.3307024 -2.45071083
# 6 0.4191887 1.6177852 1.7017075 1.40316160 -1.30115133 -0.6129785 1.28648456
# 7 0.8725919 -0.2706190 1.3131828 -2.99366849 1.28976332 -0.2348865 1.09045642
# 8 -0.5935664 -0.2918142 0.7699294 -1.30566644 -1.53736071 -0.2689142 0.10605338
# 9 1.4284704 -0.3578967 -0.8106887 1.04486145 -0.32881870 0.2486389 0.08226489
# 10 1.2323733 -0.2241655 0.2167915 -0.31868072 -0.74497243 -1.7778882 -0.70894820
A more concise expression is
df %>%
rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))
and even
df %>%
rename_with(str_remove, ends_with("_end"), "^ATH_")
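The last form works because rename_with() forwards any extra arguments to .fn, so "^ATH_" ends up as the pattern argument of str_remove(). A minimal sketch comparing the two concise forms, assuming dplyr and stringr are installed (the data frame demo is made up for illustration):

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with the same naming pattern as the question
demo <- data.frame(ATH_V1 = 1:3, ATH_V2_end = 4:6, ATH_V3_end = 7:9)

# Lambda form: the pattern is written inside the formula
with_lambda <- demo %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))

# Bare-function form: "^ATH_" is forwarded through ... to str_remove(),
# where it is matched as the second argument (pattern)
with_dots <- demo %>%
  rename_with(str_remove, ends_with("_end"), "^ATH_")

# Both forms should produce the same column names
identical(names(with_lambda), names(with_dots))
```

The positional order here is .fn first, then .cols, then the extra arguments; naming .cols explicitly is a bit longer but protects against mixing them up.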