My dataset looks like this:
library(dplyr)
N <- 3  # number of rows; not defined in the original post, assumed here
a <- rnorm(N)
b <- rnorm(N)
c_04 <- rnorm(N)
d_04 <- rnorm(N)
e_04 <- rnorm(N)
df <- data.frame(a, b, c_04, d_04, e_04)
Is there a way I can use the rename_with function to rename the variables that end with _04 so that the _04 is dropped? In other words, the variables in df should only be a, b, c, d, e.
Thank you.
CodePudding user response:
You can use str_remove():
library(stringr)
library(dplyr)
df %>%
rename_with( ~ str_remove(., "_04"))
Or, more generally, use str_remove() (or a similar function) with whatever pattern the problem calls for:
df %>%
  rename_with(~ str_remove(., "_\\d+"))
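If only a trailing suffix should be dropped, the pattern can also be anchored to the end of the name with `$`. This is a minimal sketch, assuming a small made-up data frame; the column x_04_y is hypothetical and is only there to show that a _04 in the middle of a name is left alone.

```r
library(dplyr)
library(stringr)

# Hypothetical example data; x_04_y is not part of the question
df <- data.frame(a = 1, b = 2, c_04 = 3, d_04 = 4, x_04_y = 5)

# "_\\d+$" matches an underscore followed by digits only at the end of a name
renamed <- df %>%
  rename_with(~ str_remove(.x, "_\\d+$"))

names(renamed)
# "a" "b" "c" "d" "x_04_y"
```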
CodePudding user response:
To add to the previous answers, I like to use the .cols argument together with dplyr::ends_with() to make the code less error-prone. This is useful when the names are more complex. For example, a column name might contain _04 somewhere other than at the end of the string; the previous answer would remove it there as well.
library(tidyverse)
N=1
a <- rnorm(N)
b <- rnorm(N)
c_04 <- rnorm(N)
d_04 <- rnorm(N)
e_04 <- rnorm(N)
weird_04_name_05 <- rnorm(N)
df <- data.frame(a, b, c_04, d_04, e_04,weird_04_name_05)
df %>% rename_with(.fn = ~ str_replace(.x, "_04", ""),
.cols = ends_with("_04"))
CodePudding user response:
Or another option, using stringr and set_names():
library(tidyverse)
df %>%
set_names(str_remove, "_.*")
Output
           a          b          c          d          e
1 -0.6706685 2.05351983 -0.7972316 -0.1520679 -0.7714376
2 -1.7739331 1.45570354 -0.6012567  0.2613097 -0.7914683
3 -0.7719231 0.04259273  0.3809469  1.2360435  0.8250286
Or in base R:
setNames(df, gsub("_.*", "", names(df)))
Or with data.table:
library(data.table)
library(stringr)
setnames(setDT(df), str_remove(names(df), "_.*"))
Data
df <- structure(list(a = c(0.894805325864747, -1.94185093341678, -1.00994988512899
), b = c(0.77908390827311, -0.0204816421929252, -0.346331859636578
), c_04 = c(-0.18087870239403, -0.275192762246937, -0.494661273775676
), d_04 = c(-0.206752223705721, -0.560550718406792, 1.45531474529632
), e_04 = c(-0.929914176494227, 1.76975758055254, -0.387603128597527
)), class = "data.frame", row.names = c(NA, -3L))
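One caveat about the "_.*" pattern used in this answer: it drops everything from the first underscore onward, so a name such as weird_04_name_05 (borrowed from an earlier answer) would be cut down to weird. The sketch below, on a small made-up data frame, compares it with a pattern anchored to the end of the name.

```r
# Small made-up data frame; weird_04_name_05 is borrowed from an earlier answer
df <- data.frame(a = 1, c_04 = 2, weird_04_name_05 = 3)

# "_.*" cuts each name at its first underscore
names_greedy <- sub("_.*", "", names(df))

# "_04$" removes the suffix only when it ends the name
names_anchored <- sub("_04$", "", names(df))

names_greedy    # "a" "c" "weird"
names_anchored  # "a" "c" "weird_04_name_05"

df <- setNames(df, names_anchored)
```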
CodePudding user response:
Sounds like you want to rename columns in a data.frame based on how some variable names begin (as opposed to your question title, "[...] rename variables based on how the variable ends?").
If you want to do what you asked for (rename the variables that you then use to build a data.frame), you can do it like this:
c_04 <- rnorm(N)
d_04 <- rnorm(N)
e_04 <- rnorm(N)
varnames <- c("c_04", "d_04", "e_04")
for (var in varnames) {
  name <- sub("_.*", "", var)  # drop everything from the first underscore on
  assign(name, get(var))       # copy the value to the new name
  rm(list = var)               # remove the old variable
}
Just in case you want to do what you say you want to do. If not, there are plenty of other answers here.
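If the renamed variables are only needed to build the data frame, one possible alternative is to collect them into a named list and rename them there, which avoids assign() and rm() altogether. This is a minimal sketch, assuming N = 3 (the question does not define N) and the same three variables.

```r
N <- 3  # assumed; the question does not define N
c_04 <- rnorm(N)
d_04 <- rnorm(N)
e_04 <- rnorm(N)

varnames <- c("c_04", "d_04", "e_04")

# mget() looks the variables up by name and returns them as a named list
vars <- mget(varnames)

# Strip the trailing _04 from the list names
names(vars) <- sub("_04$", "", names(vars))

df <- as.data.frame(vars)
names(df)
# "c" "d" "e"
```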