How do you rename columns if they exist in the data frame?

Time:06-11

I am reading in an Excel file that is periodically updated with new columns. I need each renaming step to run only if its column is present, and to move on otherwise. How do you rename conditionally, say when someone has not yet added this month's data, current_month_program_acceptance_rate, so that the code doesn't break?

library(lubridate)
library(tidyverse)
library(data.table)

dataset_ex <- data.frame(current_month_program_acceptance_rate = c(22.2,44.2,87),
                         last_month_program_acceptance_rate = c(27.8,65.8,34.5))


data_variable <- "June"
last_month <- "April"

# How do you run each rename only if its column is present, and continue on otherwise?

dataset_ex <-
  dataset_ex %>%
  rename("{data_variable}_program_acceptance_rate" := current_month_program_acceptance_rate)

dataset_ex <-
  dataset_ex %>%
  rename("{last_month}_program_acceptance_rate" := last_month_program_acceptance_rate)


CodePudding user response:

In R, accessing a column that is not present in a data frame with $ returns NULL.

> dataset_ex$this_column_is_not_here
NULL

So you can use is.null() to check whether the column is present before renaming it.

if (!is.null(dataset_ex$current_month_program_acceptance_rate)){
  dataset_ex <-
    dataset_ex %>%
    dplyr::rename("{data_variable}_program_acceptance_rate" := "current_month_program_acceptance_rate")
}
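One caveat with the is.null() check: on a plain data.frame, $ partially matches column names, so a unique prefix of an existing column is not NULL either. A minimal sketch of a stricter variant follows, testing exact names with %in% names(). The renames lookup vector and the one-column example data are assumptions made up for illustration, not part of the question.

```r
# Hypothetical example data: only the last-month column has been added so far.
dataset_ex <- data.frame(last_month_program_acceptance_rate = c(27.8, 65.8, 34.5))

data_variable <- "June"
last_month <- "April"

# Illustrative lookup: the names are the old column names, the values the new ones.
renames <- c(
  current_month_program_acceptance_rate = paste0(data_variable, "_program_acceptance_rate"),
  last_month_program_acceptance_rate    = paste0(last_month, "_program_acceptance_rate")
)

# `$` partially matches on a data.frame, so this prefix resolves to the
# existing column even though no column is called exactly "last_month".
prefix_is_null <- is.null(dataset_ex$last_month)

# Exact check: rename a column only if that exact name exists.
for (old in names(renames)) {
  if (old %in% names(dataset_ex)) {
    names(dataset_ex)[names(dataset_ex) == old] <- renames[[old]]
  }
}

names(dataset_ex)
```

Because the loop skips names that are absent, the missing current_month_program_acceptance_rate column does not raise an error, and the column is picked up automatically once it appears in the file.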

CodePudding user response:

You can use rename_with(), for example:

df1 <- data.frame(last_month_program_acceptance_rate = 0.8)

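Note that df1 has only one of the two columns. The lookup vector below uses the old column names as its names and the new column names as its values: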

data_variable <- "June"
last_month <- "April"
maintained_part <- "program_acceptance_rate"
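# New names, e.g. "June_program_acceptance_rate"
nms <- paste(c(data_variable, last_month), maintained_part, sep = "_")
# Old names become the vector's names, so str_replace_all() maps old -> new
names(nms) <- paste(c("current_month", "last_month"), maintained_part, sep = "_")

# Column names that match none of the patterns are left unchanged
df1 <- df1 %>%
  rename_with(~ str_replace_all(., nms))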
df1

  April_program_acceptance_rate
1                           0.8

Only the column that is present is renamed; the missing one causes no error.

Now df2 has both columns:

df2 <- data.frame(last_month_program_acceptance_rate = 0.8,
                  current_month_program_acceptance_rate = 0.9)

df2 <- df2 %>% rename_with(~str_replace_all(.,nms))
df2

  April_program_acceptance_rate June_program_acceptance_rate
1                           0.8                          0.9
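Note that str_replace_all() treats the old names as regular-expression patterns and replaces them wherever they occur inside a column name, so a longer name that merely contains one of them would be altered as well. If exact matching is preferred, a sketch of an alternative is to pass a named character vector to rename() through any_of(), which ignores names that are absent. This assumes a reasonably recent dplyr; the lookup vector is a name introduced here for illustration, and its direction is the reverse of nms (new name as the name, old name as the value).

```r
library(dplyr)

data_variable <- "June"
last_month <- "April"
maintained_part <- "program_acceptance_rate"

# For rename(), the lookup reads new_name = "old_name".
lookup <- paste(c("current_month", "last_month"), maintained_part, sep = "_")
names(lookup) <- paste(c(data_variable, last_month), maintained_part, sep = "_")

# Only one of the two columns is present.
df1 <- data.frame(last_month_program_acceptance_rate = 0.8)
df1 <- df1 %>% rename(any_of(lookup))

# Both columns are present.
df2 <- data.frame(last_month_program_acceptance_rate = 0.8,
                  current_month_program_acceptance_rate = 0.9)
df2 <- df2 %>% rename(any_of(lookup))

names(df1)
names(df2)
```

With all_of() in place of any_of(), the same call would instead raise an error for a missing column, which can be useful when every column is expected to be present.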