I need a little help modifying a script to make it simpler and less "hard-coded". The code below creates a data frame with 10 columns, named "Find1" through "Find10", and two rows. In each column, the first row is the column name to be replaced, and the second row is the name to replace it with.
names_subs_list <- data.frame(
  "Find1" = c("Fecha", "smpl_date"),
  "Find2" = c("1reCODIGODEARBOL", "first_tree_code"),
  "Find3" = c("2doCODIGODEARBOL", "second_tree_code"),
  "Find4" = c("ALTURADELACAMARAENELFUSTE", "chamber_height_and_rep"),
  "Find5" = c("Nombredearchivo", "LICOR_CO2_data_file_name"),
  "Find6" = c("Especiedearbol", "tree_spp"),
  "Find7" = c("Horadecerrarlaoruga", "raw_start_time"),
  "Find8" = c("Horadeabrirlaoruga", "raw_end_time"),
  "Find9" = c("Nombredecamarausada", "chamber_number"),
  "Find10" = c("PENDIENTE/FRECUENCIA(ppb)", "spot_flux"),
  stringsAsFactors = FALSE)
I'd like to move this section of code into a new function or package that could be opened and modified as needed, without having to work in the larger QC script. Ideally, non-R-savvy individuals could enter their own pairs of original and replacement names, which would be added to this data frame for future use. My idea for this was a function that would save the inputs into this list.
add_to_list <- function(original_header, replacement_header){
  substitutes <- data.frame("Find1" = c(original_header, replacement_header),
                            stringsAsFactors = FALSE)
}
Something like the above, with a for loop that appends each new pair to the end of the data frame. This way new users could call add_to_list(their_original_headings, their_replacements). If anyone knows how this can be done, or has a better suggestion, please let me know.
Answer:
Iteratively adding columns is fine, not anywhere near as bad as iteratively adding rows (see "Growing Objects" in The R Inferno).
I suggest making your function accept the input data (not trying to find and overwrite it in the calling environment/frame) and returning it.
add_to_list <- function(data = NULL, orig, repl) {
  if (is.null(data)) {
    # no frame yet: start one with Find1
    data <- data.frame(Find1 = c(orig, repl))
  } else {
    # highest number already used in the column names ("Find10" -> 10)
    maxnum <- suppressWarnings(max(as.integer(gsub("\\D", "", colnames(data)))))
    if (is.na(maxnum) || !length(maxnum)) maxnum <- 0L
    newdata <- data.frame(a = c(orig, repl))
    names(newdata)[1] <- paste0("Find", maxnum + 1L)
    data <- cbind(data, newdata)
  }
  data
}
## initial use?
add_to_list(orig="PENDIENTE/FRECUENCIA(ppb)", repl="spot_flux")
#                       Find1
# 1 PENDIENTE/FRECUENCIA(ppb)
# 2                 spot_flux
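Since the question mentions a for loop, here is a rough sketch of adding several pairs in one call. add_many_to_list is a hypothetical wrapper name (it is not part of the original code), and add_to_list is repeated only so the block runs on its own.

```r
# Copy of add_to_list, repeated so this block is self-contained.
add_to_list <- function(data = NULL, orig, repl) {
  if (is.null(data)) {
    data <- data.frame(Find1 = c(orig, repl))
  } else {
    maxnum <- suppressWarnings(max(as.integer(gsub("\\D", "", colnames(data)))))
    if (is.na(maxnum) || !length(maxnum)) maxnum <- 0L
    newdata <- data.frame(a = c(orig, repl))
    names(newdata)[1] <- paste0("Find", maxnum + 1L)
    data <- cbind(data, newdata)
  }
  data
}

# Hypothetical wrapper: add one column per (orig, repl) pair, in order.
add_many_to_list <- function(data = NULL, origs, repls) {
  stopifnot(length(origs) == length(repls))
  for (i in seq_along(origs)) {
    data <- add_to_list(data, origs[i], repls[i])
  }
  data
}

subs <- add_many_to_list(
  origs = c("Fecha", "Especiedearbol", "Horadeabrirlaoruga"),
  repls = c("smpl_date", "tree_spp", "raw_end_time")
)
names(subs)
# "Find1" "Find2" "Find3"
```

Passing an existing frame as the first argument appends to it instead of starting from scratch.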
Now starting from a pre-built frame (note that I commented out "Find10"):
names_subs_list <- data.frame(
  "Find1" = c("Fecha", "smpl_date"),
  "Find2" = c("1reCODIGODEARBOL", "first_tree_code"),
  "Find3" = c("2doCODIGODEARBOL", "second_tree_code"),
  "Find4" = c("ALTURADELACAMARAENELFUSTE", "chamber_height_and_rep"),
  "Find5" = c("Nombredearchivo", "LICOR_CO2_data_file_name"),
  "Find6" = c("Especiedearbol", "tree_spp"),
  "Find7" = c("Horadecerrarlaoruga", "raw_start_time"),
  "Find8" = c("Horadeabrirlaoruga", "raw_end_time"),
  "Find9" = c("Nombredecamarausada", "chamber_number"),
  # "Find10" = c("PENDIENTE/FRECUENCIA(ppb)", "spot_flux")
  stringsAsFactors = FALSE)
names_subs_list <- add_to_list(names_subs_list, "PENDIENTE/FRECUENCIA(ppb)", "spot_flux")
names_subs_list
# Find1 Find2 Find3 Find4 Find5 Find6 Find7 Find8 Find9 Find10
# 1 Fecha 1reCODIGODEARBOL 2doCODIGODEARBOL ALTURADELACAMARAENELFUSTE Nombredearchivo Especiedearbol Horadecerrarlaoruga Horadeabrirlaoruga Nombredecamarausada PENDIENTE/FRECUENCIA(ppb)
# 2 smpl_date first_tree_code second_tree_code chamber_height_and_rep LICOR_CO2_data_file_name tree_spp raw_start_time raw_end_time chamber_number spot_flux
names_subs_list <- add_to_list(names_subs_list, "something else", "again")
names_subs_list
# Find1 Find2 Find3 Find4 Find5 Find6 Find7 Find8 Find9 Find10 Find11
# 1 Fecha 1reCODIGODEARBOL 2doCODIGODEARBOL ALTURADELACAMARAENELFUSTE Nombredearchivo Especiedearbol Horadecerrarlaoruga Horadeabrirlaoruga Nombredecamarausada PENDIENTE/FRECUENCIA(ppb) something else
# 2 smpl_date first_tree_code second_tree_code chamber_height_and_rep LICOR_CO2_data_file_name tree_spp raw_start_time raw_end_time chamber_number spot_flux again
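For the "future use" part of the question, one possible approach (my assumption, not something the code above does) is to keep the frame in a CSV file that can be edited outside of R, read it back, and use it to rename columns. The file location and the apply_name_subs helper below are hypothetical, and the two-column frame is only a stand-in for names_subs_list.

```r
# Small stand-in for names_subs_list: row 1 = name to find, row 2 = replacement.
subs <- data.frame(
  Find1 = c("Fecha", "smpl_date"),
  Find2 = c("Especiedearbol", "tree_spp"),
  stringsAsFactors = FALSE
)

# Hypothetical location; a real script would point at a shared file instead.
subs_file <- file.path(tempdir(), "names_subs_list.csv")
write.csv(subs, subs_file, row.names = FALSE)

# Read everything back as text so no name is converted to a number or factor.
reloaded <- read.csv(subs_file, colClasses = "character")

# Hypothetical helper: rename any column of df whose name appears in row 1.
apply_name_subs <- function(df, subs) {
  find <- unlist(subs[1, ], use.names = FALSE)
  repl <- unlist(subs[2, ], use.names = FALSE)
  hit <- match(names(df), find)          # NA where no substitution exists
  names(df)[!is.na(hit)] <- repl[hit[!is.na(hit)]]
  df
}

raw <- data.frame(Fecha = "2020-01-01", Especiedearbol = "x", other = 1)
renamed <- apply_name_subs(raw, reloaded)
names(renamed)
# "smpl_date" "tree_spp" "other"
```

Columns without a match (here "other") keep their original names.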