I am working with R.
I have many tibbles that look similar to this (similar, because each one also contains 5 columns at the beginning and 11 columns at the end that need to be eliminated):
chair table hill block chain ball money house
    2     4    5     6     7   -2     4     5
    1     3    6     1     8    3     9     1
   -2     1    1    -2     1    8    -2     3
    6     4   -2     4    -2    5     8     4
    5     5    5     5     3    2     6     7
First I need to eliminate the columns that I don't need at the beginning and at the end of the tibble. So, I used this code.
dummy1 <- at01 %>%
  select(-1, -2, -3, -4, -5, -name, -time, -id, -class,
         -slot, -bracket, -app, -aal, -PHT, -END,
         -START)
Then, I need to rename the columns by eliminating the first 4 characters of the column names with this code.
names(dummy1) <- substring(names(dummy1), 5)
Finally, I need to replace every -2 with NA.
dummy1[dummy1 == -2] <- NA
I have 120 tibbles in my global environment. How can I apply all of this code to every one of them, so that I get clean results quickly without repeating these three steps for each tibble?
Thanks
CodePudding user response:
When applying the same operations to many data frames, it is easiest to keep them in a list, so that you can use functions from the apply family (such as lapply) or from purrr. The code below collects every data frame in your global environment into a list, so I am assuming that the 120 tibbles are the only data frames there. The cleaning steps go into a single function, which purrr::map then applies to each data frame.
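library(tidyverse)

# Collect every data frame (tibbles included) in the global environment into a named list.
dfs <- Filter(is.data.frame, mget(ls(envir = .GlobalEnv), envir = .GlobalEnv))

# Apply the three cleaning steps to a single data frame.
cleanup <- function(x) {
  x <- x %>%
    # Drop the first five columns and the unwanted trailing columns.
    select(-c(1:5, name, time, id, class,
              slot, bracket, app, aal, PHT, END,
              START)) %>%
    # Replace every -2 with NA.
    mutate(across(everything(), ~ replace(.x, .x == -2, NA)))
  # Strip the first four characters from each column name.
  names(x) <- substring(names(x), 5)
  x
}

# Clean every data frame in the list.
dfs_cleanedup <- purrr::map(dfs, cleanup)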
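Before running this on all 120 tibbles, it may be worth checking the cleaning function on a small made-up example. The sketch below is only an illustration: the helper make_toy, the junk1 to junk5 column names, and the abc_ prefix are hypothetical stand-ins for the real data. The trailing column names are taken from the question.

```r
library(dplyr)
library(purrr)
library(tibble)

# Trailing columns to drop, named as in the question.
tail_cols <- c("name", "time", "id", "class", "slot", "bracket",
               "app", "aal", "PHT", "END", "START")

# Hypothetical helper: 5 leading junk columns, 3 prefixed data columns,
# and the 11 trailing columns, each with two rows.
make_toy <- function() {
  lead  <- setNames(rep(list(c(0, 0)), 5), paste0("junk", 1:5))
  vals  <- list(abc_chair = c(2, -2), abc_table = c(4, 1), abc_hill = c(-2, 6))
  trail <- setNames(rep(list(c(0, 0)), length(tail_cols)), tail_cols)
  as_tibble(c(lead, vals, trail))
}

# Same three steps as in the answer; all_of() selects columns from a character vector.
cleanup <- function(x) {
  x <- x %>%
    select(-c(1:5, all_of(tail_cols))) %>%
    mutate(across(everything(), ~ replace(.x, .x == -2, NA)))
  names(x) <- substring(names(x), 5)
  x
}

# A named list stands in for the data frames collected from the global environment.
dfs <- list(at01 = make_toy(), at02 = make_toy())
cleaned <- map(dfs, cleanup)

print(cleaned$at01)
```

If the printed tibble has only the chair, table and hill columns, with NA where -2 used to be, the function should behave the same way on the real tibbles, provided they share the same column layout.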
If you want to overwrite the 120 data frames in your global environment with the cleaned versions, you can use list2env (caution: this replaces the originals in the global environment).
list2env(dfs_cleanedup, envir = .GlobalEnv)
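Because list2env overwrites objects by name, it can be safer to narrow down which objects are collected in the first place. The sketch below assumes, hypothetically, that the tibbles to clean are all named "at" followed by digits (like at01 in the question); the pattern, the scratch environment env, and the placeholder transformation are illustrative only. A scratch environment is used here so the example does not touch your real workspace; with real data you would use .GlobalEnv instead.

```r
# A scratch environment stands in for the global environment in this sketch.
env <- new.env()
assign("at01", data.frame(a = 1), envir = env)
assign("at02", data.frame(a = 2), envir = env)
assign("lookup", data.frame(b = 3), envir = env)  # a data frame that should not be touched
assign("n_files", 120, envir = env)               # not a data frame at all

# Collect only objects whose names match the assumed pattern, then keep data frames.
candidates <- mget(ls(envir = env, pattern = "^at[0-9]+$"), envir = env)
dfs <- Filter(is.data.frame, candidates)

# Placeholder transformation standing in for the real cleanup function.
dfs_cleaned <- lapply(dfs, function(d) {
  d$a <- d$a * 10
  d
})

# Write the results back; only at01 and at02 are overwritten.
list2env(dfs_cleaned, envir = env)

print(names(dfs))
```

Keeping the cleaned list under a separate name and writing it back only after inspecting a few elements leaves the originals available in case something went wrong.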