I have the following R code that combines multiple (177) CSV files. However, in many of the files some column names use spaces as separators while others use underscores, e.g. 'Article Number' and 'Article_Number'. I have tried janitor::make_clean_names, make.names, etc. within the code, but I cannot figure out the correct way to do it.
Any help is much appreciated.
df <- list_of_files %>%
  set_names() %>%
  map_dfr(
    ~ read_csv(.x, col_types = cols(.default = "c", 'TY Stock Value' = "c"), col_names = TRUE),
    .id = "file_name"
  )
CodePudding user response:
You can add it inside the map_dfr function, so that each file's column names are harmonized first, before the data frames are bound together.
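df <- list_of_files %>%
  set_names() %>%
  map_dfr(
    # Read one file, then clean its column names before binding
    ~ .x %>%
      read_csv(.,
        col_types = cols(.default = "c", "TY Stock Value" = "c"),
        col_names = TRUE
      ) %>%
      janitor::clean_names(),
    # Store the name of each list element (the file path) in this column
    .id = "file_name"
  )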
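To see why the names have to be cleaned per file, before binding, here is a small sketch that assumes only that janitor is installed. It cleans the two spellings from the question separately, the way each file's header is cleaned inside map_dfr. Cleaning them together in one vector would not show the same thing, because make_clean_names also makes the names in a single vector unique.

```r
library(janitor)

# Each spelling is cleaned on its own, mimicking one header per file.
spaced     <- make_clean_names("Article Number")
underscore <- make_clean_names("Article_Number")

# If both variants collapse to the same snake_case name, the row-binding
# step inside map_dfr() stacks them into a single column.
identical(spaced, underscore)
```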
EDIT: Step-by-step
There are several ways to tell map which function to use. The ~ operator creates a formula, which purrr treats as a shortcut for an anonymous function. The argument of that function is .x, which in your case is one CSV file name. This file name is sent via the pipe to the read_csv function, where I used the placeholder . to tell the function where to put it. read_csv then reads the data into R and sends it to the clean_names function to harmonize the column names. Finally, the .id argument of map_dfr adds a file_name column recording which file each row came from. That's all the purrr magic :)
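The same steps can also be written with an explicit named function instead of the ~ shortcut, which may be easier to read and debug. The sketch below is only an illustration: the two temporary files, their contents and the helper name read_and_clean are made up for the example and are not part of the original question.

```r
library(readr)
library(purrr)
library(dplyr)   # provides %>% and the row-binding used by map_dfr()
library(janitor)

# Two hypothetical files whose headers differ only in the separator.
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
write_lines(c("Article Number,TY Stock Value", "A1,10"), file.path(demo_dir, "a.csv"))
write_lines(c("Article_Number,TY_Stock_Value", "B2,20"), file.path(demo_dir, "b.csv"))

list_of_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: the same steps as the formula version, spelled out.
read_and_clean <- function(path) {
  data <- read_csv(path, col_types = cols(.default = "c"))
  clean_names(data)  # harmonize this file's column names
}

df <- list_of_files %>%
  set_names() %>%
  map_dfr(read_and_clean, .id = "file_name")

names(df)
```

Passing read_and_clean by name is equivalent to the formula version; map_dfr calls it once per file path. Note that from purrr 1.0.0 onwards map_dfr is superseded, and map(read_and_clean) followed by list_rbind(names_to = "file_name") is the suggested replacement, although map_dfr still works.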