A few Excel reports are emailed to us every morning. I download them directly to a specified drive and then wrangle them in R.
The problem is that I have to manually open and save each file before running my R script. If I don't, this happens:
When I manually open and save the files and then re-run my script, I get the correct results:
First, does anyone know why this happens? Second, is there a function that will open these files and save them for me? I tried openxlsx, but I still have to press the save button manually.
Here is the function I created to bring in the files:
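library(readxl)   # read_excel()
library(stringr)  # str_sub()
library(dplyr)    # bind_rows(), %>%

store.FUN = function(x)
{
  # Read the workbook passed in as x (the original read an undefined `file`)
  m = as.data.frame(read_excel(x))
  names(m) = seq_along(m)
  # Put the text of cell [2, 1], from character 13 onward, into cell [1, 1]
  m[1, 1] = str_sub(m[2, 1], 13)
  # Keep the first row plus the "Total Income" row(s)
  m = bind_rows(m[1, ], subset(m, m[[1]] == "Total Income"))
  m[2, 1] = m[2, 2]
  m = m[-2]
  return(m)
}
# store.file holds the paths of the downloaded reports
district_1.stores = sapply(store.file, store.FUN, simplify = FALSE) %>%
  bind_rows(.id = "Store ID")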
Thanks!
Edit: It looks like the cells contain formulas:
But if I do nothing except save the file and then go back to R to run the script, the numbers pull in just fine.
Here is an example of the Excel file: (image not included)
CodePudding user response:
I took the time to post the issue on GitHub for openxlsx.
TL;DR: it's not a bug but a built-in limitation of importing from and exporting to Excel, and according to the developer it applies to all such packages. As far as I understand it, these packages do not evaluate formulas; they only read the result that was stored in the file when it was last saved, so a formula that Excel never calculated has nothing to import. For sheets that contain formulas, the developer suggests exactly what the OP already does: open the file in Excel, save it, and only then import it into R. That doesn't answer the OP's actual question (how to open and save an Excel file automatically from R), but I'm posting this answer anyway because it adds some helpful context.
https://github.com/ycphs/openxlsx/issues/261 and https://github.com/ycphs/openxlsx/issues/188#issuecomment-832591241
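To check whether a given report is affected without opening Excel, one option is to look inside the .xlsx file itself. An .xlsx file is a zip archive, and in each worksheet's XML a cell `<c>` stores its formula in `<f>` and the last calculated result in `<v>`. The sketch below counts formula cells that have no `<v>`. The function names, the regular-expression approach, and the default sheet path `xl/worksheets/sheet1.xml` are my own assumptions for illustration, not part of readxl or openxlsx; a real XML parser would be more robust.

```r
# Count formula cells that have no cached value in one worksheet's XML text.
# sheet_xml must be a single character string.
count_uncached_formulas <- function(sheet_xml) {
  # Match every non-self-closing <c ...>...</c> cell element
  cell_pattern <- "(?s)<c(?:\\s[^>]*[^/>])?>.*?</c>"
  cells <- regmatches(sheet_xml,
                      gregexpr(cell_pattern, sheet_xml, perl = TRUE))[[1]]
  has_formula <- grepl("<f[ >/]", cells)          # <f>, <f t="shared" ...>, <f .../>
  has_value   <- grepl("<v>", cells, fixed = TRUE) # cached result present
  sum(has_formula & !has_value)
}

# Read one worksheet out of the .xlsx archive and count the affected cells.
# Assumption: the sheet of interest is stored as xl/worksheets/sheet1.xml.
count_uncached_in_file <- function(path, sheet = "xl/worksheets/sheet1.xml") {
  con <- unz(path, sheet)
  on.exit(close(con))
  xml <- paste(readLines(con, warn = FALSE), collapse = "\n")
  count_uncached_formulas(xml)
}

# Hypothetical usage:
# count_uncached_in_file("report.xlsx")
```

If the count is above zero before you open the file in Excel and zero after you save it, missing cached values are probably what R is tripping over.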
CodePudding user response:
Actually, I just found that I can use the "reticulate" package in R to run a Python module for this purpose.
Thanks for your help everyone!
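The answer above doesn't say which Python module was used, so the following is only a guess at what such a setup might look like. It assumes Windows with Excel installed and the Python package pywin32 available in the environment reticulate uses, and it drives Excel through COM to open and save each workbook. `resave_with_excel` and `resave_workbook` are names made up for this sketch.

```r
# Sketch only: assumes Windows, a local Excel installation, and pywin32
# installed in the Python environment that reticulate is configured to use.
resave_with_excel <- function(paths) {
  # Fail early, before starting Python or Excel, if a file is missing
  missing_files <- paths[!file.exists(paths)]
  if (length(missing_files) > 0) {
    stop("File(s) not found: ", paste(missing_files, collapse = ", "))
  }

  # Define a small Python helper that opens, saves, and closes one workbook
  reticulate::py_run_string("
import win32com.client

def resave_workbook(path):
    excel = win32com.client.Dispatch('Excel.Application')
    try:
        workbook = excel.Workbooks.Open(path)
        workbook.Save()
        workbook.Close()
    finally:
        excel.Quit()
")

  for (path in paths) {
    # Excel expects an absolute path
    reticulate::py$resave_workbook(normalizePath(path))
  }
  invisible(paths)
}

# Hypothetical usage, before reading the reports:
# resave_with_excel(store.file)
```

Starting and quitting Excel once per file keeps the sketch simple; for many files it would be faster to keep one Excel instance open for the whole loop.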