I'm asking around for some help on an issue that I've been having recently involving downloading data from the internet. I frequently use the baseballr package to download csv data from Baseball Savant. It has worked exceptionally well for me before, but recently it has been acting weird. For instance, this would be a line of code that I would write from the package.
set = statcast_search(start_date = "2022-09-13", end_date = "2022-09-20", player_type = "pitcher")
This asks the function to return data from MLB pitchers from September 13 to September 20. Admittedly this is a large sample of data (about 30,000 objects), but it's worked perfectly fine for me before. Now, I'm getting this error below.
HTTP error 504.
Error in value[[3L]](cond) : No payload acquired
From what I've gathered, this may be an issue with the server I'm running the code on. I have recently moved, and the issue arose afterward, so this may be a likely culprit. I'm just hoping someone may have a more direct answer for me or at least be able to point me to the right resources. Thank you!
Answer:
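I think you're just requesting too much data at once. HTTP 504 is a gateway timeout, so the server most likely gave up before it could put together a response that large.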
Here's a solution that limits the API request to each day and then stacks your data, which should pull the exact same stats:
library(tidyverse)
library(baseballr)
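# Split the date range into day-sized requests and stack the results.
statcast_bind_rows <- function(start_date, end_date, player_type) {
  start <- as.Date(start_date)
  end <- as.Date(end_date) - 1
  # Request start dates, and the matching end dates one day later.
  range <- seq(start, end, "days")
  range_offset <- seq(start + 1, end + 1, "days")
  # One statcast_search() call per pair; map2_df() row-binds the results.
  stats_list <- map2_df(range, range_offset, function(x, y) {
    baseballr::statcast_search(
      start_date = x,
      end_date = y,
      player_type = player_type
    )
  })
  return(stats_list)
}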
statcast_bind_rows(start_date = "2022-09-13", end_date = "2022-09-20", player_type = "pitcher")
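If the timeouts still show up now and then, a variant that retries each day may help. This is only a sketch: `statcast_by_day`, `fetch`, `max_tries`, and `wait` are names I made up for illustration and are not part of baseballr. It assumes `statcast_search` treats `start_date` and `end_date` as inclusive, so passing the same date for both should give one day per request with no overlap between requests. It uses base R `rbind` to stack the pieces, which assumes every day comes back with the same columns.

```r
# Hypothetical helper (not part of baseballr): one request per calendar day,
# retrying a failed day a few times before giving up on it.
# `fetch` defaults to baseballr::statcast_search; it is a parameter so the
# function can be tried out with a stand-in that needs no network access.
statcast_by_day <- function(start_date, end_date, player_type,
                            fetch = baseballr::statcast_search,
                            max_tries = 3, wait = 2) {
  # Character dates ("YYYY-MM-DD"), one per day in the inclusive range.
  days <- format(seq(as.Date(start_date), as.Date(end_date), by = "day"))

  pieces <- lapply(days, function(day) {
    for (attempt in seq_len(max_tries)) {
      # Turn an error (such as the 504) into NULL so the loop can retry.
      result <- tryCatch(
        fetch(start_date = day, end_date = day, player_type = player_type),
        error = function(e) NULL
      )
      if (!is.null(result)) return(result)
      if (attempt < max_tries) Sys.sleep(wait)  # pause before the next try
    }
    warning("No data retrieved for ", day)
    NULL
  })

  # Drop the days that never succeeded, then stack the rest.
  pieces <- Filter(Negate(is.null), pieces)
  do.call(rbind, pieces)
}
```

It would be called the same way as the function above, for example statcast_by_day("2022-09-13", "2022-09-20", "pitcher"). Any day that fails every attempt is skipped with a warning rather than stopping the whole download.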