Loop over API calls in R using dataframe reference

Time:08-13

I work with a clinical trials management site where I need to copy a number from one field (the internal account number) to another (the hospital account number) for a couple hundred protocols, each of which is an individual page on the site. (The original post showed a screenshot of the two fields here.)

I have figured out how to use an API call to get everything from this particular page:

library(jsonlite)
library(httr)

token <- "12345"
base <- "https://mywebsite.website.com"
endpoint <- "/website-api/rest/protocolManagementDetails/"
protocol <- "2506"   # This is the 'protocolID' of that particular page

# Build the full URL for this protocol's page
call2 <- paste0(base, endpoint, protocol)

httpResponse <- GET(call2, add_headers(authorization = token))
results <- fromJSON(content(httpResponse, "text"))

results

This returns the protocol's record. (The original post showed a screenshot of the output here.)

I also know how to modify that result and push a new hospitalAccountNo back manually:

results$hospitalAccountNo <- "654321"

base <- "https://mywebsite.website.com"
endpoint <- "/website-api/rest/protocolManagementDetails/"
protocol <- "2506"

call2 <- paste0(base, endpoint, protocol)

httpResponse <- PUT(call2,
                    add_headers(authorization = token), body = results, encode = "json", verbose())

So if I were to go through one protocol at a time manually, that's easy! I'd plug the number for each protocol in, GET it, change the hospital account number, PUT it and call it a day. But I'd love to automate that process using the numbers from a dataframe like this:

df<-structure(list(PROTOCOL_ID = c(1, 22, 543, 421, 55, 6), PROTOCOL_NO = c("CTSU-E1234", 
"BRUOG-j-1234", "tp-P-bob61", "PFIZER-T", "Jimbo", 
"INCONGRUENCE"), LIBRARY = c("Non-Oncology", "Oncology", "Non-Oncology", 
"Oncology", "Oncology", "Non-Oncology")), row.names = c(NA, 6L), class = "data.frame")

So to summarize:

I'd love to loop through the 'df' dataframe, swapping in each df$PROTOCOL_ID for the 'protocol' part of the API calls.
So the loop would 'GET' protocol 1, take the number from internalAccountNo, copy it into hospitalAccountNo, then 'PUT' protocol 1, then 'GET' and 'PUT' 22, then 543, and so on until it runs out of numbers. Does that make sense?

Answer:

It seems like all you need to do is wrap the code you have in a function, then use something such as purrr::walk to iterate over all the protocol ids.

library(httr)
library(jsonlite)  # provides fromJSON(), used in the function below

token <- "12345"
base <- "https://mywebsite.website.com"
endpoint <- "/website-api/rest/protocolManagementDetails/"

df <- structure(
  list(
    PROTOCOL_ID = c(1, 22, 543, 421, 55, 6), 
    PROTOCOL_NO = c("CTSU-E1234", "BRUOG-j-1234", "tp-P-bob61", "PFIZER-T", "Jimbo", "INCONGRUENCE"), 
    LIBRARY = c("Non-Oncology", "Oncology", "Non-Oncology", "Oncology", "Oncology", "Non-Oncology")
  ), 
  row.names = c(NA, 6L), 
  class = "data.frame"
)

UpdateAccountNumbers <- function(protocol){
  
  # Build the URL for this protocol's page (used for both GET and PUT)
  call2 <- paste0(base, endpoint, protocol)
  
  # Fetch the current record
  httpResponse <- GET(call2, add_headers(authorization = token))
  results <- jsonlite::fromJSON(content(httpResponse, "text"))
  
  # Copy the internal account number into the hospital account number
  results$hospitalAccountNo <- results$internalAccountNo
  
  # Send the modified record back
  httpResponse <- PUT(
    call2, 
    add_headers(authorization = token), 
    body = results, encode = "json", 
    verbose()
  )
  
  # Return the PUT response invisibly so callers can inspect it if needed
  invisible(httpResponse)
}

purrr::walk(df$PROTOCOL_ID, UpdateAccountNumbers)
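
Before sending any PUT requests, it may be worth doing a dry run that only builds the URLs, so you can check them by eye. This is a minimal sketch, assuming the same base, endpoint and protocol IDs as above; build_urls is a hypothetical helper name, not part of httr. It formats the IDs explicitly because PROTOCOL_ID is stored as a double, and R renders some round doubles in scientific notation when pasting (as.character(100000) gives "1e+05"), which would yield a wrong URL.

```r
base <- "https://mywebsite.website.com"
endpoint <- "/website-api/rest/protocolManagementDetails/"

# Same IDs as df$PROTOCOL_ID above (stored as doubles)
protocol_ids <- c(1, 22, 543, 421, 55, 6)

# Hypothetical helper: build one URL per protocol ID without calling the API
build_urls <- function(ids) {
  # sprintf("%.0f") writes whole numbers without scientific notation
  paste0(base, endpoint, sprintf("%.0f", ids))
}

urls <- build_urls(protocol_ids)
print(urls)
```

If the printed URLs look right, the same formatting could be used inside UpdateAccountNumbers in place of pasting the raw number.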

Be aware that the website may enforce a request rate limit. If you have a long list of protocols, you might exceed it when automating the process. There are likely some helper functions in httr that could clean up how you build the URLs, but what you have should work (I can't test it, since I don't have access to the actual site).
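
If a rate limit does turn out to be a problem, one option is to pause between requests and record failures instead of stopping at the first one. The sketch below is only an illustration in base R: run_throttled, its delay argument and fake_update are hypothetical names, and the half-second default pause is an assumption, not a documented limit of the site. fake_update stands in for UpdateAccountNumbers so the example runs without network access.

```r
# Hypothetical wrapper: call update_fn once per ID, pausing between calls
# and recording failures instead of stopping at the first error.
run_throttled <- function(ids, update_fn, delay = 0.5) {
  ok <- logical(length(ids))
  msg <- character(length(ids))
  for (i in seq_along(ids)) {
    outcome <- tryCatch(
      {
        update_fn(ids[i])
        NULL                                # NULL signals success
      },
      error = function(e) conditionMessage(e)
    )
    ok[i] <- is.null(outcome)
    msg[i] <- if (is.null(outcome)) "" else outcome
    # Assumed pause between requests; tune to the site's actual limit
    if (i < length(ids)) Sys.sleep(delay)
  }
  data.frame(id = ids, ok = ok, message = msg, stringsAsFactors = FALSE)
}

# Stand-in for UpdateAccountNumbers so the sketch runs without the real site
fake_update <- function(id) {
  if (id == 543) stop("simulated HTTP error")
  invisible(id)
}

result_log <- run_throttled(c(1, 22, 543), fake_update, delay = 0)
print(result_log)
```

With the real function, the call would be run_throttled(df$PROTOCOL_ID, UpdateAccountNumbers). Note that httr's GET and PUT do not raise R errors for 4xx or 5xx responses on their own, so the function would need to call stop_for_status() on each response for failed requests to show up in the log.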
