Creating a loop that calculates the weighted spread for ask and bid on a roundtrip of 800

Time:06-22

I am a beginner at R (less than 2 years experience) and have been searching in the forums for an answer to my question.

I have a financial stock dataset similar to the following:

sign  reference    price  total_volume  type  PM
   1  80407_34260  16.78           100  bid   16.81
   2  80407_34260  16.77           500  bid   16.81
   3  80407_34260  16.75          1064  bid   16.81
  -1  80407_34260  16.84           200  ask   16.81
  -2  80407_34260  16.85           255  ask   16.81
  -3  80407_34260  16.89          2200  ask   16.81
   1  80407_34320  16.76           185  bid   16.795
   2  80407_34320  16.75          1064  bid   16.795
   3  80407_34320  16.74           100  bid   16.795
  -1  80407_34320  16.83           100  ask   16.795
  -2  80407_34320  16.84          1800  ask   16.795
  -3  80407_34320  16.85           100  ask   16.795

Where PM is the midpoint.

I need to calculate the weighted relative spread for each reference for roundtrips of 100 and 800.

The weighted spread is the difference between the weighted ask price and the weighted bid price for the roundtrip volume.

A manual example for reference 80407_34260 in the 100 roundtrip would be:

-> S_A(100) = (100*16.84)/100 = 16.84

-> S_B(100) = (100*16.78)/100 = 16.78

And for the 800 roundtrip:

-> S_A(800) = (200*16.84 + 255*16.85 + (800-255-200)*16.89)/800

-> S_B(800) = (100*16.78 + 500*16.77 + (800-500-100)*16.75)/800

So to weight, you multiply each level's total_volume by its quote (price), going level by level and taking only as much volume as is still needed to cover the roundtrip volume.

The relative spread would then be (S_A(800) - S_B(800))/PM
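The manual calculation above can be checked with a small base R sketch. The helper name `weighted_price` is hypothetical (it does not appear in the question), and the sketch assumes the levels are passed in order from best to worst price and that the side has enough volume to cover the roundtrip.

```r
# Hypothetical helper: volume-weighted price for filling `roundtrip` units,
# walking the book levels in the order given (best price first).
# Assumes sum(volume) >= roundtrip; no check is made here.
weighted_price <- function(price, volume, roundtrip) {
  prior <- cumsum(volume) - volume                   # volume available before each level
  take  <- pmax(0, pmin(volume, roundtrip - prior))  # units taken at each level
  sum(take * price) / roundtrip
}

# Reference 80407_34260 from the sample data
ask_price <- c(16.84, 16.85, 16.89)
ask_vol   <- c(200, 255, 2200)
bid_price <- c(16.78, 16.77, 16.75)
bid_vol   <- c(100, 500, 1064)
pm        <- 16.81

s_a <- weighted_price(ask_price, ask_vol, 800)   # 16.86475
s_b <- weighted_price(bid_price, bid_vol, 800)   # 16.76625
rel_spread <- (s_a - s_b) / pm                   # relative spread for the 800 roundtrip
```

For the 800 roundtrip only 345 units are taken from the third ask level and 200 from the third bid level, which is the "as long as it covers the roundtrip" rule expressed with `pmin` and `pmax`.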

My trouble is that I have to create a function containing a loop that goes through each reference and calculates both S_A and S_B for a given roundtrip. I want two columns per reference, one for S_A and one for S_B. I am not sure how to create this, and an added difficulty is that references have different numbers of rows.

I also have the requirement that if the sum of total_volume for a reference is less than the roundtrip, the whole reference (and all its associated rows) should be deleted.
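One way to express this deletion rule, sketched in base R on toy data. The function name `drop_thin_references` and the toy values are made up for illustration. The sketch also assumes the volume check applies per side (bid and ask separately) and that a reference is dropped when either side falls short; the question does not say this explicitly.

```r
# Toy data (not from the question): reference "B" has only 300 units on the ask side.
book <- data.frame(
  reference    = c("A", "A", "A", "A", "B", "B", "B"),
  type         = c("bid", "bid", "ask", "ask", "bid", "bid", "ask"),
  price        = c(16.78, 16.77, 16.84, 16.85, 16.76, 16.75, 16.83),
  total_volume = c(500, 400, 600, 300, 700, 200, 300)
)

# Hypothetical helper: remove every row of a reference that cannot cover the roundtrip.
drop_thin_references <- function(book, roundtrip) {
  # Total volume per (reference, type), repeated on every row of that group
  side_vol <- ave(book$total_volume, book$reference, book$type, FUN = sum)
  # Assumption: a reference is dropped if any of its sides is below the roundtrip
  thin <- unique(book$reference[side_vol < roundtrip])
  book[!book$reference %in% thin, , drop = FALSE]
}

kept <- drop_thin_references(book, 800)   # only the rows of reference "A" remain
```

No explicit counter is needed here: `ave()` computes the group totals in one pass, and the filter runs before any weighted price is calculated.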

Maybe I need to create a counter for this?

I had thought of something like this:

vol_function = function(Stock, roundtrip, start, end){
  totalcounter = 0
  price = 0
  for (i in start:end){
    if (totalcounter < roundtrip){
      # Take only what is still needed to reach the roundtrip volume
      if (totalcounter + Stock$total_volume[i] > roundtrip){
        currentcounter = roundtrip - totalcounter
      } else {
        currentcounter = Stock$total_volume[i]
      }
      price = price + (currentcounter * Stock$quote[i])
      totalcounter = totalcounter + currentcounter
    }
  }
  return(price)
}

And then:

library(dplyr)  # provides %>% and add_row()

get_results = function(stock){
  start = 1
  end = 10
  # One column per name below
  data <- data.frame(matrix(ncol = 8, nrow = 0))
  names <- c("Day", "Hour", "Bid100", "Ask100", "Spread100",
             "Bid800", "Ask800", "Spread800")
  colnames(data) <- names
  while (end < nrow(stock)){
    # First block of 10 rows: bid side
    bid100 = vol_function(stock, 100, start, end)
    bid800 = vol_function(stock, 800, start, end)
    start = start + 10
    end = end + 10
    # Next block of 10 rows: ask side
    ask100 = vol_function(stock, 100, start, end)
    ask800 = vol_function(stock, 800, start, end)
    data = data %>% add_row(Day = stock$date[start],
                            Hour = stock$time[start],
                            Bid100 = bid100,
                            Ask100 = ask100,
                            Spread100 = ask100 - bid100,
                            Bid800 = bid800,
                            Ask800 = ask800,
                            Spread800 = ask800 - bid800)
    start = start + 10
    end = end + 10
  }
  return(data)
}

Answer:

I think this will do what you're looking for.

It's a function that takes the data frame, groups it by reference and type (so subsequent calculations will be within those groups), calculates how much volume there was in prior observations, figures out how much of the current row should be included to get up to but no higher than the roundtrip total, and then multiplies that by the current price.

The last part is to summarize the total amount paid divided by roundtrip. moar_digits shows more digits, since the default printing will make all the prices in the example look like 16.8 or 16.9.

If you want, this could be further reshaped to put the ask and bid in a single row.

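library(dplyr)

# df is the order-book data frame from the question
calc_roundtrip <- function(df, roundtrip) {
  df %>%
    group_by(reference, type) %>%
    mutate(
      # volume already consumed by earlier rows in this group
      prior_cuml_vol = cumsum(lag(total_volume, default = 0)),
      # portion of this row needed to reach, but not exceed, the roundtrip
      cur_incl = pmax(0, pmin(total_volume, roundtrip - prior_cuml_vol)),
      paid = cur_incl * price
    ) %>%
    # volume-weighted price per reference and side
    summarize(total = sum(paid) / roundtrip) %>%
    mutate(moar_digits = formatC(total, digits = 6)) %>%
    # within a reference the rows are ask, then bid, so this is bid minus ask
    mutate(spread = total - lag(total))
}

calc_roundtrip(df, 800)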


#`summarise()` has grouped output by 'reference'. You can override using the `.groups`
#argument.
## A tibble: 4 × 5
## Groups:   reference [2]
#  reference   type  total moar_digits  spread
#  <chr>       <chr> <dbl> <chr>         <dbl>
#1 80407_34260 ask    16.9 16.8648     NA     
#2 80407_34260 bid    16.8 16.7662     -0.0985
#3 80407_34320 ask    16.8 16.8388     NA     
#4 80407_34320 bid    16.8 16.7523     -0.0864
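
As mentioned above, the result could be reshaped so that ask and bid sit in a single row per reference. The following is only a sketch: it uses base R `merge()` rather than dplyr, it retypes the `moar_digits` values from the output above (so they are rounded), and the object names `res`, `pm` and `wide` are made up for illustration. With both sides in one row, the spread can be taken as ask minus bid and divided by PM to get the relative spread the question asks for.

```r
# Summary values retyped from the output above (rounded to 6 significant digits)
res <- data.frame(
  reference = c("80407_34260", "80407_34260", "80407_34320", "80407_34320"),
  type      = c("ask", "bid", "ask", "bid"),
  total     = c(16.8648, 16.7662, 16.8388, 16.7523)
)

# Midpoints per reference, taken from the question's sample data
pm <- data.frame(
  reference = c("80407_34260", "80407_34320"),
  PM        = c(16.81, 16.795)
)

# Split by side and join on reference: one row per reference
ask  <- res[res$type == "ask", c("reference", "total")]
bid  <- res[res$type == "bid", c("reference", "total")]
wide <- merge(ask, bid, by = "reference", suffixes = c("_ask", "_bid"))
wide <- merge(wide, pm, by = "reference")

wide$spread     <- wide$total_ask - wide$total_bid   # ask minus bid, so positive
wide$rel_spread <- wide$spread / wide$PM             # relative to the midpoint
```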