R to Python stochastic process translation-CodePudding

I want to translate the R code below into Python.

It is mainly a stochastic process that i need to translate it into Python.

The code implements a markov chain simulation of a jump process with two volatility stages.

set.seed(42)

nSim   <- 1E5
tau    <- 3
K      <- 105
S0     <- 100
rf     <- 0.05
vol_lo <- 0.25
vol_hi <- 0.75
lambda <- c(3,2) # away-from-lo, away-from-hi

sim_time_in_lo <- function(state0){
  t <- 0
  s <- state0
  time_lo <- 0
  while (t<tau){
    dt <- rexp(n=1,lambda[s])
    if ((t dt)>tau){ dt <- tau - t}
    if (s==1){time_lo <- time_lo   dt }
    if (s==1){s<-2} else {s <-1}
    t <- t  dt
  }
  time_lo
}
tau_lo     <- sapply(1:nSim,function(i){sim_time_in_lo(1)})
tau_hi     <- tau - tau_lo
total_var  <- tau_lo * vol_lo^2   tau_hi * vol_hi^2
drift      <- rf * tau-0.5*total_var
randomness <- sqrt(total_var)*rnorm(nSim,)

the Python attempt is the following:

I think that i am mis using the sapply like function in pandas.

import numpy as np
import pandas as pd


np.random.seed(42)

nSim   = 1000
tau    = 3
K      = 105
S0     = 100
r      = 0.05
vol_lo = 0.25
vol_hi = 0.75
lambd = [3,2] # away-from-lo, away-from-hi

def sim_time_in_lo(state0):
    t = 0
    s = state0
    time_lo = 0
    while (t<tau):
        dt = np.random.exponential(scale = lambd[s],size = nSim)
        if (t dt)>tau:
            dt = tau - t
        elif s == 1:
                time_lo = time_lo   dt
        elif s == 1:
            s = 2
        else : s = 1
        t = t  dt
        time_lo

df = pd.DataFrame([sim_time_in_lo(1) for i in range(1,nSim)]);df
tau_hi     = tau - tau_lo
total_var  = tau_lo * vol_lo**2   tau_hi * vol_hi**2
drift      = r * tau-0.5*total_var
randomness = np.sqrt(total_var)*np.random.normal(size=nSim)

but i am receiving an error on

df = pd.DataFrame([sim_time_in_lo(1) for i in range(1,nSim)]);df

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

What is my mistake ?

CodePudding user response：

Here are some improvements to make it run:

nSim = int(1e5)
tau = 3
K = 105
S0 = 100
r = 0.05
vol_lo = 0.25
vol_hi = 0.75
lambd = [3, 2]  # away-from-lo, away-from-hi


def sim_time_in_lo(state0):
    t = 0
    s = state0
    time_lo = 0
    while t < tau:
        dt = np.random.exponential(size=1, scale=1/lambd[s])
        if (t   dt) > tau:
            dt = tau - t
        if s == 0:
            time_lo = time_lo   dt
            s = 1
        else:
            s = 0
        t = t   dt
    if isinstance(time_lo, np.ndarray):
        return time_lo[0]
    return time_lo


tau_lo = np.array([sim_time_in_lo(0) for i in range(nSim)])
tau_hi = tau - tau_lo
total_var = tau_lo * vol_lo ** 2   tau_hi * vol_hi ** 2
drift = r * tau - 0.5 * total_var
randomness = np.sqrt(total_var) * np.random.normal(size=nSim)

nSim is 1e5, not 1000. This syntax works in Python too but you need to convert to int.
Python uses zero-indexing, unlike R, so the states need to be 0 and 1 to fetch values from lambd.
The size parameter for the exponential should be 1, to return a scalar, not nSim - this is why you were getting an error.
You have to use an explicit return statement, unlike in R (otherwise Python functions return None).
Fixed some weird indentation which also caused issues I think.
Initialise time_lo as a size-1 array to begin with, otherwise sometimes you get scalars back. Then at the end you extract the scalars from the size-1 arrays.
Use an array instead of a Dataframe. (No need to import pandas now.)
NB np.random.exponential's scale parameter is $\beta$, ie $\frac{1}{\lambda}$. R's rexp's rate is $\lambda$.

CodePudding user response：

Your error is in the condition if (t dt)>tau. dt is an array which makes (t dt)>tau an array of boolean values.

Use ((t dt)>tau).any() or ((t dt)>tau).all() to give the if statement meaning. .all() means you want every value of t dt to be greater than tau, and .any() means if just one value is greater, that's enough.