Estoy tratando de ajustar los datos de la serie de tiempo en un proceso de Ornstein-Uhlenbeck. Aquí está mi código hasta ahora:
# source for computation: https://arxiv.org/pdf/1411.5062.pdf
import math
from math import sqrt, exp, log # exp(n) == e^n, log(n) == ln(n)
import scipy.optimize as so
import numpy as np
def __compute_log_likelihood(params, *args):
'''
Compute the average Log Likelihood, this function will by minimized by scipy.
Find in (2.2) in linked paper
returns: the average log likelihood from given parameters
'''
# functions passed into scipy's minimize() needs accept one parameter, a tuple of
# of values that we adjust to minimize the value we return.
# optionally, *args can be passed, which are values we don't change, but still want
# to use in our function (e.g. the measured heights in our sample or the value Pi)
theta, mu, sigma = params
X, dt = args
n = len(X)
sigma_tilde_squared = sigma ** 2 * (1 - exp(-2 * mu * dt)) / 2 * mu
summation_term = 0
for i in range(1, len(X)):
summation_term += (X[i] - X[i - 1] * exp(-mu * dt) - theta * (1 - exp(-mu * dt))) ** 2
summation_term = -summation_term / (2 * n * sigma_tilde_squared)
log_likelihood = (-log(2 * math.pi) / 2) + (-log(sqrt(sigma_tilde_squared))) + summation_term
return -log_likelihood
# since we want to maximize this total log likelihood, we need to minimize the
# negation of the this value (scipy doesn't support maximize)
def estimate_coefficients_MLE(X, dt):
'''
Estimates Ornstein-Uhlenbeck coefficients (, µ, ) of the given array
using the Maximum Likelihood Estimation method
input: X - array-like data to be fit as an OU process
returns: , µ, , Total Log Likelihood
'''
bounds = ((0, None), (None, None), (0, None)) # theta > 0, mu , sigma > 0
mu_init = np.mean(X)
result = so.minimize(__compute_log_likelihood, (1e-6, 1e-6, 1e-6), args=(X, dt), bounds=bounds)
theta, mu, sigma = result.x
max_log_likelihood = -result.fun # undo negation from __compute_log_likelihood
return theta, mu, sigma, max_log_likelihood
Pero cuando simulo un proceso de OU con lo siguiente:
# simulate Ornstein-Uhlenbeck Process
import numpy as np
import matplotlib.pyplot as plt
t_0 = 0 # define model parameters
t_end = 2
length = 1000
theta = 1.1
mu = 0
sigma = 0.3
t = np.linspace(t_0,t_end,length) # define time axis
dt = np.mean(np.diff(t))
y = np.zeros(length)
y0 = np.random.normal(loc=0.0,scale=1.0) # initial condition
drift = lambda y,t: theta*(mu-y) # define drift term, google to learn about lambda
diffusion = lambda y,t: sigma # define diffusion term
noise = np.random.normal(loc=0.0,scale=1.0,size=length)*np.sqrt(dt) #define noise process
# solve SDE
for i in range(1,length):
y[i] = y[i-1] + drift(y[i-1],i*dt)*dt + diffusion(y[i-1],i*dt)*noise[i]
plt.plot(t,y)
plt.show()
A continuación, ajuste los datos (almacenados en y) utilizando mi función con:
theta, mu, sigma, max_ll = estimate_coefficients_MLE(y, 1/len(y))
Me sale un "Error de valor: error de dominio matemático" o mis coeficientes están muy desviados. Si alguien pudiera indicarme la dirección correcta, estaría muy agradecido, hay una falta de recursos en línea sobre este tema.