I'm trying to compare a theoretical model to historical data. My plan is to input the real world data into the equation from a few pandas data frames with Gekko(skip to #EQUATION#):
import pandas as pd
import pylab as plt
import numpy as np
from gekko import GEKKO
#DATA#
#oil supply
data = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=pet&s=mcrfpus2&f=m')
s=data[4]
sply = s.loc[s['Year'] >= 1984]
supplyvalues = sply.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val2 = supplyvalues.dropna()
protoval3 = supplyvalues.shift(1)
val3=protoval3.dropna()
#price of oil
p = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=EMA_EPM0_PTG_NUS_DPG&f=M')
Op=p[4]
prc = Op.loc[Op['Year'] >= 1984]
pricevalues = prc.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val1 = pricevalues.dropna()
#EQUATION#
m=GEKKO()
tm = np.linspace(0,10,50)
t=m.Param(value = tm)
m.time = tm
m.options.IMODE=4
#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1.values
b.value = val2.values
c.value = val3.values
x = m.Var()
m.Equation(a == c-a*b)
m.solve(disp=False)
Issue is that every time I try it, I get this error:
Exception: Data arrays must have the same length, and match time discretization in dynamic problems
I checked the shape of the data frames and they are equal, so I'm not sure what's going wrong. Kind of annoying since I feel like I'm really close. I would really appreciate any help you can give!
p.s: I'm still a beginner with programming so I'd also really appreciate if you can describe what your new lines of code do(if you have any). This is totally optional!
CodePudding user response:
The parameters val1
, val2
, and val3
are 38x12
matrices.
Jan Feb Mar Apr May ... Aug Sep Oct Nov Dec
1 0.906 0.902 0.907 0.929 0.934 ... 0.892 0.897 0.905 0.899 0.880
3 0.846 0.836 0.871 0.924 0.944 ... 0.940 0.919 0.908 0.917 0.919
4 0.893 0.805 0.654 0.591 0.638 ... 0.555 0.562 0.532 0.532 0.542
5 0.597 0.621 0.627 0.649 0.663 ... 0.716 0.705 0.697 0.694 0.674
6 0.649 0.633 0.625 0.660 0.684 ... 0.718 0.700 0.680 0.676 0.661
A few modifications are needed to give a successful solution.
- Select just one month or align the annual data to the time discretization points.
#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1['Jan'].values
b.value = val2['Jan'].values
c.value = val3['Jan'].values
If multiply months of simulation are needed then place each parameter into a vector of values.
- Align the time points with the inputs for
val1
,val2
, andval3
.
tm = np.linspace(0,10,38)
- Check the equations.
Each month has 38
time points so the time discretization also needs this many time points. The equation doesn't look complete - a dynamic simulation typically has differential equations such as x.dt()==-x a - b c
.
- The import needed the
lxml
package. Installed with:
pip install lxml
Here is a complete script that solves successfully, but isn't the intended simulation. Perhaps it is a good starting point for developing the simulation problem.
import pandas as pd
import pylab as plt
import numpy as np
from gekko import GEKKO
#DATA#
#oil supply
data = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=pet&s=mcrfpus2&f=m')
s=data[4]
sply = s.loc[s['Year'] >= 1984]
supplyvalues = sply.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val2 = supplyvalues.dropna()
protoval3 = supplyvalues.shift(1)
val3=protoval3.dropna()
#price of oil
p = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=EMA_EPM0_PTG_NUS_DPG&f=M')
Op=p[4]
prc = Op.loc[Op['Year'] >= 1984]
pricevalues = prc.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val1 = pricevalues.dropna()
print(np.shape(val1.values))
#EQUATION#
m=GEKKO(remote=False)
tm = np.linspace(0,10,38)
print(len(tm))
t=m.Param(value = tm)
m.time = tm
m.options.IMODE=4
#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1['Jan'].values
b.value = val2['Jan'].values
c.value = val3['Jan'].values
x = m.Var()
m.Equation(x == c-a*b)
m.solve(disp=True)