Home > front end >  Plotting Gekko Model With Pandas Datafram
Plotting Gekko Model With Pandas Datafram

Time:06-19

I'm trying to compare a theoretical model to historical data. My plan is to input the real world data into the equation from a few pandas data frames with Gekko(skip to #EQUATION#):

import pandas as pd
import pylab as plt
import numpy as np
from gekko import GEKKO

#DATA#
#oil supply
data = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=pet&s=mcrfpus2&f=m')
s=data[4]
sply = s.loc[s['Year'] >= 1984]
supplyvalues = sply.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val2 = supplyvalues.dropna()
protoval3 = supplyvalues.shift(1)
val3=protoval3.dropna()
#price of oil
p = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=EMA_EPM0_PTG_NUS_DPG&f=M')
Op=p[4]
prc = Op.loc[Op['Year'] >= 1984]
pricevalues = prc.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val1 = pricevalues.dropna()

#EQUATION#
m=GEKKO()
tm = np.linspace(0,10,50)
t=m.Param(value = tm)
m.time = tm
m.options.IMODE=4

#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1.values
b.value = val2.values
c.value = val3.values
x = m.Var()
m.Equation(a == c-a*b)
m.solve(disp=False)

Issue is that every time I try it, I get this error:

Exception: Data arrays must have the same length, and match time discretization in dynamic problems

I checked the shape of the data frames and they are equal, so I'm not sure what's going wrong. Kind of annoying since I feel like I'm really close. I would really appreciate any help you can give!

p.s: I'm still a beginner with programming so I'd also really appreciate if you can describe what your new lines of code do(if you have any). This is totally optional!

CodePudding user response:

The parameters val1, val2, and val3 are 38x12 matrices.

      Jan    Feb    Mar    Apr    May  ...    Aug    Sep    Oct    Nov    Dec
1   0.906  0.902  0.907  0.929  0.934  ...  0.892  0.897  0.905  0.899  0.880
3   0.846  0.836  0.871  0.924  0.944  ...  0.940  0.919  0.908  0.917  0.919
4   0.893  0.805  0.654  0.591  0.638  ...  0.555  0.562  0.532  0.532  0.542
5   0.597  0.621  0.627  0.649  0.663  ...  0.716  0.705  0.697  0.694  0.674
6   0.649  0.633  0.625  0.660  0.684  ...  0.718  0.700  0.680  0.676  0.661

A few modifications are needed to give a successful solution.

  1. Select just one month or align the annual data to the time discretization points.
#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1['Jan'].values
b.value = val2['Jan'].values
c.value = val3['Jan'].values

If multiply months of simulation are needed then place each parameter into a vector of values.

  1. Align the time points with the inputs for val1, val2, and val3.
tm = np.linspace(0,10,38)
  1. Check the equations.

Each month has 38 time points so the time discretization also needs this many time points. The equation doesn't look complete - a dynamic simulation typically has differential equations such as x.dt()==-x a - b c.

  1. The import needed the lxml package. Installed with:
pip install lxml

Here is a complete script that solves successfully, but isn't the intended simulation. Perhaps it is a good starting point for developing the simulation problem.

import pandas as pd
import pylab as plt
import numpy as np
from gekko import GEKKO

#DATA#
#oil supply
data = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=pet&s=mcrfpus2&f=m')
s=data[4]
sply = s.loc[s['Year'] >= 1984]
supplyvalues = sply.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val2 = supplyvalues.dropna()
protoval3 = supplyvalues.shift(1)
val3=protoval3.dropna()
#price of oil
p = pd.read_html('https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=EMA_EPM0_PTG_NUS_DPG&f=M')
Op=p[4]
prc = Op.loc[Op['Year'] >= 1984]
pricevalues = prc.filter(items = ['Jan', 'Feb', 'Mar', 'Apr', 'May','Jun','Jul','Aug','Sep', 'Oct', 'Nov','Dec'])
val1 = pricevalues.dropna()

print(np.shape(val1.values))

#EQUATION#
m=GEKKO(remote=False)
tm = np.linspace(0,10,38)
print(len(tm))
t=m.Param(value = tm)
m.time = tm
m.options.IMODE=4

#forming equation
a,b,c = m.Array(m.Param,3)
a.value = val1['Jan'].values
b.value = val2['Jan'].values
c.value = val3['Jan'].values
x = m.Var()
m.Equation(x == c-a*b)
m.solve(disp=True)
  • Related