How do I correctly load data in Python?

Time:10-02

I am trying to replicate @miabrahams ACM model which is on Github here: https://github.com/miabrahams/PricingTermStructure

I'm running into two errors related to how I load the data. I'm a Python novice, so I'm sure the fix is simple, but I can't figure it out.

The problem is this:

In load_gsw.py, their file for loading the data, they define a function:

def load_gsw(filename: str, n_maturities: int):
    data = pd.read_excel(filename, parse_dates=[0])
    data = data.set_index('Date')
    data = data.resample('BM').last() # Convert to EOM observations
    data.index = data.index + DateOffset() # Add one day
    plot_dates = pd.DatetimeIndex(data.index).to_pydatetime() # pydatetime is best for matplotlib

Leaving the term filename in place yields an error when I run the script, so I am pretty sure that I need to replace filename with the data file they've provided. I therefore added a line like this:

filename = '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx'

def load_gsw(filename: str, n_maturities: int):
    data = pd.read_excel(filename, parse_dates=[0])
    data = data.set_index('Date')
    data = data.resample('BM').last()  # Convert to EOM observations
    data.index = data.index + DateOffset()  # Add one day
    plot_dates = pd.DatetimeIndex(data.index).to_pydatetime()  # pydatetime is best for matplotlib

However, in their script for running the model, PricingTermStructure.ipynb, they call the function described above with a different path:

from load_gsw import * 

rawYields, plot_dates = load_gsw('data/gsw_ns_params.xlsx', n_maturities)
t = rawYields.shape[0] - 1  # Number of observations

I have tried not defining filename, and also swapping 'data/gsw_ns_params.xlsx' with '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx', but I keep getting the same error:

  File "/Users/SystemError/Desktop/Python/acm model.py", line 60, in <module>
    rawYields, plot_dates = load_gsw('data/gsw_ns_params.xlsx', n_maturities)

Any idea what I'm doing wrong? Thanks in advance for whatever assistance you can provide!

CodePudding user response:

You have to use it like this:

filename = '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx'
rawYields, plot_dates = load_gsw(filename, n_maturities)

In the first code block of your example, filename is a function parameter, and it is local to the function. It is a placeholder that receives whatever value or variable you pass when you call the function. You have also defined a global variable named filename, so you have to pass it in the function call. Alternatively, put the path directly in the call, like this:

rawYields, plot_dates = load_gsw('/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx', n_maturities)
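To make the difference between the parameter and the global variable concrete, here is a minimal sketch. Note that describe_file is a hypothetical stand-in for load_gsw (it only formats a string and never opens the file), and 10 is an arbitrary placeholder for n_maturities; neither comes from the original repository.

```python
from pathlib import Path


def describe_file(filename: str, n_maturities: int) -> str:
    # Hypothetical stand-in for load_gsw: 'filename' is a parameter, so it
    # only receives a value at the moment the function is called.
    return f"would load {Path(filename).name} with {n_maturities} maturities"


# A module-level (global) variable that happens to share the parameter's name.
filename = "/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx"

# Defining the global does not feed it into the function automatically;
# it still has to be passed as an argument in the call.
result_from_variable = describe_file(filename, 10)  # 10 is an assumed value

# Passing a string literal works the same way; the parameter receives it.
result_from_literal = describe_file("data/gsw_ns_params.xlsx", 10)

print(result_from_variable)
print(result_from_literal)
```

Both calls reach the function through its parameter, so defining filename above the def line has no effect unless that variable (or a literal path) also appears in the call itself.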