I wrote a Python script that pulls data from a government website, formats it, and then dumps it into an Access table.
I'm using SQLAlchemy and pyodbc to import the data; however, whenever my data contains an integer column, I get the dreaded "pyodbc.Error: ('HYC00', '[HYC00] [Microsoft][ODBC Access Database Driver]Optional feature not implemented (0) (SQLBindParameter)')" error message.
Does anyone know a way around this error that would let me import my data as already formatted, even when a column holds integers? I understand the usual workaround is to convert the column to float, but I don't want floats. Are there any other options?
Here is my code for testing:
import pandas as pd
from pandas import DataFrame
import numpy as np
import re as re
import pyodbc
from sqlalchemy import create_engine
# Download zipfile from BOEM website, extract and save file to temp folder
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile
zipurl = 'https://www.data.boem.gov/Well/Files/5010.zip'
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        zfile.extractall('/temp/leasedata')
# Import fixed field file to Pandas for formatting
# Define column spacing of fixed field file
colspecs = [(0, 12), (13, 17), (18, 26), (27, 31), (31, 37), (39, 47), (47, 53), (57, 62), (62, 67),
            (67, 73), (84, 86), (86, 92), (104, 106), (106, 112), (112, 120), (120, 128), (131, 134), (134, 139),
            (140, 155), (156, 171), (172, 187), (188, 203), (203, 213)]
df = pd.read_fwf('/temp/leasedata/5010.DAT', colspecs=colspecs, header=None)
# Add column headers
df.columns = ['API', 'WellName', 'Suffix', 'OprNo', 'BHFldName', 'SpudDate', 'BtmOCSLse', 'RKBElev', 'TotalMD',
              'TVD', 'SurfArea', 'SurfBlock', 'BHArea', 'BHBlock', 'TDDate', 'StatusDate', 'StatusCode', 'WaterDepth',
              'SurfLon', 'SurfLat', 'BHLon', 'BHLat', 'SurfOCSLse']
# Load dataframe into new temp table in database
# Connect to OOSA Access database. Make sure to create a User DSN directly to OOSA database before running script
conn = create_engine("access+pyodbc://@OOSA")
print(df)
df.to_sql('borehole_temp_table', conn, if_exists='replace')
Thanks for any assistance!
CodePudding user response:
"I understand the way around this is to format the column to float, but I don't want float. Are there any other options?"
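Before choosing a workaround, it can help to find out which columns are actually affected. My understanding, which you should treat as an assumption rather than documented behavior, is that pyodbc binds Python integers that fit in 32 bits as a plain integer and larger ones as a 64-bit BIGINT, which the Access ODBC driver rejects. A 12-digit value like those in the API column would fall into the second group. A minimal sketch of that check in plain Python, where the helper names and the sample values are made up for illustration:

```python
# Signed 32-bit bounds (the range of an Access "Long Integer" field).
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def exceeds_int32(values):
    """Return True if any integer in `values` falls outside the signed 32-bit range."""
    return any(not (INT32_MIN <= int(v) <= INT32_MAX) for v in values)


def columns_needing_text(columns):
    """Given a mapping of column name -> iterable of integers,
    return the names of the columns holding at least one out-of-range value."""
    return [name for name, values in columns.items() if exceeds_int32(values)]


# Hypothetical sample data, not taken from the real BOEM file.
sample = {
    "API": [177004000100, 177004000200],  # 12-digit identifiers
    "WaterDepth": [45, 1200],             # small values
}
print(columns_needing_text(sample))  # ['API']
```

With a DataFrame, passing `{c: df[c] for c in df.columns if pd.api.types.is_integer_dtype(df[c])}` to `columns_needing_text` should give the same kind of answer. Only the columns it reports would need one of the workarounds; the other integer columns can probably stay as they are.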
From the sqlalchemy-access wiki:
Workarounds include saving the column as ShortText …
import sqlalchemy_access as sa_a
# …
df = pd.DataFrame(
    [
        (12345678901,),
        (-12345678901,),
    ],
    columns=["column1"],
)
df["column1"] = df["column1"].astype(str)
dtype_dict = {'column1': sa_a.ShortText(20)}
df.to_sql("my_table", engine, index=False, if_exists="replace", dtype=dtype_dict)
… or as Decimal
df = pd.DataFrame(
    [
        (12345678901,),
        (-12345678901,),
    ],
    columns=["column1"],
)
df["column1"] = df["column1"].astype(str) # still need to convert the column to string!
dtype_dict = {'column1': sa_a.Decimal(19, 0)}
df.to_sql("my_table", engine, index=False, if_exists="replace", dtype=dtype_dict)
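The sizes in both examples (20 characters, precision 19) are comfortably larger than the sample values need. If you would rather size the column from your own data, a rough sketch follows. The helper names are hypothetical, and it assumes the column holds only integers; Access limits Short Text to 255 characters and Decimal precision to 28 digits, so anything beyond that would need a different approach.

```python
def text_width(values):
    """Length of the longest string form of the integers, counting a leading minus sign."""
    return max((len(str(int(v))) for v in values), default=1)


def decimal_precision(values):
    """Largest number of digits among the integers, ignoring the sign."""
    return max((len(str(abs(int(v)))) for v in values), default=1)


# Same sample values as in the examples above.
values = [12345678901, -12345678901]
print(text_width(values))         # 12 (11 digits plus the minus sign)
print(decimal_precision(values))  # 11
```

The first number is a lower bound for the `ShortText` length and the second for the `Decimal` precision (with a scale of 0); leaving some headroom for future data is probably wise.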