I am using SQLAlchemy to read data from a SQL Server database and then turning the table data to a csv file to later hand off.
However, I noticed when there is a 1 or 0 in a SQL Server table field, the csv output has True or False instead. I know to Python, it's still a number since True == 1
returns True
Is it possible to not have this happen from the get go and have the csv keep 1 or 0?
This my current code:
import pandas as pd
from sqlalchemy import create_engine
pd.set_option('display.max_columns', None)
pd.set_option('max_colwidth', None)
df = pd.read_csv(r'''C:\Users\username\Downloads\py_tma_tables.csv''')
tma_table = df['table_name'].tolist()
servername = 'SERVER'
dbname = 'DATABASE'
sqlcon = create_engine('mssql pyodbc://@' servername '/' dbname '?driver=ODBC Driver 17 for SQL Server')
df_list = []
count = 0
while count < 1:
df1 = pd.read_sql_query("SELECT * FROM " tma_table[count], sqlcon)
df_list.append(df1)
df_list[count].to_csv(tma_table[count] ".csv", index=False, header=None, encoding='utf-8')
count = 1
I have about 450 tables that this would be applied to so single table solutions aren't going to work as I need an automated way to have every table follow this rule I'm aiming for.
I started to go down the route of trying to loop through each column and change the dtype of the column but it seems easier to not have to replace True or False to 1 or 0 in the first place.
dtypes output:
cst_pk int64
cst_code object
cst_name object
cst_clnt_fk int64
cst_active bool
cst_encumbered object
cst_purgeDate object
cst_splitBill bool
cst_subLabor_fk object
cst_subParts_fk float64
cst_subOther_fk object
cst_subContract bool
cst_subContractLabor_fk object
cst_subContractParts_fk object
cst_subContractOther_fk object
cst_balanceType object
cst_normalBalance object
cst_ay_fk int64
cst_header bool
cst_beginningBalance object
cst_alias object
cst_modifier_fk float64
cst_modifiedDate datetime64[ns]
cst_creator_fk float64
cst_createddate datetime64[ns]
cst_curr_fk object
cst_exch_fk object
cst_exch_date object
cst_ag_fk object
cst_dp_fk float64
cst_alternateAccount object
dtype: object
SQL CREATE TABLE query:
CREATE TABLE [dbo].[f_account](
[cst_pk] [int] NOT NULL,
[cst_code] [nvarchar](100) NOT NULL,
[cst_name] [nvarchar](35) NOT NULL,
[cst_clnt_fk] [int] NOT NULL,
[cst_active] [bit] NOT NULL,
[cst_encumbered] [decimal](10, 2) NULL,
[cst_purgeDate] [datetime] NULL,
[cst_splitBill] [bit] NOT NULL,
[cst_subLabor_fk] [int] NULL,
[cst_subParts_fk] [int] NULL,
[cst_subOther_fk] [int] NULL,
[cst_subContract] [bit] NOT NULL,
[cst_subContractLabor_fk] [int] NULL,
[cst_subContractParts_fk] [int] NULL,
[cst_subContractOther_fk] [int] NULL,
[cst_balanceType] [nvarchar](20) NULL,
[cst_normalBalance] [nvarchar](20) NULL,
[cst_ay_fk] [int] NULL,
[cst_header] [bit] NOT NULL,
[cst_beginningBalance] [decimal](10, 2) NULL,
[cst_alias] [nvarchar](32) NULL,
[cst_modifier_fk] [int] NULL,
[cst_modifiedDate] [datetime] NULL,
[cst_creator_fk] [int] NULL,
[cst_createddate] [datetime] NULL,
[cst_curr_fk] [int] NULL,
[cst_exch_fk] [int] NULL,
[cst_exch_date] [datetime] NULL,
[cst_ag_fk] [int] NULL,
[cst_dp_fk] [int] NULL,
[cst_alternateAccount] [nvarchar](100) NULL
CodePudding user response:
As explained in
https://github.com/mkleehammer/pyodbc/issues/383
pyodbc returns bit
columns as boolean values because that is what most people use bit
columns for.
import pandas as pd
import sqlalchemy as sa
engine = sa.create_engine("mssql pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """\
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 False
2 1 True
"""
If you want pyodbc to return bit
columns as some other type you can use an output converter function as illustrated in the GitHub issue. The trick is getting SQLAlchemy to use it. That is done using an event listener.
import pandas as pd
import pyodbc
import sqlalchemy as sa
engine = sa.create_engine("mssql pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """\
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
def handle_bit_type(bit_value):
return bit_value
@sa.event.listens_for(engine, "connect")
def connect(conn, rec):
conn.add_output_converter(pyodbc.SQL_BIT, handle_bit_type)
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 b'\x00'
2 1 b'\x01'
"""
Edit: Or, if you use
def handle_bit_type(bit_value):
if bit_value is None:
rtn = None
elif bit_value == b"\x00":
rtn = "0"
else:
rtn = "1"
return rtn
you'll get
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 0
2 1 1
"""