At my company we have a SQL database that we connect to via pyodbc. Here, when we run a query, pyodbc returns a list of tuples. Typically, such a list contains >10 tuples. Here is an example of a possible output we get:
OUTPUT =
[(datetime.datetime(2003, 3, 26, 15, 12, 15), '490002_space'),
(datetime.datetime(2003, 3, 27, 16, 13, 14), '490002_space')]
My wish is to remove '_space' and 'datetime.datetime(...)' parts in all tuples we receive in OUTPUT. Eventually I hope to pass the new list of tuples to a pandas dataframe. I was hoping to get any of your advice on how to efficiently modify OUTPUT to DESIRED_OUTPUT:
DESIRED_OUTPUT:
[('2003, 3, 26, 15, 12, 15', '490002'),
('2003, 3, 27, 16, 13, 14', '490002')]
Really hope to hear from you.
Greetings, Jerome
CodePudding user response:
Here is a possible solution:
result = [(str(dt.timetuple()[:6])[1:-1], s.split('_')[0]) for dt, s in OUTPUT]
CodePudding user response:
Eventually I hope to pass the new list of tuples to a pandas dataframe.
You can use .read_sql_query()
to pull the information directly into a DataFrame:
import pandas as pd
import sqlalchemy as sa
connection_url = sa.engine.URL.create(
"mssql pyodbc",
username="scott",
password="tiger^5HHH",
host="192.168.0.199",
database="test",
query={
"driver": "ODBC Driver 18 for SQL Server",
"TrustServerCertificate": "Yes",
}
)
engine = sa.create_engine(connection_url)
table_name = "so71297370"
# set up example environment
with engine.begin() as conn:
conn.exec_driver_sql(f"DROP TABLE IF EXISTS {table_name}")
conn.exec_driver_sql(f"CREATE TABLE {table_name} (col1 datetime2, col2 nvarchar(50))")
conn.exec_driver_sql(f"""\
INSERT INTO {table_name} (col1, col2) VALUES
('2003-03-26 15:12:15', '490002_space'),
('2003-03-27 16:13:14', '490002_space')
""")
# example
df = pd.read_sql_query(
# suffix '_space' is 6 characters in length
f"SELECT col1, LEFT(col2, LEN(col2) - 6) AS col2 FROM {table_name}",
engine,
)
print(df)
"""
col1 col2
0 2003-03-26 15:12:15 490002
1 2003-03-27 16:13:14 490002
"""