I am loading data from an excel file to snowflake using pandas using the below code
import pandas as pd
import snowflake.connector as snowCtx
from snowflake.connector.pandas_tools import pd_writer,write_pandas
df=pd.read_excel(open(r'SampleSuperstore.xlsx', 'rb'),sheet_name='Orders')
df.columns = df.columns.str.upper()
print(df)
conn = snowCtx.connect(
user='username',
password='password',
account='account',
database='superstore',
schema='PUBLIC'
)
table_name='ORDERS'
success, nchunks, nrows, _ = write_pandas(conn,df,table_name,
chunk_size = 300,
schema = 'PUBLIC')
print(success, nchunks, nrows)
The date column in excel is as below
Order_Date Ship_Date
08-11-2016 11-11-2016
Even in pandas dataframe it is displaying the same
Order_Date Ship_Date
08-11-2016 11-11-2016
When I query the result in snowflake, I get the result as below
ORDER_DATE SHIP_DATE
1478563200000000 1478822400000000
If I change the datatype to date in snowflake it throws me an error stating variant datatype error.
CodePudding user response:
I'm not sure if there's a way to fix that in the chain from Excel to Snowflake, but it's a Unix epoch in microseconds and this can convert it:
set orderdate = 1478563200000000;
set shipdate = 1478822400000000;
select dateadd('us',$orderdate,'1970-01-01')::date ORDERDATE
,dateadd('us',$shipdate, '1970-01-01')::date SHIPDATE
;
CodePudding user response:
As a workaround, you may convert the dates to strings. Snowflake will convert them to date implicitly:
df['ORDER_DATE'] = df['ORDER_DATE'].astype(str)
df['SHIP_DATE'] = df['SHIP_DATE'].astype(str)
table_name='ORDERS'
success, nchunks, nrows, _ = write_pandas(conn,df,table_name,
chunk_size = 300,
schema = 'PUBLIC')