Home > Net >  How can I find documentation to explain how to_utc_timestamp defines Central Standard Time timezone
How can I find documentation to explain how to_utc_timestamp defines Central Standard Time timezone

Time:06-18

I am expecting UST to be always 6 hours ahead of CST. Apparently spark disagrees.

spark.sql("""select to_utc_timestamp('2022-02-26 17:35:36', "CST") """).show()

------------------- | time| ------------------- |2022-02-26 23:35:36| -------------------

spark.sql("""select to_utc_timestamp('2021-08-26 17:35:36', "CST") """).show()

------------------- | time| ------------------- |2021-08-26 22:35:36| -------------------

Is there some documentation that explains exactly how which timezone is calculated ? I could only find the enter image description here

CodePudding user response:

Generally speaking, the best way to understand exactly how Spark-SQL functions work is to dig through the source code, starting from org.apache.spark.sql.functions and following the breadcrumbs until you wind up here, which will show you that Spark is using the standard java.time.* classes under the hood.

This isn't going to help in your case, however, because the problem is your assumption. Java's TimeZone for CST respects daylight savings time, which will give you a different offset depending upon the DateTime you give it. You can provide a fixed offset yourself, e.g.

spark.sql("""select to_utc_timestamp('2022-02-26 17:35:36', "-0600") """).show()

but I'm unaware of any way to automatically ignore daylight savings in this context.

  • Related