Please help understand why the date_format
does not extract 08:15
for 8:15am?
spark.sql("select date_format(date '1994-11-05T08:15:30-05:00', 'hh:mm') AS hour_in_day_number").show()
------------------
|hour_in_day_number|
------------------
| 12:00|
------------------
spark.sql("select date_format(date '1994-11-05T08:15:30-05:00', 'HH:mm') AS hour_in_day_number").show()
------------------
|hour_in_day_number|
------------------
| 00:00|
------------------
spark.sql("select date_format(date '1994-11-05T08:15:30-05:00', 'kk:mm') AS hour_in_day_number").show()
------------------
|hour_in_day_number|
------------------
| 24:00|
------------------
Examples
1994-11-05T08:15:30-05:00 corresponds to November 5, 1994, 8:15:30 am, US Eastern Standard Time.
Datetime Patterns for Formatting and Parsing
Symbol | Meaning | Presentation | Examples |
---|---|---|---|
G | era | text | AD; Anno Domini |
y | year | year | 2020; 20 |
D | day-of-year | number(3) | 189 |
M/L | month-of-year | month | 7; 07; Jul; July |
d | day-of-month | number(3) | 28 |
Q/q | quarter-of-year | number/text | 3; 03; Q3; 3rd quarter |
E | day-of-week | text | Tue; Tuesday |
F | aligned day of week in month | number(1) | 3 |
a | am-pm-of-day | am-pm | PM |
h | clock-hour-of-am-pm (1-12) | number(2) | 12 |
K | hour-of-am-pm (0-11) | number(2) | 0 |
k | clock-hour-of-day (1-24) | number(2) | 0 |
H | hour-of-day (0-23) | number(2) | 0 |
m | minute-of-hour | number(2) | 30 |
s | second-of-minute | number(2) | 55 |
S | fraction-of-second | fraction | 978 |
V | time-zone ID | zone-id | America/Los_Angeles; Z; -08:30 |
z | time-zone name | zone-name | Pacific Standard Time; PST |
O | localized zone-offset | offset-O | GMT 8; GMT 08:00; UTC-08:00; |
X | zone-offset ‘Z’ for zero | offset-X | Z; -08; -0830; -08:30; -083015; -08:30:15; |
x | zone-offset | offset-x | 0000; -08; -0830; -08:30; -083015; -08:30:15; |
Z | zone-offset | offset-Z | 0000; -0800; -08:00; |
Environment
$ spark-submit --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 3.1.2
/_/
Using Scala version 2.12.10, OpenJDK 64-Bit Server VM, 1.8.0_312
Branch HEAD
Compiled by user centos on 2021-05-24T04:27:48Z
Revision de351e30a90dd988b133b3d00fa6218bfcaba8b8
Url https://github.com/apache/spark
Type --help for more information.
CodePudding user response:
You used date
, which only keep year, month and day.
You can try use tiemstamp
as below:
scala> spark.sql("select date_format(timestamp '1994-11-05T08:15:30-05:00', 'hh:mm') AS hour_in_day_number").show()
------------------
|hour_in_day_number|
------------------
| 05:15|
------------------
You can refer https://databricks.com/blog/2020/07/22/a-comprehensive-look-at-dates-and-timestamps-in-apache-spark-3-0.html