Home > Software design >  Timestamp parsing in Hive
Timestamp parsing in Hive

Time:11-05

I'm trying to read a timestamp and reconstruct it in a different format as per my requirement, in Hive. But, I can't seem to get the month and date right. I suspect timestamp format to be incorrect but not sure where I'm going wrong.

Query

select from_unixtime(unix_timestamp("Sun Mar 28 19:51:10 GMT 05:30 2021", "EEE MMM dd HH:mm:ss z YYYY"), "HH:mm:ss dd-MM-YYYY-z")

Output

14:21:10 27-12-2021-UTC

CodePudding user response:

Read the SimpleDateFormat class docs which is used under the hood of unix_timestamp and from_unixtime.

y - is a year ---this is what you need Y - is week year ---this is what you used in a pattern

See also https://errorprone.info/bugpattern/MisusedWeekYear And some explanation what is week year here: https://docs.oracle.com/javase/7/docs/api/java/util/GregorianCalendar.html#week_year

The week year for short is the year that the relevant week number belongs to.

Fixed:

select from_unixtime(unix_timestamp("Sun Mar 28 19:51:10 GMT 05:30 2021", "EEE MMM dd HH:mm:ss z yyyy"), "HH:mm:ss dd-MM-yyyy-z")

Result:

14:21:10 28-03-2021-UTC
  • Related