Home > Mobile >  Add a column consisting of two existing columns in Athena
Add a column consisting of two existing columns in Athena

Time:03-24

I have a table consisting of 100 columns, two of which are dt and tm, which are partitions. looks something like this:

-------------------
| dt          tm  |
-------------------
| 2022/01/01  10  |
| 2022/01/02  15  |
| 2022/01/03  03  |
| 2022/01/04  20  |
-------------------

I want to be able to add another column, which will serve as a timestamp based in the "concationation" of dt and tm. That is:

--------------------------------------------
| dt          tm  timestamp                |
--------------------------------------------
| 2022/01/01  10  2022/01/01 10:00:00      |
| 2022/01/02  15  2022/01/02 15:00:00      |
| 2022/01/03  03  2022/01/03 03:00:00      |
| 2022/01/04  20  2022/01/04 20:00:00      |
--------------------------------------------

The table is created with Glue and updated every couple of hours using a crawler.

So far I tried using the ALTER TABLE ADD COLUMNS function but couldn't work it out since I can't add anything but the name of the new column and its type. What I need is to add a column that is computed for each new row added. Is that possible? Appreciate any help!

CodePudding user response:

Frankly, rather than adding another column, I would recommend that you create a view on the table that has an additional column.

This way, whenever data is added in the original table, the extra columns will have the data you expect. There will be no need to 'populate' the additional column.

If your fields are TEXT, you could use:

CREATE VIEW my_view AS
SELECT
  *,
  dt || ' ' || tm || ':00:00' as timestamp
FROM table

If your fields are DATE and INT and you'd like a TIMESTAMP output, use:

CREATE VIEW my_view AS
SELECT
  *,
  CAST(dt as TIMESTAMP)   tm * interval '1' hour as timestamp
FROM table

You can then access the table using the name of the View, such as:

SELECT * FROM my_view
WHERE timestamp < '2022-01-01'
  • Related