I would like to get the SUM of each column by year, rather than displaying several individual rows for the same year.
spark.sql(""" SELECT YEAR(date) AS year, useful, funny, cool FROM reviews_without_text_table ORDER by year ASC; """).show(truncate=False)
Please view the attachment here: https://i.stack.imgur.com/nYIrA.png
CodePudding user response:
Wrap each column in SUM() and group by YEAR(date), so that every year collapses into a single row:
spark.sql("""
SELECT YEAR(date) AS year,
sum(useful) useful,sum(funny) funny,sum(cool) cool
FROM reviews_without_text_table
GROUP BY YEAR(date)
ORDER by year ASC;
""").show(truncate=False)
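If you want to sanity-check the GROUP BY logic without a Spark session, here is a minimal sketch using Python's built-in sqlite3 module instead of Spark. The sample rows are made up for illustration, and since SQLite has no YEAR() function, strftime('%Y', date) is used as a stand-in. Only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample rows standing in for reviews_without_text_table:
# (date, useful, funny, cool)
rows = [
    ("2019-01-05", 1, 0, 2),
    ("2019-07-21", 3, 1, 0),
    ("2020-03-14", 0, 2, 1),
    ("2020-11-30", 4, 0, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews_without_text_table "
    "(date TEXT, useful INTEGER, funny INTEGER, cool INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews_without_text_table VALUES (?, ?, ?, ?)", rows
)

# SQLite has no YEAR(); strftime('%Y', date) extracts the year as text,
# so it is cast to an integer to get a numeric year column.
yearly_totals = conn.execute(
    """
    SELECT CAST(strftime('%Y', date) AS INTEGER) AS year,
           SUM(useful) AS useful,
           SUM(funny)  AS funny,
           SUM(cool)   AS cool
    FROM reviews_without_text_table
    GROUP BY strftime('%Y', date)
    ORDER BY year ASC
    """
).fetchall()
conn.close()

# One tuple per year: (year, useful, funny, cool)
for row in yearly_totals:
    print(row)
```

Without the GROUP BY, the four sample rows would come back one by one. With it, each year is reduced to one row of totals, which is the same thing the Spark query above does.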