Home > Enterprise >  Scalar subquery produced more than one element SQL, using SUM OVER PARTITION BY
Scalar subquery produced more than one element SQL, using SUM OVER PARTITION BY

Time:09-28

I am trying to query the running total of rains in 3 cities between exact years, but names are inconsistent in the original data and is not possible to clean from datasource. I have tried the following code, but keep getting the response "Scalar subquery produced more than one element".

  SELECT    
  Year,
  Place,
  TotalRain,
  FROM(SELECT EXTRACT(YEAR FROM date) as Year,
        REPLACE(location_name,(
               SELECT location_name FROM data.rain.pakistan
               WHERE LOWER (location_name) LIKE LOWER ('%lahore%')
         ),'Lahore'),
         REPLACE(location_name,(
               SELECT location_name FROM data.rain.pakistan
               WHERE LOWER (location_name) LIKE LOWER ('%karachi%')
         ),'Karachi'),
         REPLACE(location_name,(
               SELECT location_name FROM data.rain.pakistan
               WHERE LOWER (location_name) LIKE LOWER ('%islamabad%')
         ),'Islamabad'),
      
        location_name AS Place,
        SUM(mm_rain) OVER (PARTITION BY location_name ORDER BY EXTRACT(YEAR FROM date)) AS TotalRain
        FROM 
        data.rain.pakistan
  )  WHERE Place IN ('Lahore', 'Karachi', 'Islamabad')
  AND Year BETWEEN 2005 AND 2021                                                          
      GROUP BY Year, Place ,TotalRain             
       ORDER BY Year, Place

Please tell me if you know any solution!

CodePudding user response:

To avoid the errror you have to use aggregation in the subquery, since you will have many results. i.e.:

SELECT max(location_name) FROM data.rain.pakistan
           WHERE LOWER (location_name) LIKE LOWER ('%lahore%')

But i can't tell if the where clause will do as intended

CodePudding user response:

If I understood correctly and the goal is to have a running sum over the years for a location, also assuming the input table is flat, the query below should work:

SELECT    
  Year,
  Place,
  SUM(mm_rain) OVER (PARTITION BY Place ORDER BY Year) AS TotalRain
  FROM(
    SELECT 
        EXTRACT(YEAR FROM date) as Year,
        case 
          when lower(location_name) like '%lahore%' then 'Lahore'
          when lower(location_name) like '%karachi%' then 'Karachi'
          when lower(location_name) like '%islamabad%' then 'Islamabad'
        else location_name end as Place,
        mm_rain
      FROM input
  )  
WHERE Place IN ('Lahore', 'Karachi', 'Islamabad')
  AND Year BETWEEN 2005 AND 2021                                                          
GROUP BY Year, Place, mm_rain
ORDER BY Year, Place

I don't think subqueries are mandatory, but it depends on the input table.

  • Related