Home > Software design >  Finding the sum of a column based on range of a string column
Finding the sum of a column based on range of a string column

Time:06-22

I have a raw table like below:

Name DetailedName Position_Seqnum Position_name Value
A1 A1eng 0 00-Start here 1
A1 A1eng 53 08-Go left 2
A1 A1eng 60 99-Stop there 2
A1 A1eng 62 101-Go right 4
B1 B1eng 0 00- Start here 4
B1 B1eng 1 23-Go right 3
B1 B1eng 35 98- Stop there 6

Objective: For each Name, and for all the position_name between Start here and Stop there, I wish to return the total value of the Name.

The expected output is like:

Name DetailedName Value
A1 A1eng 5
B1 B1eng 13

Explanation of the expected output:

  1. A1 has position_name 00-Start here and 3rd row has position_name 99-Stop there.
  2. For all rows i.e (row1 row2 row3, position_seqnum 0, 53 and 60), I wish to extract the total value i.e 1 2 2 = 5.

My SQL attempt:

SELECT Name, DetailedName, Value
FROM `table` 
WHERE 
AND Position_name between
(trim(Position_name ) = REGEXP_EXTRACT(trim(Position_name ),r".*Start here.*")) 
AND
(trim(Position_name ) =  REGEXP_EXTRACT(trim(Position_name ),r".*Stop there.*"))
AND Value > 0 
order by Name
)
select * from cte

What I failed to achieve is how to apply between clause over a string column.

CodePudding user response:

You can solve this problem following these steps:

  • compute a running sum for each "Name", ordered by "Position_Seqnum"
  • apply a SELF JOIN where you match "start" rows with "stop" rows
  • subtract start running sum from stop running sum, then add up the start value

Here's the full query:

WITH cte AS (
    SELECT *,
           SUM(Value_) OVER(
                PARTITION BY Name 
                ORDER     BY Position_Seqnum
           )                            AS running_sum
    FROM tab
)
SELECT t1.Name, 
       t1.DetailedName,
       t2.running_sum - t1.running_sum   t1.Value_
FROM       cte t1
INNER JOIN cte t2
        ON t1.Name = t2.Name
       AND t1.DetailedName = t2.DetailedName
       AND t1.Position_Name LIKE '%Start%'
       AND t2.Position_Name LIKE '%Stop%'

Check the demo enter image description here

  • Related