I have a table like this one:
| machine | type | value | id |
|---------|------|-------|----|
| 1       | a    | 0.34  | 1  |
| 1       | b    | 0.23  | 2  |
| 1       | b    | 0.26  | 3  |
| 1       | b    | 0.35  | 4  |
| 1       | a    | 0.15  | 5  |
| 1       | b    | 0.45  | 6  |
| 1       | b    | 0.55  | 7  |
I would like to obtain a table in which each type "b" row also carries, as an extra column, the value of the most recent preceding type "a" row (id is not relevant in the output). For example:
| machine | value | value_col |
|---------|-------|-----------|
| 1       | 0.23  | 0.34      |
| 1       | 0.26  | 0.34      |
| 1       | 0.35  | 0.34      |
| 1       | 0.45  | 0.15      |
| 1       | 0.55  | 0.15      |
I tried SQL Server's `PIVOT` function, but I cannot obtain the desired result.
Any help?
Thanks!
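(For reference, a minimal script to recreate the sample data; the table name `tbl` is an assumption, chosen to match the first answer below, while the second answer calls it `your_table`.)

```sql
-- Assumed table name: tbl (the question does not name the table)
CREATE TABLE tbl
(
    id      int          NOT NULL PRIMARY KEY,
    machine int          NOT NULL,
    type    char(1)      NOT NULL,
    value   decimal(4,2) NOT NULL
);

INSERT INTO tbl (machine, type, value, id)
VALUES (1, 'a', 0.34, 1),
       (1, 'b', 0.23, 2),
       (1, 'b', 0.26, 3),
       (1, 'b', 0.35, 4),
       (1, 'a', 0.15, 5),
       (1, 'b', 0.45, 6),
       (1, 'b', 0.55, 7);
```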
CodePudding user response:
Use `CROSS APPLY` to obtain the last "a" value:
```sql
select t.machine, t.value, v.value_col
from tbl t
cross apply
(
    -- most recent 'a' row before the current row
    select top 1 value_col = value
    from tbl x
    where x.id < t.id
      and x.type = 'a'
    order by id desc
) v
where t.type = 'b'
```
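One caveat worth adding to this answer: `CROSS APPLY` silently drops any "b" row that has no preceding "a" row. If such rows should be kept with a NULL `value_col` instead, the same query works with `OUTER APPLY`:

```sql
-- OUTER APPLY keeps 'b' rows even when no earlier 'a' row exists (value_col is NULL)
select t.machine, t.value, v.value_col
from tbl t
outer apply
(
    select top 1 value_col = value
    from tbl x
    where x.id < t.id
      and x.type = 'a'
    order by id desc
) v
where t.type = 'b'
```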
CodePudding user response:
It can be done in three steps, which makes me feel there are cleaner ways (though I can't think of them):

- create an identifier to relate all rows which should share an "a" value
- use a window function to copy that "a" value to the rest of the partition
- filter to include only the "b" rows
For example...
```sql
WITH
a_partitioned AS
(
    -- running count of 'a' rows: every 'a' row starts a new partition identifier
    SELECT
        *,
        SUM(CASE WHEN type = 'a' THEN 1 ELSE 0 END)
            OVER (PARTITION BY machine ORDER BY id) AS a_partition
    FROM
        your_table
),
a_value_spread AS
(
    -- copy each partition's 'a' value to every row in that partition
    SELECT
        *,
        MAX(CASE WHEN type = 'a' THEN value END)
            OVER (PARTITION BY machine, a_partition) AS a_value
    FROM
        a_partitioned
)
SELECT
    *
FROM
    a_value_spread
WHERE
    type = 'b'
```
Demo : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=9037c0d3e6d46e05ecbcfdd3777808cd
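(An aside, not part of the original answer: SQL Server 2022 added `IGNORE NULLS` to `LAST_VALUE`, which collapses the two CTEs into a single windowed pass. A sketch, assuming the same `your_table`:)

```sql
-- Requires SQL Server 2022+ (IGNORE NULLS support on LAST_VALUE)
SELECT machine, value, a_value
FROM
(
    SELECT
        *,
        -- carry the most recent 'a' value forward across each machine's rows
        LAST_VALUE(CASE WHEN type = 'a' THEN value END) IGNORE NULLS
            OVER (PARTITION BY machine
                  ORDER BY id
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS a_value
    FROM your_table
) AS s
WHERE type = 'b';
```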
EDIT
Here's my REALLY superficial attempt to experiment with which answer might have the lowest processing overheads...
- In the correlated sub-query version, the scan of the data is 5% of the total cost, so the whole query 'costs' ~20 index scans.
- In the analytic function version, the scan of the data is 12% of the total cost, so the whole query 'costs' ~8.3 index scans.
It's a very small data set though, and a very arbitrarily constructed one. If performance matters, I'd run an actual profiler against your actual data.

(If the correlated sub-query approach is 'fast enough', definitely use that.)
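(One easy way to run that comparison yourself, my suggestion rather than the answerer's, is SQL Server's built-in statistics output:)

```sql
SET STATISTICS IO ON;   -- report logical/physical reads per statement
SET STATISTICS TIME ON; -- report parse/compile and execution CPU/elapsed times

-- ... run both candidate queries here against the real table ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```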