I need to return the first and last record of each group of sets.
DATA:
Code Close Time
USD 146116 2022-04-03 04:00:00.00 00:00
EUR 241789 2022-03-27 17:00:00.00 00:00
EUR 241807 2022-03-27 08:00:00.00 00:00
USD 141800 2022-03-27 08:00:00.00 00:00
USD 140809 2022-03-27 07:00:00.00 00:00
T-SQL:
SELECT
[Code]
,DATEADD(dd, 0, DATEDIFF(dd, 0, [Time]))
,FIRST_VALUE([Close]) OVER (ORDER BY [Time] ASC) AS [Open] -- Column 'Time' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
,MAX([Close]) AS [High]
,MIN([Close]) AS [Low]
,LAST_VALUE([Close]) OVER (ORDER BY [Time] ASC) AS [Close] -- Column 'Time' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
FROM [CurrencyIntradayHistories]
GROUP BY [Code],
DATEADD(dd, 0, DATEDIFF(dd, 0, [Time]))
ORDER BY [Time] ASC
Desired result:
Code Open High Low Close Time
EUR 241807 241807 241789 241789 2022-03-27
USD 140809 141800 140809 141800 2022-03-27
USD 146116 146116 146116 146116 2022-04-03
CodePudding user response:
You cannot use FIRST_VALUE
directly here because it is a window function, not an aggregate function.
You need to embed it in a subquery/derived table and use aggregation over it.
Also, it needs a PARTITION BY
clause, as well as ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
SELECT
cih.Code
,cih.Date
,MIN(cih.OpenPerDay) AS [Open]
,MAX(cih.[Close]) AS High
,MIN(cih.[Close]) AS Low
,MIN(cih.ClosePerDay) AS [Close]
FROM (
SELECT
*
,CAST(cih.Time AS date) AS Date
,FIRST_VALUE(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
ORDER BY cih.Time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS OpenPerDay
,LAST_VALUE(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
ORDER BY cih.Time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ClosePerDay
FROM CurrencyIntradayHistories cih
) cih
GROUP BY
cih.Code,
cih.Date
ORDER BY
cih.Date;
A slightly more efficient version uses ROW_NUMBER
and LEAD
to find the starting and ending rows.
SELECT
cih.Code
,cih.Date
,MIN(CASE WHEN cih.rn = 1 THEN cih.[Close] END) AS [Open]
,MAX(cih.[Close]) AS High
,MIN(cih.[Close]) AS Low
,MIN(CASE WHEN cih.NextClose IS NULL THEN cih.[Close] END) AS [Close]
FROM (
SELECT
*
,CAST(cih.Time AS date) AS Date
,ROW_NUMBER() OVER (PARTITION BY cih.Code, CAST(cih.Time AS date) ORDER BY cih.Time) AS rn
,LEAD(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
ORDER BY cih.Time) AS NextClose
FROM CurrencyIntradayHistories cih
) cih
GROUP BY
cih.Code,
cih.Date
ORDER BY
cih.Date;
CodePudding user response:
Use window functions MIN(), MAX() and FIRST_VALUE() on the whole dataset and not on the results that you get if you group by Code and date:
SELECT DISTINCT
[Code],
FIRST_VALUE([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time]) ORDER BY [Time] ASC) AS [Open],
MAX([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time])) AS [High],
MIN([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time])) AS [Low],
FIRST_VALUE([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time]) ORDER BY [Time] DESC) AS [Close],
CONVERT(date, [Time]) [Time]
FROM [CurrencyIntradayHistories];
See the demo.