I have a ten million level database. The client needs to read data and perform calculation. Due to the large amount of data, if it is saved in the application cache, memory will be overflow and crash will occur. If I use select statement to query data from the database in real time, the time may be too long and the number of operations on the database may be too frequent. Is there a better way to read the database data? I use C and C# to access SQL Server database.
My database statement is similar to the following:
SELECT TOP 10 y.SourceName, MAX(y.EndTimeStamp - y.StartTimeStamp) AS ProcessTimeStamp
FROM
(
SELECT x.SourceName, x.StartTimeStamp, IIF(x.EndTimeStamp IS NOT NULL, x.EndTimeStamp, 134165256277210658) AS EndTimeStamp
FROM
(
SELECT
SourceName,
Active,
LEAD(Active) OVER(PARTITION BY SourceName ORDER BY TicksTimeStamp) NextActive,
TicksTimeStamp AS StartTimeStamp,
LEAD(TicksTimeStamp) OVER(PARTITION BY SourceName ORDER BY TicksTimeStamp) EndTimeStamp
FROM Table1
WHERE Path = N'App1' and TicksTimeStamp >= 132165256277210658 and TicksTimeStamp < 134165256277210658
) x
WHERE (x.Active = 1 and x.NextActive = 0) OR (x.Active = 1 and x.NextActive = null)
) y
GROUP BY y.SourceName
ORDER BY ProcessTimeStamp DESC, y.SourceName
The database structure is roughly as follows:
ID Path SourceName TicksTimeStamp Active
1 App1 Pipe1 132165256277210658 1
2 App1 Pipe1 132165256297210658 0
3 App1 Pipe1 132165956277210658 1
4 App2 Pipe2 132165956277210658 1
5 App2 Pipe2 132165956277210658 0
CodePudding user response:
I agree with @Charlieface. I think the index you want is as follows:
CREATE INDEX idx ON Table1 (Path, TicksTimeStamp) INCLUDE (SourceName, Active);
You can add both indexes (with different names of course) and see which one the execution engine chooses.
CodePudding user response:
I can suggest adding the following index which should help the inner query using LEAD
:
CREATE INDEX idx ON Table1 (SourceName, TicksTimeStamp, Path) INCLUDE (Active);
The key point of the above index is that it should allow the lead values to be rapidly computed. It also has an INCLUDE
clause for Active
, to cover the entire select.
CodePudding user response:
CREATE NONCLUSTERED INDEX index11 ON dbo.Table1 ([Path],[TicksTimeStamp]) INCLUDE ([SourceName],[Active])
I added the index to the database which has 5 million data, but the query speed is almost the same as that without index.
Besides improving the execution efficiency of SQL statements, are there other ways to improve the query speed? Such as fetching SQL data to the cache and updating the cache periodically? Since caching to the application may cause crash, are there other methods?