Home > database >  MSSQL Query Optimization Without Expensive Join
MSSQL Query Optimization Without Expensive Join

Time:12-08

We are developing a query to try and find all messages in a table that have their most recent Message Event be in Error. The messages can be re-processed and that has the potential to put them in a good state which is why we are looking for the most recent even to be the one in error. We are not strong with our SQL and the query we are currently running takes much longer than we would like to run. Is there anything outside of adding indexes or shrinking the table size that could help this query run faster? Can this be done without a join?

SELECT M.MessageID AS MessageID, 
       Source, 
       IntakeTimestamp, 
       MessageKey, 
       MessageSourceID, 
       CustomerID, 
       AgreementID, 
       LocationID, 
       EquipmentID, 
       OfficeID, 
       IncidentID, 
       M.HistoryID AS HistoryID, 
       Subject, 
       FromHeader, 
       ToHeader, 
       PackedHeaders 
FROM [pipe].[Message] AS M 
  JOIN [pipe].[MessageState] AS MS ON MS.MessageID = M.MessageID 
WHERE MessageStateID IN ( 
  SELECT TOP 1 MessageStateID FROM [pipe].[MessageState] AS NMS 
  WHERE M.MessageID = NMS.MessageID 
  AND MS.EventStatusID = 6 
  ORDER BY MessageStateID DESC ) 
ORDER BY [MessageID] DESC

I was able to pull all the messages in the error state with the simple query below but this didn't check the most recent message state only

SELECT * FROM [pipe].[Message] 
WHERE MessageID IN (
  SELECT *
  FROM [pipe].[MessageState] 
  WHERE EventStatusID = 6
  )

CodePudding user response:

You can try something like the below to find the most recent record for each MessageID, assuming that your MessageStateID ordering is appropriate for your data. This works by assigning each group of rows with the same MessageID value a row number that is sorted in descending order by the MessageStateID value. In doing this, any record with a rn value of 1 is the most recent record for that MessageID:

WITH m AS
(
    SELECT M.MessageID, 
           Source, 
           IntakeTimestamp, 
           MessageKey, 
           MessageSourceID, 
           CustomerID, 
           AgreementID, 
           LocationID, 
           EquipmentID, 
           OfficeID, 
           IncidentID, 
           M.HistoryID AS HistoryID, 
           Subject, 
           FromHeader, 
           ToHeader, 
           PackedHeaders,
           MS.EventStatusID,
           ROW_NUMBER() OVER (PARTITION BY M.MessageID ORDER BY M.MessageStateID DESC) AS rn
    FROM [pipe].[Message] AS M 
      JOIN [pipe].[MessageState] AS MS
        ON MS.MessageID = M.MessageID
)
SELECT m.*
FROM m
WHERE m.rn = 1
    AND m.EventStatusID = 6
ORDER BY m.MessageID DESC

I would also recommend that you properly alias all your columns so you know which table they are coming from.

  • Related