Home > Software design >  How to optimize below my SQL query shown here
How to optimize below my SQL query shown here

Time:08-04

This query is written for those users who did not log-in to the system between 1st July to 31 July.

However when we run the query in query analyzer then it's taking more than 2 minutes. But in application side giving error as 'Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding'.

Below query takes start date as 1st July 2022 and get all the users and add those users into temp table called '@TABLE_TEMP' and increases to next date.

Again while loop runs and fetch users for 2nd July and so on until it reaches to 31st July.

Can anyone help on this to optimize the query using CTE or any other mechanism? H ow can we avoid While loop for better performance?

DECLARE @TABLE_TEMP TABLE
                    (
                        Row int IDENTITY(1,1),
                        [UserId] int,
                        [UserName] nvarchar(100),       
                        [StartDate] nvarchar(20),
                        [FirstLogin] nvarchar(20),
                        [LastLogout] nvarchar(20)       
                    )

DECLARE @START_DATE datetime = '2022-07-01';
DECLARE @END_DATE   datetime = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';
DECLARE @QUERY nvarchar(max) = '';

WHILE(@START_DATE < @END_DATE OR @START_DATE = @END_DATE)
BEGIN               
    SET @QUERY = 'SELECT 
                      s.userid AS [UserId], 
                      s.username AS [UserName],
                 '''   CAST(@START_DATE as nvarchar)    ''' AS [StartDate],
                      MAX(h.START_TIME) as [FirstLogin],
                      MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]                  
                  FROM USER s 
                  LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid                                                         
                  LEFT JOIN TEMP_USER_INACTIVATION TUI ON TUI.userid = s.userid AND ('''  CAST(@START_DATE as nvarchar)   ''' BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE)
                  WHERE s.userid IN ('   @USER_ID   ') 
                    AND h.userid  NOT IN (SELECT userid FROM USER_LOGIN_HISTORY WHERE CAST(START_TIME AS DATE)  = '''  CONVERT(nvarchar,(CAST(@START_DATE AS DATE)))  ''')                                                                                      AND ACTIVATED_DATE IS NOT NULL 
                  GROUP BY s.userid, h.userid, s.username, s.last_seen_time
                  HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <>  '''  CONVERT(nvarchar,(CAST(@START_DATE AS DATE)))    '''
                  ORDER BY [User Name]'

    INSERT INTO @TABLE_TEMP
        EXEC(@QUERY)   

    SET @START_DATE = DATEADD(DD, 1, @START_DATE)           
END

CodePudding user response:

Without the query plan, it's hard to say for sure.

But there are some clear efficiencies to be had.

Firstly, there is no need for a WHILE loop. Create a Dates table which has every single date in it. Then you can simply join it.

Furthermore, do not inject the @USER_ID values. Instead, pass them thorugh as a Table Valued Parameter. At the least, split what you have now into a temp table or table variable.

Do not cast values you want to join on. For example, to check if START_TIME falls on a certain date, you can do WHERE START_TIME >= BeginningOfDate AND START_TIME < BeginningOfNextDate.

The LEFT JOINs are suspicious, especially given you are filtering on those tables in the WHERE.

Use NOT EXISTS instead of NOT IN or you could get incorrect results

DECLARE @START_DATE date = '2022-07-01';
DECLARE @END_DATE   date = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';

DECLARE @userIds TABLE (userId int PRIMARY KEY);
INSERT @userIds (userId)
SELECT CAST(value AS int)
FROM STRING_SPLIT(@USER_ID, ',');


SELECT 
  s.userid as [UserId],
  s.username as [UserName],
  d.Date as [StartDate],
  MAX(h.START_TIME) as [FirstLogin],
  MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]
FROM Dates d
JOIN USER s 
    LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid                                                       
    LEFT JOIN TEMP_USER_INACTIVATION TUI
        ON TUI.userid = s.userid
   ON d.Date BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE  -- specify table alias (don't know which?)
WHERE s.userid in (SELECT u.userId FROM @userIds u) 
  AND NOT EXISTS (SELECT 1
    FROM USER_LOGIN_HISTORY ulh
    WHERE ulh.START_TIME >= CAST(d.date AS datetime)
      AND ulh.START_TIME < CAST(DATEADD(day, 1, d.date) AS datetime)
      AND ulh.userid = h.userid
  )                                                                           
  AND ACTIVATED_DATE IS NOT NULL
  AND d.Date BETWEEN @START_DATE AND @END_DATE

GROUP BY
  d.Date,
  s.userid,
  s.username,
  s.last_seen_time
HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <> d.date

ORDER BY   -- do you need this? remove if possible.
  s.username;

CodePudding user response:

Better to collect dates in a table rather than running query in a loop. Use following query to collect dates between given date range:

DECLARE @day INT= 1
DECLARE @dates TABLE(datDate DATE)
--creates dates table first and then create dates for the given month.
WHILE ISDATE('2022-8-'   CAST(@day AS VARCHAR)) = 1
    BEGIN
       INSERT INTO @dates
       VALUES      (DATEFROMPARTS(2022, 8, @day))
       SET @day = @day   1
END

Then to get all dates where user did not login, you have to use Cartesian join and left join as illustrated below

SELECT    allDates.userID, 
        allDates.userName, 
        allDates.datDate notLoggedOn
FROM
(
    --This will reutrun all users for all dates in a month i.e. 31 rows for august for every user
    SELECT *
    FROM   Users, 
         @dates 
) allDates
LEFT JOIN
(
    --now get last login date for every user between given date range
    SELECT userID, 
         MAX(login_date) last_Login_date
    FROM   USER_LOGIN_HISTORY
    WHERE  login_date BETWEEN '2022-08-01' AND '2022-08-31'
    GROUP BY userID
) loggedDates ON loggedDates.last_Login_date = allDates.datDate
WHERE loggedDates.last_Login_date IS NULL --filter out only those users who have not logged in
ORDER BY allDates.userID, 
        allDates.datDate

From this query you will get every day of month when a user did not logged in.

If there is no need to list every single date when user did not log in, then Cartesian join can be omitted. This will further improve the performance.

I hope this will help.

  • Related