Home > Net >  How do I effectively diagnose a 'wait operation timeout' on SQL Server 2012 database inser
How do I effectively diagnose a 'wait operation timeout' on SQL Server 2012 database inser

Time:04-23

I have a C# windows service that is writing to a SQL Server 2012 database with an exec sp_executesql call.

exec sp_executesql N'Insert into [table name redacted] 
(trial.[field redacted],trial.[field redacted],trial.[Duration],trial.[Count],trial.[Step],trial.[Target],trial.[ResultData],trial.[SessionDateTime],trial.[field redacted],trial.[ModifiedOn],trial.[CreatedOn],trial.[field redacted],trial.[SerialNumber],trial.[Category],trial.[CreatedBy],trial.[ModifiedBy],trial.[Status],trial.[IsEnabled],trial.[IsImmutable],trial.[IsHidden],trial.[Ordinal],trial.[IconUrl],trial.[Url],trial.[DataName],trial.[DisplayName],trial.[Description]) 
values (@field redacted_0,@field redacted_0,@Duration_0,@Count_0,@Step_0,@Target_0,@ResultData_0,@SessionDateTime_0,@field redacted_0,@ModifiedOn_0,@CreatedOn_0,@field redacted_0,@SerialNumber_0,@Category_0,@CreatedBy_0,@ModifiedBy_0,@Status_0,@IsEnabled_0,@IsImmutable_0,@IsHidden_0,@Ordinal_0,@IconUrl_0,@Url_0,@DataName_0,@DisplayName_0,@Description_0)',
N'@field redacted_0 uniqueidentifier,@field redacted_0 nvarchar(4000),@Duration_0 bigint,@Count_0 int,@Step_0 nvarchar(4000),@Target_0 nvarchar(4000),@ResultData_0 nvarchar(4000),@SessionDateTime_0 datetime,@field redacted_0 uniqueidentifier,@ModifiedOn_0 datetime,@CreatedOn_0 datetime,@field redacted_0 uniqueidentifier,@SerialNumber_0 nvarchar(8),@Category_0 nvarchar(4000),@CreatedBy_0 nvarchar(11),@ModifiedBy_0 nvarchar(11),@Status_0 nvarchar(6),@IsEnabled_0 bit,@IsImmutable_0 bit,@IsHidden_0 bit,@Ordinal_0 int,@IconUrl_0 nvarchar(4000),@Url_0 nvarchar(4000),@DataName_0 nvarchar(4000),@DisplayName_0 nvarchar(12),@Description_0 nvarchar(4000)',@field redacted_0='BB52C791-28BC-EC11-BE10-E884A50CE990',@field redacted_0=NULL,@Duration_0=0,@Count_0=0,@Step_0=NULL,@Target_0=N'',
@ResultData_0=NULL,@SessionDateTime_0='2022-04-19 17:57:23',@field redacted_0='F626F234-0AC0-EC11-BE11-E884A50CE990',@ModifiedOn_0='2022-04-19 17:59:15.590',@CreatedOn_0='2022-04-19 17:59:15.590',@field redacted_0='EEFB196C-0AC0-EC11-BE11-E884A50CE990',@SerialNumber_0=N'00000057',@Category_0=NULL,@CreatedBy_0=N'John Stamos',@ModifiedBy_0=N'John Stamos',@Status_0=N'Normal',@IsEnabled_0=1,@IsImmutable_0=0,@IsHidden_0=0,@Ordinal_0=0,@IconUrl_0=NULL,@Url_0=NULL,@DataName_0=NULL,@DisplayName_0=N'Mobile Trial',@Description_0=NULL

Normally, the insert is almost instantaneous. Sometimes, maybe 1 out of 1000 times, it results in an error

Win32Exception: The wait operation timed out

I'm not fluent with SQL Server database administration in the slightest and my searches have led me to many read fixes including exec sp_updatestats and with (NOLOCK) fixes, but I'm not sure how to address this when dealing with an Insert or even how to figure out how I'm getting into this state. We are not currently using transactions at all (intentionally) so there should be minimal locking. We have 4.8 million rows in the table being inserted into and it does have a foreign key to another table with only 116K rows.

Here is the table schema. I'm looking for some help on how to identify what is causing this issue. This is only happening in production and intermittently of course, so troubleshooting has been difficult.

table schema

Thanks in advance.

UPDATE: based on @Charlieface's recommendation, I ran the query when experiencing a timeout and got these results.

Wait Query Results

How do I use this information to find out what is causing the locking? I'm guessing the Waiting Command is significant, but that's my only guess so far.

UPDATE 2: This is the SQLException (as opposed to the Win32Exception)

2022-04-20 16:00:47,861 [4] ERROR Net.ExceptionMarshallingErrorHandler - System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action'1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData() at System.Data.SqlClient.SqlDataReader.get_MetaData() at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method) at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method) at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)

CodePudding user response:

As you can see from the results of the blocking chain script, the head of the blocking chain has a status of AWAITING COMMAND. This means that it likely has an open transaction which has been left hanging, while holding locks. Until the server decides to drop the connection, all other transactions are blocked on those rows or objects that are locked. This is causing a timeout. The solution is not to increase the command or lock timeouts, as this does not deal with the root cause.

I suggest you take a close look at the code which fires that command. There are two ways it may be using transactions:

  1. Using a server-side BEGIN TRAN and COMMIT statement, within the same batch or procedure.

    In the event of a batch-aborting error, the transaction is left hanging until the connection is closed or reset.
    But due to the way ADO.Net manages the connection pool, this does not happen until the connection is reused or a 4-minute timeout expires. (This is irrespective of the use of using on the connection object.)

  2. Using a client-side transaction with SqlTransaction.BeginTransaction.

    In the event of an exception, a using statement will normally ensure successful rollback and release of locks.
    But this sometimes does not happen successfully, for example if the client loses network connection.

The solution to both of these is:

  • To always use a using statement on the connection, transaction, command and reader objects. This gives best-effort rollback in the event of an exception on the client.
  • Ensure that XACT_ABORT is set to ON. This means that the server will always rollback in the event of an error.
    You can do this as part of the batch or procedure, or better: set it as the default server-wide.

Using a server-side transaction is generally better than a client-side transaction, as in the event of the client losing network connection, the server will not know to rollback.

  • Related