The instance in question has maximum server memory set to 6GB, but only seems to be using half a GB. I checked the query plan cache by using the query on this page:
SELECT usecounts, cacheobjtype, objtype, text
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE usecounts > 1
ORDER BY usecounts DESC;
GO
After running that, I only see about 3 plans. When I run the application that uses this database, the cache sometimes climbs to 300-400 plans, but about 30 seconds later the same query shows only about 3 plans again.
I've run SQL Server Profiler and can't find anything running DBCC FREEPROCCACHE.
There are 3 other instances on this server that are consuming their allocated memory just fine. One in particular is allowed to eat 2GB and has consumed the entire amount with over 500 plans consistently in its cache.
Other than a scheduled task running DBCC FREEPROCCACHE every 30-60 seconds, is there anything that would cause SQL Server 2019 to behave this way?
Answer:
Multiple facets of SQL Server will 'compete' for buffer cache, including:
- Data
- Plans
- Clerks (i.e., other caches)
- Memory Grants
- etc.
The amount of space that Plans can consume is dictated by thresholds defined here:
- https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc293624(v=technet.10)
- https://www.sqlskills.com/blogs/erin/sql-server-plan-cache-limits/
And, once plans start to exceed those thresholds, the SQLOS will begin to 'eagerly clean up/clip/evict' less frequently used plans.
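To see where you currently stand against those limits, a quick sketch like this sums up everything in the plan cache (sys.dm_exec_cached_plans exposes a size_in_bytes per entry):
-- Current plan cache size and entry count.
SELECT
    COUNT(*) AS plan_count,
    SUM(CAST(size_in_bytes AS bigint)) / 1024 / 1024 AS plan_cache_mb
FROM sys.dm_exec_cached_plans;
If plan_cache_mb sits well below the documented thresholds but the entry count keeps collapsing anyway, the eviction is being driven by something other than plan-cache size alone.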
Likewise, if OTHER clerks (caches for things like schemas, objects, and the permissions cached against those objects - i.e., TOKENPERMS) exceed certain internal cache thresholds, they TOO can cause the SQLOS to start scavenging ALL caches - including cached plans.
For example, you can see how much memory each clerk/cache is currently holding by querying sys.dm_os_memory_clerks - a minimal sketch (pages_kb applies to SQL Server 2012 and later; watch for USERSTORE_TOKENPERM, CACHESTORE_SQLCP, or CACHESTORE_OBJCP floating to the top):
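-- Top 10 memory clerks by current size (a rough sketch; SQL Server 2012+).
SELECT TOP (10)
    [type],
    name,
    SUM(pages_kb) / 1024.0 AS size_mb
FROM sys.dm_os_memory_clerks
GROUP BY [type], name
ORDER BY size_mb DESC;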
Likewise, Memory Grants can/will use buffer cache during query processing. For example, if you're querying a huge table and the engine expects to get back (or hang on to for further processing) roughly 1KB for each of 10 million rows, you're going to need potentially ~9GB of buffer space for that query to process. (The real mechanics of memory grants are only LIKE this - the example cited is far too simplistic to be anywhere close to accurate.) The point, however, is that these grants can/will be given RAM directly from the overall buffer cache and can/will cause INTERNAL memory pressure against the plan cache (and all other caches, for that matter). In short, memory grants can be a huge problem with SOME workloads.
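To trap large grants as they happen, sys.dm_exec_query_memory_grants shows every request currently holding or waiting on a grant - a minimal sketch:
-- Active memory grants, largest requests first; a NULL grant_time
-- means the request is still waiting (RESOURCE_SEMAPHORE).
SELECT
    session_id,
    requested_memory_kb,
    granted_memory_kb,
    used_memory_kb,
    grant_time,
    wait_time_ms
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;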
Otherwise, external factors (other apps - especially memory-hungry apps) can/will cause the OS to tell SQL Server to 'cough up' memory it has been using. (You can prevent this by granting the Lock Pages in Memory user right to the SQL Server service account - just be sure you know what you're doing here.)
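As a quick check, recent versions (SQL Server 2016 SP1 and later) report the memory model in use via sys.dm_os_sys_info - LOCK_PAGES here means the right has been granted and is actually in effect:
-- Returns CONVENTIONAL, LOCK_PAGES, or LARGE_PAGES.
SELECT sql_memory_model_desc
FROM sys.dm_os_sys_info;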
In your case, with 4 distinct instances running on the same server, I'd assume you're likely running into 'external' memory pressure against the instance in question.
That said, you can query sys.dm_os_ring_buffers to get insight into whether or not memory pressure is happening - as per posts like the following:
- https://docs.microsoft.com/en-us/archive/blogs/psssql/how-it-works-what-are-the-ring_buffer_resource_monitor-telling-me
- https://docs.microsoft.com/en-us/archive/blogs/mvpawardprogram/using-sys-dm_os_ring_buffers-to-diagnose-memory-issues-in-sql-server
- https://www.sqlskills.com/blogs/jonathan/identifying-external-memory-pressure-with-dm_os_ring_buffers-and-ring_buffer_resource_monitor/
Along those lines, I use the following query/diagnostic to check for memory pressure:
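-- Pulls RESOURCE_MONITOR entries from the ring buffer, converts the
-- tick-based [timestamp] into a wall-clock EventTime, shreds the XML
-- payload into columns, and keeps only the low-physical-memory
-- notifications.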
WITH core AS (
SELECT
EventTime,
record.value('(/Record/ResourceMonitor/Notification)[1]', 'varchar(max)') as [Type],
record.value('(/Record/ResourceMonitor/IndicatorsProcess)[1]', 'int') as [IndicatorsProcess],
record.value('(/Record/ResourceMonitor/IndicatorsSystem)[1]', 'int') as [IndicatorsSystem],
record.value('(/Record/ResourceMonitor/IndicatorsPool)[1]', 'int') as [IndicatorsPool],
record.value('(/Record/MemoryNode/@id)[1]', 'int') as [MemoryNode],
record.value('(/Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') AS [Avail Phys Mem, Kb],
record.value('(/Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint') AS [Avail VAS, Kb],
record
FROM (
SELECT
DATEADD (ss, (-1 * ((cpu_ticks / CONVERT (float, ( cpu_ticks / ms_ticks ))) - [timestamp])/1000), GETDATE()) AS EventTime,
CONVERT (xml, record) AS record
FROM sys.dm_os_ring_buffers
CROSS JOIN sys.dm_os_sys_info
WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR') AS tab
)
SELECT
EventTime,
[Type],
IndicatorsProcess,
IndicatorsSystem,
IndicatorsPool,
MemoryNode,
CAST([Avail Phys Mem, Kb] / (1024.0 * 1024.0) AS decimal(20,2)) [Avail Phys Mem (GB)],
CAST([Avail VAS, Kb] / (1024.0 * 1024.0) AS decimal(20,2)) [Avail VAS (GB)]
,record
FROM
core
WHERE
[Type] = N'RESOURCE_MEMPHYSICAL_LOW'
ORDER BY
EventTime DESC;
As in, if you run that against effectively ANY SQL Server instance, you REALLY don't want to see ANY results from this query. Or, if you do, they should line up with times when you're running REALLY heavy workloads (ugly data-loading/population jobs or other huge processing operations) that you already know are performance problems.
Otherwise, the occasional entry/hiccup (i.e., a stray set of results) isn't necessarily a reason to worry, but if you're routinely seeing rows from the above under regular workloads, you'll want to investigate the details listed earlier (cache and clerk sizes/thresholds, trapping large memory grants, plan-cache sizing relative to overall RAM, etc.) AND/OR start looking into cache clock hands to see exactly where memory is being scavenged:
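For example, a sketch against sys.dm_os_memory_cache_clock_hands shows which caches the internal/external clock hands have been sweeping and how many entries they've evicted:
-- Caches whose clock hands have done work; high removed_all_rounds_count
-- values show where entries are actually being evicted.
SELECT
    name,
    [type],
    clock_hand,
    clock_status,
    rounds_count,
    removed_all_rounds_count
FROM sys.dm_os_memory_cache_clock_hands
WHERE rounds_count > 0
ORDER BY removed_all_rounds_count DESC;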