Overview:
When SQL Server runs out of worker threads, it can cause performance degradation, query delays, and even login failures. Understanding the root cause and knowing how to respond quickly is critical for database administrators (DBAs) to keep systems running smoothly.
Purpose
This guide provides key steps and essential background to help troubleshoot and resolve issues when SQL Server runs out of worker threads for processing session requests.
Key Concepts
Process
A process is an executing program. The operating system uses processes to isolate applications from one another.
Thread
A thread is the smallest unit of execution to which an operating system allocates CPU time. Multiple threads can run within a single process.
Thread Pool
A collection of worker threads that execute tasks, callbacks, or requests on behalf of the application. SQL Server uses thread pools to manage work efficiently.
Thread Pool Wait
This occurs when there are no free threads available in SQL Server’s thread pool, causing queries to be delayed or login attempts to fail.
Why Worker Thread Exhaustion Happens
Worker thread exhaustion typically occurs under high concurrency or when long-running queries consume available threads.
DBA Responsibilities:
Proactively Monitor worker thread usage through monitoring tools such as SQL Server DMVs, or custom scripts.
Respond Quickly when thresholds are breached.
Investigate Root Causes — high concurrency, blocking, or slow queries.
Take Corrective Action — tune queries, add indexes, adjust worker thread settings if necessary.
Monitoring Thread Pool Waits and Thread Availability
Use the queries below to check if SQL Server is suffering from worker threads exhaustion.
SELECT COUNT(*)
FROM sys.dm_os_waiting_tasks
WHERE wait_type = ‘threadpool’;
SELECT *
FROM sys.dm_os_waiting_tasks
WHERE wait_type = ‘threadpool’;
You can also utilize the following query to monitor elevated wait times for worker threads using our in-house performance monitoring tools.
SELECT p.host,
p.collect_DT,
p.SQL_instance_processor,
p.total_host_Processor,
p.wait_for_the_worker_avg_wait_time_ms,
p.Network_IO_waits_avg_wait_time_ms
FROM admindb.perf.instance p
WHERE p.collect_DT BETWEEN ‘2023-12-20’ AND ‘2023-12-21’ –modify the dates accordingly
AND wait_for_the_worker_avg_wait_time_ms > 0
ORDER BY collect_DT DESC;
Execute the following query to assess thread availability and other related metrics.
SELECT (SELECT max_workers_count FROM sys.dm_os_sys_info) AS ‘TotalThreads’,
SUM(active_Workers_count) AS ‘CurrentThreads’,
(SELECT max_workers_count FROM sys.dm_os_sys_info) – SUM(active_Workers_count) AS ‘AvailableThreads’,
SUM(runnable_tasks_count) AS ‘WorkersWaitingForCPU’,
SUM(work_queue_count) AS ‘RequestWaitingForThreads’,
SUM(current_workers_count) AS ‘CurrentWorkers’
FROM sys.dm_os_Schedulers
WHERE status = ‘VISIBLE ONLINE’;
Plan and Implement Strategies to Optimize Thread Pool Usage
Work with application teams to identify and optimize resource intensive queries.
Execute the following query to identify the top ten most resource-intensive queries ranked by total thread count. Analyze their execution plans to address the requirements for new indexes or optimize inefficient codes.
SELECT TOP (10) *
FROM [admindb].[perf].[v_query_stats_statement_and_plan]
WHERE Ranked_by = ‘TC’
AND ranking <= 10
ORDER BY last_execution_time DESC;
Increase hardware resources
In cases where the demand consistently exceeds the existing thread pool capacity, you may want to consider scaling up hardware resources by adding more CPU cores or increasing memory.
Conclusion:
Worker thread exhaustion can significantly degrade SQL Server performance and even prevent new connections. By monitoring thread pool usage, analyzing high-concurrency workloads, and tuning resource-heavy queries, DBAs can prevent bottlenecks before they impact users. The strategies discussed — from query optimization to scaling hardware — provide a roadmap for maintaining thread availability and ensuring SQL Server remains responsive under peak load. Proactive detection and informed corrective actions transform a potential crisis into an opportunity to strengthen system performance and reliability.