Troubleshooting Query Plan Optimization Using Query Store

Introduction

SQL Server Query Store is enabled to assist in monitoring query performance and identifying query plan regressions. Query Store captures a history of query plans, making it an excellent tool for troubleshooting and optimizing query performance by selecting the best plan. This guide provides a detailed step-by-step approach to using Query Store to analyze, troubleshoot, and enforce an optimal query plan.

Part 1: Preliminary Assessment and Understanding Query Store

Step 1.1: Check Query Store Retention and Expiry Settings

Query to Verify Query Store Retention Period

It is important to understand how long Query Store retains data, as the retention period determines the duration for which historical plan information is available.

Query to Check Retention Period:

SELECT actual_state_desc, desired_state_desc, query_capture_mode_desc, max_storage_size_mb, stale_query_threshold_days

FROM sys.database_query_store_options;

Action: Verify stale_query_threshold_days to ensure that the retention period is suitable for your troubleshooting needs. Adjust if necessary to retain historical plans for a longer period.

Step 1: Understand the Query Store Structure

– Query Store Components

– Query Store maintains a record of query texts, execution plans, and performance statistics over time.

– Three Key Tables:

– sys.query_store_query: Stores information about queries.

– sys.query_store_plan: Stores the execution plans for each query.

– sys.query_store_runtime_stats: Stores runtime statistics for each plan, such as execution count and average duration.

– Ensure Query Store is Enabled

– Confirm Query Store is enabled and set to read-write mode.

– Query to Verify Query Store Status:

SELECT name, is_query_store_on

FROM sys.databases

WHERE name = ‘DatabaseName’;

– Action: Ensure is_query_store_on is set to 1.

Step 2: Identify the Problematic Query

– Query to Find High-Resource Queries

– Use Query Store to identify queries with high CPU usage, high I/O, or long duration.

– Query to Identify Top Resource-Consuming Queries:

SELECT TOP 5

qs.query_id,

qt.query_sql_text,

p.plan_id,

rs.avg_duration / 1000 AS AvgDuration_ms,

rs.avg_cpu_time / 1000 AS AvgCPUTime_ms,

rs.avg_logical_io_reads AS AvgLogicalReads

FROM sys.query_store_query_text qt

JOIN sys.query_store_query qs ON qt.query_text_id = qs.query_text_id

JOIN sys.query_store_plan p ON qs.query_id = p.query_id

JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id

ORDER BY rs.avg_duration DESC;

– Action: Identify the query_id and plan_id of the poorly performing query for further analysis.

Part 2: Analyzing Historical Query Plans

Step 3: View Historical Execution Plans

– Obtain Historical Plans

– Use the query_id from Step 2 to analyze all historical execution plans for the given query.

– Query to View All Plans for a Query:

SELECT

p.plan_id,

p.is_forced_plan,

rs.avg_duration / 1000 AS AvgDuration_ms,

rs.execution_type_desc,

p.query_plan

FROM sys.query_store_plan p

JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id

WHERE p.query_id = <query_id>

ORDER BY rs.avg_duration DESC;

– Action: Analyze the AvgDuration_ms and execution_count to identify which plan has the best performance.

Step 4: Compare Execution Plans

– Visualize and Compare Plans

– Copy the query_plan XML output and load it into SQL Server Management Studio (SSMS) to visualize the plan.

– Compare different plans to identify differences, such as join type changes, index usage, or parallelism.

– Key Elements to Look For:

– Index Scans vs. Seeks: Index seeks are generally more efficient than scans.

– Nested Loop vs. Hash Joins: Depending on the data size, Nested Loop may be preferable for smaller datasets, while Hash Joins are better for larger datasets.

– Parallelism: High CXPACKET waits may indicate excessive parallelism.

Part 3: Enforcing the Optimal Query Plan

Step 5: Identify the Best Plan to Force

– Select the Optimal Plan

– Based on the analysis in Step 4, choose the plan_id that shows the best average duration and resource usage.

– Force the Selected Plan

– Command to Force a Query Plan:

EXEC sp_query_store_force_plan @query_id = <query_id>, @plan_id = <plan_id>;

– Action: Forcing the optimal plan ensures that SQL Server consistently uses it, preventing performance regressions.

Step 6: Verify Plan Enforcement

-Find Forced Query Plans

Use the following query to identify all forced query plans in the Query Store:

SELECT
qs.query_id,
qt.query_sql_text,
p.plan_id,
p.is_forced_plan,
rs.avg_duration / 1000 AS AvgDuration_ms
FROM sys.query_store_plan p
JOIN sys.query_store_query qs ON p.query_id = qs.query_id
JOIN sys.query_store_query_text qt ON qs.query_text_id = qt.query_text_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
WHERE p.is_forced_plan = 1;

Action: Review and monitor all forced plans to ensure they are performing as expected.

Check if Plan is Successfully Forced

  – Query to Verify Forced Plan:

SELECT

p.plan_id,

p.is_forced_plan,

rs.avg_duration / 1000 AS AvgDuration_ms

FROM sys.query_store_plan p

JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id

WHERE p.query_id = <query_id>;

Action: Confirm that is_forced_plan is set to 1 for the chosen plan.

Part 4: Validation and Post-Optimization Monitoring

Step 7: Monitor Query Performance

– Track the Impact of the Forced Plan

  – Use Query Store to track the performance of the forced plan over time.

  – Query to Monitor Forced Plan Performance:

SELECT

p.plan_id,

rs.execution_count,

rs.avg_duration / 1000 AS AvgDuration_ms,

rs.last_execution_time

FROM sys.query_store_plan p

JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id

WHERE p.query_id = <query_id>

ORDER BY rs.last_execution_time DESC;

Action: Verify that the AvgDuration_ms has improved and that the execution times are consistent.

Step 8: Remove Plan Forcing if Needed

Unforce the Plan

– If the forced plan does not continue to perform well, you can unforce it to allow SQL Server to choose the best plan dynamically.

– Command to Unforce a Plan:

EXEC sp_query_store_unforce_plan @query_id = <query_id>, @plan_id = <plan_id>;

Action: Re-evaluate the query performance and consider further optimizations if required.

Step 9: Use the Query Store to Identify Missing Indexes (Additional Usage)

Query to Identify Missing Indexes using Query Store

WITH XMLNAMESPACES (DEFAULT ‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’)

SELECT qsqt.query_sql_text,

rts.plan_id,

rts.NumExecutions,

rts.MinDuration,

rts.MaxDuration,

rts.AvgDuration,

rts.AvgReads,

rts.AvgWrites,

qsp.QueryPlan,

qsp.QueryPlan.value(N'(//MissingIndex/@Table)[1]’,

‘NVARCHAR(256)’) AS TableName,

qsp.QueryPlan.value(N'(//MissingIndex/@Schema)[1]’,

‘NVARCHAR(256)’) AS SchemaName,

qsp.QueryPlan.value(N'(//MissingIndexGroup/@Impact)[1]’,

‘DECIMAL(6,4)’) AS ProjectedImpact,

ColumnGroup.value(‘./@Usage’,

‘NVARCHAR(256)’) AS ColumnGroupUsage,

ColumnGroupColumn.value(‘./@Name’,

‘NVARCHAR(256)’) AS ColumnName

FROM sys.query_store_query AS qsq

JOIN sys.query_store_query_text AS qsqt

ON qsqt.query_text_id = qsq.query_text_id

JOIN (   SELECT query_id,

CAST(query_plan AS XML) AS QueryPlan,

plan_id

FROM sys.query_store_plan) AS qsp

ON qsp.query_id = qsq.query_id

JOIN (   SELECT qsrs.plan_id,

SUM(qsrs.count_executions) AS NumExecutions,

MIN(qsrs.min_duration) AS MinDuration,

MAX(qsrs.max_duration) AS MaxDuration,

AVG(qsrs.avg_duration) AS AvgDuration,

AVG(qsrs.avg_logical_io_reads) AS AvgReads,

AVG(qsrs.avg_logical_io_writes) AS AvgWrites

FROM sys.query_store_runtime_stats AS qsrs

GROUP BY qsrs.plan_id) AS rts

ON rts.plan_id = qsp.plan_id

CROSS APPLY qsp.QueryPlan.nodes(‘//MissingIndexes/MissingIndexGroup/MissingIndex/ColumnGroup’) AS t1(ColumnGroup)

CROSS APPLY t1.ColumnGroup.nodes(‘./Column’) AS t2(ColumnGroupColumn);

Action: Evaluate the data with respect to the query plan and create the index needed.

Step 9: Set Up Alerts for Plan Regressions

– Configure Alerts

– Set up SQL Server Agent Alerts to notify the DBA team if a query plan regression occurs or if a forced plan starts to degrade in performance.

– Use tools like Extended Events to monitor for plan changes and alert when changes happen that could impact performance.

Conclusion

Using Query Store to troubleshoot query plan optimization allows for better control over query performance by selecting and enforcing the best execution plan. By following these steps, you can leverage Query Store to identify problematic queries, analyze historical plans, and ensure consistent performance for critical workloads.

 

Recent Posts