In high-performance SQL Server environments, how you "slice" your CPU resources is just as important as how many cores you have. We recently tackled a case where a customer was plagued by high SOS_SCHEDULER_YIELD and CXPACKET waits. The solution wasn't adding more power... it was restoring balance.

The Symptom: Heavy Waits and Uneven Load

The customer reported a sluggish system where CPU waits weren't just high... they were inconsistent. Upon investigation, we noticed that some nodes were working six times harder than others. This imbalance was causing a massive surge in:

  • SOS_SCHEDULER_YIELD: Signalling that tasks were using up their scheduler quantum (4 ms) before finishing their work, yielding the CPU, and then queuing for another turn.
  • CXPACKET: Indicating that parallel threads were stuck waiting for their "unbalanced" counterparts to catch up.
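
For readers who want to check the same two wait types on their own instance, here is a minimal sketch against sys.dm_os_wait_stats. It is not the exact query used in this case; the signal_wait_pct column is an illustrative addition, and the numbers are cumulative since the last restart (or since the wait statistics were cleared), so they are best compared as deltas between two snapshots.

```sql
-- Cumulative wait totals for the two wait types discussed above.
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,
    -- Illustrative column: share of the wait spent queued for a CPU.
    CAST(100.0 * signal_wait_time_ms / NULLIF(wait_time_ms, 0) AS decimal(5, 2)) AS signal_wait_pct
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'SOS_SCHEDULER_YIELD', N'CXPACKET')
ORDER BY wait_time_ms DESC;
```

A high signal share on SOS_SCHEDULER_YIELD is expected, since that wait is essentially time spent in the runnable queue.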

The Investigation: The "Imperfect" VM Slice

The physical host was a powerhouse with 64 CPUs (arranged in a 2 x 32 layout). However, the Virtual Machine (VM) was allocated 60 vCPUs, presumably to leave 4 CPUs for the hypervisor.

While this seemed logical for the hypervisor, it created a "math problem" for SQL Server’s Soft-NUMA:

  • The VM presented 4 nodes of 15 CPUs.
  • SQL Server’s Soft-NUMA tried to further optimize this by splitting each 15-CPU node in two. Because 15 is odd, the result was 8 nodes with an uneven split: alternating between 8 and 7 CPUs per node.

This slight asymmetry, with nodes of different sizes, created "scheduling friction": the SQL OS scheduler couldn't distribute work evenly, which led to the massive disparity in wait times across nodes.
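
One way to see this kind of layout and imbalance is to group the schedulers by node. The query below is a sketch rather than the query from this investigation; it assumes that per-node scheduler counts and summed yield counts are a reasonable proxy for "how hard each node is working".

```sql
-- Schedulers per node, plus rough per-node load indicators.
SELECT
    parent_node_id,
    COUNT(*)                         AS schedulers_in_node,
    SUM(current_tasks_count)         AS current_tasks,
    SUM(runnable_tasks_count)        AS runnable_tasks,
    SUM(CAST(yield_count AS bigint)) AS yields,
    AVG(CAST(load_factor AS float))  AS avg_load_factor
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE'     -- skip hidden and DAC schedulers
GROUP BY parent_node_id
ORDER BY parent_node_id;
```

In a layout like the one described, schedulers_in_node would alternate between 8 and 7. The yield counts are cumulative, so two snapshots taken a few minutes apart give a fairer comparison between nodes.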

 

The Fix: Simplicity Over Complexity

Since the underlying VM configuration was already segmented into 4 nodes, we decided that the additional layer of Soft-NUMA was doing more harm than good. We recommended a return to a simpler, symmetrical topology.

We disabled Soft-NUMA using the following command:

ALTER SERVER CONFIGURATION SET SOFTNUMA = OFF;

GO
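
The setting only takes effect after the Database Engine is restarted. A quick check afterwards might look like the sketch below. It assumes SQL Server 2016 or later, where sys.dm_os_sys_info exposes the softnuma_configuration columns.

```sql
-- 0 = OFF, 1 = automatic, 2 = manual
SELECT
    softnuma_configuration,
    softnuma_configuration_desc,
    cpu_count
FROM sys.dm_os_sys_info;

-- Node layout as SQL Server sees it, excluding the DAC node.
SELECT
    node_id,
    node_state_desc,
    memory_node_id,
    online_scheduler_count
FROM sys.dm_os_nodes
WHERE node_state_desc NOT LIKE N'%DAC%'
ORDER BY node_id;
```

With Soft-NUMA off, the second query should return one row per NUMA node presented by the VM, each with the same online_scheduler_count.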

The Results: Stability in Symmetry

After restarting the service to apply the change, the 60 CPUs were reorganized into 4 clean, identical nodes of 15 CPUs each. The impact was immediate:

  • SOS_SCHEDULER_YIELD: Dropped by over 40%.
  • CXPACKET: Decreased by approximately 12%.
  • Load Distribution: The "6x difference" between nodes vanished, replaced by a balanced, even distribution of work.

 

Key Takeaway

More features aren't always better. Soft-NUMA is a powerful tool for modern high-core CPUs, but if your VM's vCPU count doesn't divide cleanly into symmetrical nodes, it can create "ghost" bottlenecks. When in doubt, check your node alignment. Symmetry is often what lets the scheduler spread work evenly.

 

 
