
In high-performance SQL Server environments, how you "slice" your CPU resources is just as important as how many cores you have. We recently tackled a case where a customer was plagued by high SOS_SCHEDULER_YIELD and CXPACKET waits. The solution wasn't adding more power; it was restoring balance.

The Symptom: Heavy Waits and Uneven Load

The customer reported a sluggish system where CPU waits weren't just high; they were inconsistent. Upon investigation, we noticed that some NUMA nodes were carrying roughly six times the load of others. This imbalance was causing a massive surge in:

  • SOS_SCHEDULER_YIELD: Signalling that tasks were using up their scheduler quantum (4 ms), yielding the CPU, and then waiting in the runnable queue for their next turn.
  • CXPACKET: Indicating that parallel threads were stuck waiting for their "unbalanced" counterparts to catch up.
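As a rough way to check for the same symptom, the two wait types can be read from sys.dm_os_wait_stats. This is a minimal sketch, not the query used in the investigation; the signal_pct column is an illustrative calculation added here, and the totals are cumulative since the last restart or since the wait statistics were last cleared.

```sql
-- Cumulative wait totals for the two wait types discussed above.
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,
    -- Illustrative: share of the wait spent queued for CPU after being signalled.
    CAST(100.0 * signal_wait_time_ms / NULLIF(wait_time_ms, 0) AS decimal(5, 2)) AS signal_pct
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'SOS_SCHEDULER_YIELD', N'CXPACKET')
ORDER BY wait_time_ms DESC;
```

Because the counters only ever grow, comparing two snapshots taken some minutes apart gives a fairer picture than a single reading.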

The Investigation: The "Imperfect" VM Slice

The physical host was a powerhouse with 64 CPUs (arranged in a 2 x 32 layout). However, the Virtual Machine (VM) was allocated 60 CPUs, presumably to leave 4 cores for the hypervisor.

While this seemed logical for the hypervisor, it created a "math problem" for SQL Server’s Soft-NUMA:

  • The VM presented 4 nodes of 15 CPUs.
  • SQL Server’s Soft-NUMA tried to further optimize this, resulting in 8 nodes with an uneven split: alternating between 8 and 7 CPUs per node.

This slight asymmetry (nodes of different sizes) created "scheduling friction": the SQLOS scheduler couldn't distribute work evenly, which led to the massive disparity in wait times across nodes.
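One way to see the node layout SQL Server actually built is to count online schedulers per node in sys.dm_os_schedulers. This is a sketch under the assumption that each visible online scheduler maps to one vCPU; on the configuration described above it would be expected to return eight rows alternating between 8 and 7 schedulers.

```sql
-- Count the schedulers available to user workloads in each (soft-)NUMA node.
SELECT
    parent_node_id,
    COUNT(*) AS online_schedulers
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE'   -- excludes hidden and DAC schedulers
GROUP BY parent_node_id
ORDER BY parent_node_id;
```

If the counts differ from node to node, the topology is asymmetrical in the sense discussed here.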

The Fix: Simplicity Over Complexity

Since the underlying VM configuration was already segmented into 4 nodes, we decided that the additional layer of Soft-NUMA was doing more harm than good. We recommended a return to a simpler, symmetrical topology.

We disabled Soft-NUMA using the following command:

ALTER SERVER CONFIGURATION SET SOFTNUMA OFF;
GO
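To confirm which Soft-NUMA mode the running instance is using, sys.dm_os_sys_info exposes the setting in SQL Server 2016 and later. A minimal check, assuming that version or newer:

```sql
-- softnuma_configuration_desc reports OFF, ON (automatic) or MANUAL.
SELECT
    softnuma_configuration,
    softnuma_configuration_desc,
    cpu_count
FROM sys.dm_os_sys_info;
```

The value reflects the configuration the instance started with, so it only changes after the service has been restarted.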

The Results: Stability in Symmetry

After restarting the service to apply the change, the 60 CPUs were reorganized into 4 clean, identical nodes of 15 CPUs each. The impact was immediate:

  • SOS_SCHEDULER_YIELD: Dropped by over 40%.
  • CXPACKET: Decreased by approximately 12%.
  • Load Distribution: The "6x difference" between nodes vanished, replaced by a balanced, even distribution of work.
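A per-node comparison like the one behind the load-distribution figure can be approximated from sys.dm_os_schedulers. This is only a sketch: the choice of columns is an assumption about what "load" means, not the metric used in the original analysis, and total_cpu_usage_ms requires SQL Server 2016 or later.

```sql
-- Aggregate scheduler activity per node to compare how evenly work is spread.
SELECT
    parent_node_id,
    COUNT(*)                         AS online_schedulers,
    SUM(current_tasks_count)         AS current_tasks,
    SUM(runnable_tasks_count)        AS runnable_tasks,   -- tasks queued for CPU right now
    SUM(CAST(yield_count AS bigint)) AS yields,           -- cumulative since startup
    SUM(total_cpu_usage_ms)          AS cpu_ms            -- cumulative since startup
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE'
GROUP BY parent_node_id
ORDER BY parent_node_id;
```

With a balanced topology, the cumulative columns should grow at a similar rate on every node between two samples.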

Key Takeaway

More features aren't always better. Soft-NUMA is a powerful tool for modern high-core CPUs, but if your VM vCPU count doesn't divide cleanly into symmetrical nodes, it can create "ghost" bottlenecks. When in doubt, check your node alignment. Symmetry is often the key to sub-millisecond performance.
