In the world of database administration, index maintenance is a necessary evil. Done right, it keeps your queries snappy; done wrong, it becomes a resource-hungry monster that eats your maintenance window alive. Recently, we faced a challenge where our SQL Agent job wasn't just hungry... it was failing consistently, with runtimes occasionally exceeding 15 hours.

Here is the story of how we moved from a crumbling 15-hour process to a streamlined, stable system.

 

Phase 1: Divide and Conquer

Our first instinct was to handle the sheer volume of data by splitting the workload. We identified the two largest tables in the database and gave them their own dedicated job. Everything else was handled by a second job.

  • Result: The total duration was split, but each job still took roughly 8 hours.
  • The Problem: While the load was distributed, the stability didn't improve. We were still seeing frequent crashes caused by locks and deadlocks with other server processes.
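This kind of split is straightforward to express with IndexOptimize's @Indexes parameter, which accepts both an include list and minus-prefixed exclusions. A minimal sketch, assuming a database called ProdDB and illustrative table names:

```sql
-- Job 1: dedicated to the two largest tables (names are illustrative)
EXECUTE dbo.IndexOptimize
    @Databases           = 'ProdDB',
    @Indexes             = 'ProdDB.dbo.BigTableA, ProdDB.dbo.BigTableB',
    @FragmentationMedium = 'INDEX_REORGANIZE',
    @FragmentationHigh   = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @LogToTable          = 'Y';

-- Job 2: everything else, with the big tables excluded via the minus prefix
EXECUTE dbo.IndexOptimize
    @Databases           = 'ProdDB',
    @Indexes             = 'ALL_INDEXES, -ProdDB.dbo.BigTableA, -ProdDB.dbo.BigTableB',
    @FragmentationMedium = 'INDEX_REORGANIZE',
    @FragmentationHigh   = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @LogToTable          = 'Y';
```

Each call becomes the step of its own SQL Agent job, so the two can be scheduled independently.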

Phase 2: Identifying the Conflict Zones

Splitting by size wasn't enough; we had to split by "behavior." By analyzing the logs, we identified specific objects that were frequently involved in deadlocks.

We created a third job specifically for these problematic objects and shifted its execution to a different time slot.

  • Result: This significantly increased stability and consistency. The jobs were finally finishing without crashing.
  • The Remaining Hurdle: The runtime was still a massive 6 hours per job. We had solved whether the jobs finish; we had not yet solved how long they take.
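As a sketch, the third job is just another IndexOptimize call restricted to the conflict-prone objects (names illustrative), scheduled into its own quieter time slot. The stock @LockTimeout parameter also helps in this situation: a blocked operation gives up after a set number of seconds instead of sitting in a lock chain.

```sql
-- Job 3: the deadlock-prone objects, run in a different time slot
EXECUTE dbo.IndexOptimize
    @Databases   = 'ProdDB',
    @Indexes     = 'ProdDB.dbo.HotTableA, ProdDB.dbo.HotTableB',  -- illustrative names
    @LockTimeout = 300,   -- seconds to wait for a lock before abandoning the command
    @LogToTable  = 'Y';
```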

 

Phase 3: The "Aha!" Moment – Hunting the Initialization Lag

When a job takes 6 hours, you usually look at the data movement. However, we dug deeper and found something shocking: The maintenance initialization was taking 4–5 hours before a single index was even touched.

Because the database contained a massive number of tables, the standard scripts were spending hours just "thinking" about what to do.
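You can spot this kind of lag without a profiler trace. Assuming IndexOptimize runs with @LogToTable = 'Y', comparing the Agent job's start time against the first row written to dbo.CommandLog shows how long initialization took; the job name below is hypothetical:

```sql
-- How long did the last run spend "thinking" before the first index command?
SELECT TOP (1)
       a.run_requested_date AS JobStart,
       c.StartTime          AS FirstIndexCommand,
       DATEDIFF(MINUTE, a.run_requested_date, c.StartTime) AS InitMinutes
FROM msdb.dbo.sysjobactivity AS a
JOIN msdb.dbo.sysjobs        AS j ON j.job_id = a.job_id
JOIN dbo.CommandLog          AS c ON c.StartTime >= a.run_requested_date
WHERE j.name = N'IndexOptimize - ProdDB'   -- hypothetical job name
ORDER BY a.run_requested_date DESC, c.StartTime ASC;
```

If JobStart and FirstIndexCommand are hours apart, the time is going into metadata evaluation, not index work.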

Phase 4: Optimizing the Best in the Business

Most of us rely on Ola Hallengren’s legendary dbo.IndexOptimize procedure. It is the industry standard for a reason, but even the best tools need a tune-up for extreme edge cases.

To handle our specific "huge table count" scenario, we modified the procedure to create a specialized version: IndexOptimize_OptimizedForLargeDB.

The impact was immediate and dramatic:

  • Initialization Time: Reduced from 5 hours to just 5 minutes.
  • Total Job Duration: Shortened by several hours, finally fitting comfortably into our maintenance window.
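Our gains came from the modified procedure, but it is worth noting that the stock procedure already ships a guardrail for maintenance windows: @TimeLimit stops new commands from being issued once the given number of seconds has elapsed. A hedged example:

```sql
-- Stop issuing new index commands after 4 hours so the job
-- cannot overrun the maintenance window.
EXECUTE dbo.IndexOptimize
    @Databases  = 'ProdDB',
    @TimeLimit  = 14400,   -- seconds (4 hours)
    @LogToTable = 'Y';
```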

 

Key Takeaways for DBAs

  1. Don't just look at the work; look at the prep: If your jobs are slow, trace them from the very first second. You might be losing hours to metadata overhead.
  2. Logs are a roadmap: Identifying the specific objects involved in deadlocks allowed us to schedule around the conflict rather than just throw more hardware at it.
  3. Customize when necessary: Even world-class scripts can be optimized for your specific environment.

By moving from a "one-size-fits-all" approach to a specialized, optimized procedure, we turned a failing 15-hour marathon into a stable, high-speed sprint that now finishes in under an hour.