Query Analysis

If you’ve read my previous posts you know I periodically run a system query to identify the queries that are presenting the heaviest load for the SQL server. Today, the query on the top was one that I wrote a few months ago related to a dashboard report. Here’s the query:

When I displayed the query execution plan, I got two nodes that were worth talking about:

The second one is from a subquery where I union of the finished items table and the wip table that gets the total production. We’re going to focus on the first one, that’s 59%. This is related to the center of the query, where we figure out how long a job has run.

The TemProductionTime table has a row for each timestamp every time a job’s active status is modified. In order to determine how long a job was run, you have to look at two rows – the first will have a status of ‘Run’ and will have the start date, whereas the next one with the same job number will contain the Run’s end date. In the query, I’m using a left join with a third not exists join/clause to make sure that the runEnd is the very next timestamp.

I frequently gain insights on my work either when I’m sleeping, when I’m on my commute, or when I’m at the toilet. This was one of the latter. I decided, why not try min(timestamp) on the second table and eliminate the not exists clause?

This didn’t initially work – it got a different number of rows. But upon further investigation, turns out there were some duplicate rows being returned by the old query. So the new one is actually more correct! With this change, the entire query went from 24 seconds to 6.

Leave a comment