Tuesday, September 29, 2009

SQL Performance Test Harness

SQL performance test harness.

Occasionally, I like to be able to test concepts in SQL server and be able to tell if a particular method is faster or slower. For example, a question might be "are table variables or temp tables more efficient and under what circumstances?" What I have done is to create a small testing harness for the purposes of testing various concepts. In the example below, the test I wished to perform involved insert triggers vs default field values. I had run across a colleague’s code where he had created an insert/update trigger to set a last modified field value. The trigger looked something like this.



ALTER TRIGGER [dbo].[tr_UpdateDateTime]
ON [dbo].[tbl_DataTable]
FOR INSERT,UPDATE
AS
BEGIN
UPDATE dbo.tbl_DataTable SET LastUpdated = GETDATE()
FROM dbo.tbl_DataTable DT
INNER JOIN Inserted I ON DT.int_ID = I.int_ID
END


This particular table actually doesn't see much in the way of updates. Most of the activity on this table is through inserts. In my opinion, we could speed up the operations on this table by making the field in question have a default value of GetDate() and then making this trigger be an UPDATE trigger only. In order to backup my assertion, I had to have some concrete numbers. Here are the steps that I used.

1. Create a test table to mimic the production table.
2. Setup an identical trigger on the test table.
3. Run a test of X number of inserts and record how long it took.
4. Remove the INSERT portion of the trigger.
5. Add a default value of GetDate() to the table.
6. Run the test again and record the results.
7. Compare and see if I was right.

In order to gather accurate results, I was going to have to measure the time taken using the SQL environment itself. One other thing to consider is the SQL caching mechanism. In other words, the first run of a piece of code will take longer than subsequent runs. In order to account for this, I had to setup a mechanism to repeat the test over and over and then gather an average of the results. The SQL code below will allow for me to execute my test and gather the needed statistics. Here is the result.



Set Nocount On

-- Holds the time we started a given run
Declare @StartTime As DateTime
-- One of the variables in our test.
Declare @RecordCount as Integer
-- Holds the stats for each run.
Declare @TimingTable as Table
(Run int, -- Records which test run we are on.
Tme BigInt, -- Time for that run in MS
Iter Int, -- Iterations of the target statement.
TmePerCall Decimal(18,10)) -- Avg MS per Run

-- This is how many times we are going to
-- execute the test and collect stats
Declare @RunCount int
Set @RunCount = 1 -- Start at 1

While @RunCount < = 5 -- End at 5 tests.
Begin
Set @StartTime = GetDate()
Set @RecordCount = 1

-- For my test, I want to start with an
-- empty table on each iteration
Truncate Table PerfTest
While @RecordCount <= 10000
Begin

/********************************************/
/* This is the actual Test code that I want */
/* to execute multiple times */
/********************************************/
Insert Into PerfTest (F2,F3,F4,F5)
Values ('Some Text','Some Longer Text',
@RecordCount, NEWID())

-- Increment Counter
Set @RecordCount = @RecordCount + 1
End

-- Record stats for this run.
Insert Into @TimingTable (Run,Tme,Iter)
-- Difference to be recorded in Milliseconds.
Values (@RunCount, DateDiff(ms,@StartTime,getDate()), @RecordCount -1)

-- Increment Counter
Set @RunCount = @RunCount + 1
End

Update @TimingTable
Set TmePercall = convert(Decimal(18,10),Tme) /
convert(Decimal(18,10),Iter)
Select * From @TimingTable
Select COUNT(*) Runs, AVG(Tme) AverageTime,
AVG(TmePercall) AvgTimePerCall From @TimingTable


Let's break down the script. to begin with, we declare the @StartTime. This will be a time date variable that will record the start of any given test. The @RecordCount variable will the counter for the number of records we will insert on any given run. The heart of the statistics gathering is the @TimingTable. This table variable will hold the results of the runs so that we can query them later for information about our test. The outer loop based on @RunCount is the number of times we wish to execute our test. This can usually be a low number (under 10) and is there to counteract the effects of SQL caching. Within the RunCount loop, I truncate my table and then enter a second loop where I can perform inserts (which is what I am testing in this case). This loop may be much larger because I want to simulate bulk loads of the system. As I exit the @RecordCount loop, that is when I gather the statistics in the @TimingTable. Finally, I increment the run count and do it all again.

At the end of the test, I have now gathered information in the @TimingTable. Here is what each field represents.

Run = Which run we are dealing with. This maps back to the @RunCount variable.
Tme = The amount of time, in Milliseconds, the run took.
Iter = The number of iterations in this particular run. This maps back to the @RecordCount variable.
TmePerCall = This is manufactured at the end of the test to give an average of how long each call took in MS.

Once all the stats are gathered, I can take averages of all the fields and get the final results. Using this tool, I found out that the trigger method takes almost twice as long as the default field value. This corraborated my theory and gave me some hard numbers to go back to my co-worker and make the case for making the change. In the end, the final goal is to produce fast, efficient code that meets the specifications. It was a quick change that sped up the system and everyone was able to see a better way to do it.

You can take the above routine and modify it to fit your own needs. Hope this helps.