Query Plans In The Race Of Extended Events

In the previous articles we took a look at some simple ways to hunt for those valuable query plans and then at how we can accomplish the same using SQL Trace. There is yet another way of discovering and extracting query plans from a live SQL Server instance, and that is to use Extended Events.
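
For a quick taste before the full article, here is a minimal sketch of such a session. Treat it as an illustrative assumption rather than the article's exact setup: the session name and file target are arbitrary, and capturing post-execution plans adds noticeable overhead, so use it sparingly.

-- capture actual (post-execution) plans via Extended Events
CREATE EVENT SESSION [CapturePlans] ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan
(
	ACTION (sqlserver.sql_text, sqlserver.database_name)
)
ADD TARGET package0.event_file (SET filename = N'CapturePlans.xel');

ALTER EVENT SESSION [CapturePlans] ON SERVER STATE = START;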

Continue reading “Query Plans In The Race Of Extended Events”

SQL Trace And The Stream Of Query Plans

In 8 Ways To Hunt T-SQL Query Plans, we looked at some simple ways to get our hands on those valuable query plans. However, sometimes we might want to record query plans as part of a bigger effort to profile and troubleshoot a server, or perhaps optimize it for a given workload.

When this need arises, there are two additional tools we can make use of:

  • SQL Trace
  • Extended Events

In this pumpkin we’ll take a look at how we can use SQL Trace to accomplish this goal.

Continue reading “SQL Trace And The Stream Of Query Plans”

8 Ways To Hunt T-SQL Query Plans

In How Can Query Plans Help Us? we looked at how query plans can be made to help us troubleshoot a runaway query. Then in Are Query Plans All We Need? we looked at other tools we need to keep ready in our performance tuning utility belt. With that in mind, it is now time to understand how we can capture query plans to begin with.

So Where Do We Start?

To understand how we can capture query plans, we must first understand what types of query plans are there to capture. Luckily, there are only two of them.
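
To make the distinction concrete, here is one simple way of asking SQL Server for each kind from a query window. This is only a hedged illustration; the article itself covers several more options.

-- estimated plan: the statement is compiled but not executed
SET SHOWPLAN_XML ON;
GO
SELECT TOP (10) * FROM sys.objects;
GO
SET SHOWPLAN_XML OFF;
GO

-- actual plan: the statement runs and the plan includes runtime statistics
SET STATISTICS XML ON;
GO
SELECT TOP (10) * FROM sys.objects;
GO
SET STATISTICS XML OFF;
GO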

Continue reading “8 Ways To Hunt T-SQL Query Plans”

How Can Query Plans Help Us?

In T-SQL Query Plan Analysis Prelude we looked at some common reasons why queries sometimes end up in a snail speed contest, and at how query plans are one of the ways we can learn to transform those carefree molluscs into caffeinated cheetahs.

So how exactly can query plans help us do that? Are they like magic wands that turn over-ripe pumpkins into dubious-expiry carriages and make used teeth disappear in return for equally used pennies?

Well… Hold that thought for a minute.

Continue reading “How Can Query Plans Help Us?”

T-SQL Query Plan Analysis Prelude

Have you ever found yourself pulling your own hair with both hands, desperate to understand why that stubborn T-SQL query runs so desperately slowly? Were you intrigued by the cool if complex visuals of its query plan? Did you try to turn its knowledge into insight but couldn’t tell a Hash Match from a Nested Loop?

Query Plan Example

If so, this series might be for you.

Continue reading “T-SQL Query Plan Analysis Prelude”

How To Add NonClustered ColumnStore Indexes To Every Table

Sometimes I feel really lazy. I’m just tinkering with a proof-of-concept database and a decent amount of test data; I want to do some moderately fast querying over it and don’t really feel like creating proper indexes for everything just yet. I really only want to slap a ColumnStore on everything at once so I can get on with work and run some fast queries over the data.

Luckily, that’s what the cheat script below is for:

/* set the prefix for the indexes here */
DECLARE @IndexPrefix NVARCHAR(MAX) = 'TEST_NCCSI_';

/* don't worry about these */
DECLARE @SQL NVARCHAR(MAX) = '';
DECLARE @NL NVARCHAR(MAX) = NCHAR(13) + NCHAR(10);
DECLARE @TB NVARCHAR(MAX) = NCHAR(9);

/* and off we go */
WITH Query AS
(
	SELECT
		SchemaId = S.schema_id,
		SchemaName = S.name,
		TableId = T.object_id,
		TableName = T.name,
		ColumnId = C.column_id,
		ColumnName = C.name,
		IsFirstColumn = CASE WHEN C.column_id = MIN(C.column_id) OVER (PARTITION BY S.schema_id, T.object_id) THEN 1 ELSE 0 END,
		IsLastColumn = CASE WHEN C.column_id = MAX(C.column_id) OVER (PARTITION BY S.schema_id, T.object_id) THEN 1 ELSE 0 END
	FROM
		sys.tables AS T
		INNER JOIN sys.schemas AS S
			ON S.schema_id = T.schema_id
		INNER JOIN sys.columns AS C
			ON C.object_id = T.object_id
)
SELECT
	@SQL +=

		/* add ddl statement with first column */
		CASE WHEN Q.IsFirstColumn = 1 THEN
				'CREATE NONCLUSTERED COLUMNSTORE INDEX ' + QUOTENAME(@IndexPrefix + Q.TableName) + ' ON ' + QUOTENAME(Q.SchemaName) + '.' + QUOTENAME(Q.TableName) + @NL +
				'('
		ELSE
			''
		END

		/* add column name */
		+
		@NL + @TB + QUOTENAME(Q.ColumnName)

		/* add comma after interim columns */
		+
		CASE WHEN Q.IsLastColumn = 0 THEN ',' ELSE '' END

		/* close ddl after last column */
		+
		CASE WHEN Q.IsLastColumn = 1 THEN @NL + ')' + @NL + @NL ELSE '' END
FROM
	Query AS Q
ORDER BY
	SchemaId,
	TableId,
	ColumnId
;

SELECT @SQL;
PRINT @SQL;

Note that this straightforward script doesn’t try to partition-align the indexes, detect invalid columns, or do any other fancy stuff. If you do end up adding those details, feel free to comment and I’ll add them in too.
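
If you would rather run the generated DDL straight away instead of copying the printed output, a minimal (and hedged) extra step could be appended to the script above; just make sure you have reviewed @SQL first.

/* optional: execute the generated DDL directly */
EXECUTE sys.sp_executesql @stmt = @SQL;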

How To List Table Row Counts And Index Statistics

On getting an ETL performance testing system of some sort up and running, one of the Socratic questions that arises at some point is:

“When we run a workload, how much data is there in the database and how does it affect the workload itself?”

OK, so not as Socratic as “What is justice?”, as we can actually measure this one.

The starting point is, of course, to identify that data in the first place.
The snippets below provide a few ways of getting it out of SQL Server.

Row Counts Only

The starting point can be to just list row counts of all tables in the target database.
The short example below will do just that.

-- tables only, clustered or heap, nothing else
SELECT
	S.name AS SchemaName,
	T.name AS TableName,
	SUM(P.rows) AS Rows
FROM
	sys.tables AS T
	INNER JOIN sys.schemas AS S
		ON S.schema_id = T.schema_id
	INNER JOIN sys.partitions AS P
		ON P.object_id = T.object_id
WHERE
	T.type = 'U' -- user tables only
	AND P.index_id IN (0, 1) -- HEAP and CLUSTERED only
GROUP BY
	S.name,
	T.name
ORDER BY
	SchemaName,
	TableName
;

Row Counts Per Partition

Next, it can be interesting to look at row counts per partition. This might be helpful if you’re running some crazy updates on specific partitions as part of your ETL process and you’d like to check whether it makes too big a difference to run those over sparsely populated partitions versus highly populated ones.

-- tables only, clustered or heap, counts by partition
SELECT
	S.name AS SchemaName,
	T.name AS TableName,
	P.partition_number AS PartitionNumber,
	P.rows AS Rows
FROM
	sys.tables AS T
	INNER JOIN sys.schemas AS S
		ON S.schema_id = T.schema_id
	INNER JOIN sys.partitions AS P
		ON P.object_id = T.object_id
WHERE
	T.type = 'U' -- user tables only
	AND P.index_id IN (0, 1) -- HEAP and CLUSTERED only
ORDER BY
	SchemaName,
	TableName,
	PartitionNumber
;

Row Counts Per Partition & Index

It can also be interesting to see if ETL performance is affected by loads on top of tables with growing indexes, both in number and size.
This can go both ways, of course, but it also prepares for the next step.

-- tables, clustered or heap, plus indexes, counts by partition
SELECT
	S.name AS SchemaName,
	T.name AS TableName,
	P.partition_number AS PartitionNumber,
	P.index_id AS IndexId,
	I.name AS IndexName,
	I.type_desc AS IndexType,
	P.rows AS Rows
FROM
	sys.tables AS T
	INNER JOIN sys.schemas AS S
		ON S.schema_id = T.schema_id
	INNER JOIN sys.partitions AS P
		ON P.object_id = T.object_id
	INNER JOIN sys.indexes AS I
		ON I.object_id = P.object_id
		AND I.index_id = P.index_id
WHERE
	T.type = 'U' -- user tables only
ORDER BY
	SchemaName,
	TableName,
	PartitionNumber,
	IndexId
;

Row Counts Per Partition & Index, Plus Index Statistics

If we’re not yet content with the previous data, we can roll up our sleeves by digging into the index stats themselves.
Ask yourself: how many times have you seen that T-SQL-driven ETL run incredibly slowly and then magically fix itself when the underlying tables got rebuilt? As in REBUILD, really.
Fragmentation can have a big impact on any query, and ETL-ities are no exception.
The example below will grab basic fragmentation stats from all indexes, so you can see whether fragmentation is a real culprit or just a red herring in your workload.
Be careful running this query on a production system during business hours: it can be quite heavy, especially if you pump up the mode argument of sys.dm_db_index_physical_stats to increase the scan accuracy.

-- tables, clustered or heap, plus indexes, and physical stats counts by partition
SELECT
	S.name AS SchemaName,
	T.name AS TableName,
	P.partition_number AS PartitionNumber,
	P.index_id AS IndexId,
	I.name AS IndexName,
	I.type_desc AS IndexType,
	P.rows AS Rows,
	PIS.alloc_unit_type_desc AS IndexAllocationUnitType,
	PIS.index_depth AS IndexDepth,
	PIS.index_level AS IndexLevel,
	PIS.avg_fragmentation_in_percent AS IndexAverageFragmentationPercent,
	PIS.fragment_count AS IndexFragmentCount,
	PIS.avg_fragment_size_in_pages AS IndexAverageFragmentSizeInPages,
	PIS.page_count AS IndexPageCount
FROM
	sys.tables AS T
	INNER JOIN sys.schemas AS S
		ON S.schema_id = T.schema_id
	INNER JOIN sys.partitions AS P
		ON P.object_id = T.object_id
	INNER JOIN sys.indexes AS I
		ON I.object_id = P.object_id
		AND I.index_id = P.index_id
	CROSS APPLY sys.dm_db_index_physical_stats(
		/* database_id */ DB_ID(),
		/* object_id */ T.object_id,
		/* index_id */ I.index_id,
		/* partition_number */ P.partition_number,
		/* mode */ NULL) AS PIS
WHERE
	T.type = 'U' -- user tables only
ORDER BY
	SchemaName,
	TableName,
	PartitionNumber,
	IndexId
;
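
The mode argument is what controls that accuracy/cost trade-off. As a standalone, hedged example (the table name is hypothetical), a more thorough scan of a single table could look like this:

-- a more thorough scan of one table; NULL defaults to LIMITED, while SAMPLED and DETAILED read progressively more pages
SELECT
	PIS.index_id,
	PIS.avg_fragmentation_in_percent,
	PIS.page_count
FROM
	sys.dm_db_index_physical_stats(
		/* database_id */ DB_ID(),
		/* object_id */ OBJECT_ID('dbo.SomeLargeTable'), -- hypothetical table name
		/* index_id */ NULL,
		/* partition_number */ NULL,
		/* mode */ 'SAMPLED') AS PIS;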

All For All

At some point you might also need to grab a snapshot of some or all of the databases on your server for workload documentation purposes.
Doing this on all databases one by one can be time-consuming.
The example below lets you skip the RBAR pain and grab the same stats as the previous example, but for all the databases you need at once.
Remember to add a WHERE+LIKE filter on the first block to select only the database name patterns you want.
Again, be wary of running this on a production system during business hours.
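
As a hedged illustration (the 'Sales%' pattern is purely hypothetical), that filter could replace the INSERT at the top of the script below:

-- only pick up databases matching a name pattern, skipping the system databases
INSERT INTO @Databases (DatabaseName)
SELECT name
FROM sys.databases
WHERE name LIKE 'Sales%' -- adjust the pattern to your naming convention
	AND database_id > 4; -- master, tempdb, model and msdb are 1 to 4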

-- list databases to loop
DECLARE @Databases TABLE
(
	DatabaseKey INT IDENTITY (1, 1) NOT NULL PRIMARY KEY CLUSTERED,
	DatabaseName SYSNAME NOT NULL
)
INSERT INTO @Databases (DatabaseName)
SELECT name FROM sys.databases;

-- this will hold the final result
IF OBJECT_ID('tempdb.dbo.#Data') IS NOT NULL
DROP TABLE #Data;

CREATE TABLE #Data
(
	DatabaseName SYSNAME NOT NULL,
	SchemaName SYSNAME NOT NULL,
	TableName SYSNAME NOT NULL,
	PartitionNumber INT NOT NULL,
	IndexId INT NOT NULL,
	IndexName SYSNAME NULL,
	IndexType SYSNAME NOT NULL,
	Rows BIGINT NOT NULL,
	IndexAllocationUnitType SYSNAME NULL,
	IndexDepth TINYINT NULL,
	IndexLevel TINYINT NULL,
	IndexAverageFragmentationPercent FLOAT NULL,
	IndexFragmentCount BIGINT NULL,
	IndexAverageFragmentSizeInPages FLOAT NULL,
	IndexPageCount BIGINT NULL,

	PRIMARY KEY CLUSTERED
	(
		DatabaseName,
		SchemaName,
		TableName,
		PartitionNumber,
		IndexId
	)
);

DECLARE @DatabaseKey INT = 1;
DECLARE @MaxDatabaseKey INT = (SELECT MAX(DatabaseKey) FROM @Databases);
WHILE @DatabaseKey <= @MaxDatabaseKey
BEGIN

	DECLARE @DatabaseName SYSNAME =
	(
		SELECT
			DatabaseName
		FROM
			@Databases
		WHERE
			DatabaseKey = @DatabaseKey
	);

	DECLARE @SQL NVARCHAR(MAX) = '
		INSERT INTO #Data
		(
			DatabaseName,
			SchemaName,
			TableName,
			PartitionNumber,
			IndexId,
			IndexName,
			IndexType,
			Rows,
			IndexAllocationUnitType,
			IndexDepth,
			IndexLevel,
			IndexAverageFragmentationPercent,
			IndexFragmentCount,
			IndexAverageFragmentSizeInPages,
			IndexPageCount
		)
		SELECT
			''' + @DatabaseName + ''' AS DatabaseName,
			S.name AS SchemaName,
			T.name AS TableName,
			P.partition_number AS PartitionNumber,
			P.index_id AS IndexId,
			I.name AS IndexName,
			I.type_desc AS IndexType,
			P.rows AS Rows,
			PIS.alloc_unit_type_desc AS IndexAllocationUnitType,
			PIS.index_depth AS IndexDepth,
			PIS.index_level AS IndexLevel,
			PIS.avg_fragmentation_in_percent AS IndexAverageFragmentationPercent,
			PIS.fragment_count AS IndexFragmentCount,
			PIS.avg_fragment_size_in_pages AS IndexAverageFragmentSizeInPages,
			PIS.page_count AS IndexPageCount
		FROM
			' + QUOTENAME(@DatabaseName) + '.sys.tables AS T
			INNER JOIN ' + QUOTENAME(@DatabaseName) + '.sys.schemas AS S
				ON S.schema_id = T.schema_id
			INNER JOIN ' + QUOTENAME(@DatabaseName) + '.sys.partitions AS P
				ON P.object_id = T.object_id
			INNER JOIN ' + QUOTENAME(@DatabaseName) + '.sys.indexes AS I
				ON I.object_id = P.object_id
				AND I.index_id = P.index_id
			CROSS APPLY sys.dm_db_index_physical_stats(
				/* database_id */ DB_ID(''' + @DatabaseName + '''),
				/* object_id */ T.object_id,
				/* index_id */ I.index_id,
				/* partition_number */ P.partition_number,
				/* mode */ NULL) AS PIS
		WHERE
			T.type = ''U'' -- user tables only
		ORDER BY
			SchemaName,
			TableName,
			PartitionNumber,
			IndexId
		;
		';

	EXECUTE (@SQL);

	SET @DatabaseKey += 1;

END

SELECT * FROM #Data;

How To Extract Serialized Data From String In One Go In T-SQL

Consider the following data:

DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';

This data was being returned in an externally generated XML file, so it was out of the developer’s control.

Each element of information is delimited by a pipe (|).
Within each element, the first number, to the left of the equal sign (=), represents an “Item Number”.
The second number, to the right of the equal sign, represents an “Item Value”, i.e. the quantity ordered.
This can be any number and, if empty, it means zero.

The developer needed to return this data in the form of a table.
He also needed to return only the rows with an Item Value greater than zero.

In T-SQL, this conversion can be done in one go, using a single query.
Let’s go through the steps.

Remove Unnecessary Characters

First of all, if we take a look at the source data, we can clearly see an oddity there.
The semi-colon character seems to appear only when there is a declared Item Value, right after it.
This is extra information we don’t need, as we already have each block delimited by the pipe character.
So let’s take it out of the equation:

WITH Step0 AS
(
	SELECT
		String = REPLACE(@OrderString, ';', '')
),

Identify Elements

Now, to make our lives easier, let’s split each element, or “block”, into a separate row.
This can be accomplished with a good old recursive CTE:

Step1 AS
(
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step0

	UNION ALL
	
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step1
	WHERE
		LEN(String) > 0
),

Identify Columns

Now that we have a row identified for each block, we can then extract the columns themselves:

Step2 AS
(
	SELECT
		ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
		ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
	FROM
		Step1
),

Convert & Filter

Finally, now with proper columns in place, we can filter out the zero quantity items:

Step3 AS
(
	SELECT
		ItemNumber = CAST(ItemNumber AS INT),
		ItemValue = CAST(ItemValue AS INT)
	FROM
		Step2
	WHERE
		CAST(ItemValue AS INT) > 0
)

SELECT
	*
FROM
	Step3;

And that was it. Nothing to it.

Full Code

DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';

WITH Step0 AS
(
	SELECT
		String = REPLACE(@OrderString, ';', '')
),
Step1 AS
(
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step0

	UNION ALL
	
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step1
	WHERE
		LEN(String) > 0
),

Step2 AS
(
	SELECT
		ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
		ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
	FROM
		Step1
),

Step3 AS
(
	SELECT
		ItemNumber = CAST(ItemNumber AS INT),
		ItemValue = CAST(ItemValue AS INT)
	FROM
		Step2
	WHERE
		CAST(ItemValue AS INT) > 0
)

SELECT
	*
FROM
	Step3;
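
As an aside, on SQL Server 2016 or later the recursive CTE can be traded for STRING_SPLIT. The sketch below is an alternative take under those assumptions (STRING_SPLIT needs 2016+, TRY_CAST needs 2012+), not part of the original solution:

DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';

SELECT
	Parsed.ItemNumber,
	Parsed.ItemValue
FROM
	/* strip the semi-colons first, then split on the pipe */
	STRING_SPLIT(REPLACE(@OrderString, ';', ''), '|') AS S
	CROSS APPLY
	(
		SELECT
			/* appending '=' guards against the empty trailing element */
			ItemNumber = TRY_CAST(LEFT(S.value, CHARINDEX('=', S.value + '=') - 1) AS INT),
			ItemValue  = TRY_CAST(SUBSTRING(S.value, CHARINDEX('=', S.value) + 1, LEN(S.value)) AS INT)
	) AS Parsed
WHERE
	Parsed.ItemValue > 0;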

How To Estimate The Compression Ratio For All Tables In Database

Wondering how much space you could save by compressing all those disk-eating tables?

This snippet will look at all the tables in your database and estimate how much you would save by applying ROW or PAGE compression to them.
There is nothing special about it; it just runs sp_estimate_data_compression_savings against all your user tables and organizes the results.

The Code

SET NOCOUNT ON;

DECLARE @Objects TABLE
(
	ItemKey INT IDENTITY (1, 1) PRIMARY KEY CLUSTERED,
	SchemaName SYSNAME NOT NULL,
	ObjectName SYSNAME NOT NULL,
	IndexID INT NOT NULL,
	PartitionNumber INT NOT NULL,
	CurrentCompressionType SYSNAME NOT NULL,
	EstimatedSizeWithNoCompressionKB INT NULL,
	EstimatedSizeWithRowCompressionKB INT NULL,
	EstimatedSizeWithPageCompressionKB INT NULL,
	EstimatedCompressionRatioRowVsNone AS 100 - ((EstimatedSizeWithRowCompressionKB * 100) / (NULLIF(EstimatedSizeWithNoCompressionKB, 0))),
	EstimatedCompressionRatioPageVsNone AS 100 - ((EstimatedSizeWithPageCompressionKB * 100) / (NULLIF(EstimatedSizeWithNoCompressionKB, 0)))
);

INSERT INTO @Objects
(
	SchemaName,
	ObjectName,
	IndexID,
	PartitionNumber,
	CurrentCompressionType
)
SELECT
	S.name,
	T.name,
	I.index_id,
	P.partition_number,
	P.data_compression_desc
FROM
	sys.tables AS T
	INNER JOIN sys.schemas AS S
		ON S.schema_id = T.schema_id
	INNER JOIN sys.indexes AS I
		ON I.object_id = T.object_id
	INNER JOIN sys.partitions AS P
		ON P.object_id = T.object_id
		AND P.index_id = I.index_id
WHERE
	T.type = 'U'
ORDER BY
	S.name,
	T.name,
	I.index_id,
	P.partition_number
;

DECLARE @Results TABLE
(
	ObjectName SYSNAME NOT NULL,
	SchemaName SYSNAME NOT NULL,
	IndexID INT NOT NULL,
	PartitionNumber INT NOT NULL,
	SizeWithCurrentCompression INT NOT NULL,
	SizeWithRequestedCompression INT NOT NULL,
	SampleSizeWithCurrentCompression INT NOT NULL,
	SampleSizeWithRequestedCompression INT NOT NULL
);

DECLARE @ItemKey INT = 1;
DECLARE @LastKey INT = (SELECT MAX(ItemKey) FROM @Objects);
WHILE @ItemKey <= @LastKey
BEGIN

	DECLARE @SchemaName SYSNAME, @ObjectName SYSNAME, @IndexID INT, @PartitionNumber INT;
	SELECT
		@SchemaName = SchemaName,
		@ObjectName = ObjectName,
		@IndexID = IndexID,
		@PartitionNumber = PartitionNumber
	FROM
		@Objects
	WHERE
		ItemKey = @ItemKey;



	DELETE FROM @Results;
	INSERT INTO @Results
	EXECUTE sys.sp_estimate_data_compression_savings
		@schema_name = @SchemaName,
		@object_name = @ObjectName,
		@index_id = @IndexID,
		@partition_number = @PartitionNumber,
		@data_compression = 'NONE';
	UPDATE O
	SET
		O.EstimatedSizeWithNoCompressionKB = R.SizeWithRequestedCompression
	FROM
		@Objects AS O
		CROSS JOIN @Results AS R
	WHERE
		O.ItemKey = @ItemKey
		

	DELETE FROM @Results;
	INSERT INTO @Results
	EXECUTE sys.sp_estimate_data_compression_savings
		@schema_name = @SchemaName,
		@object_name = @ObjectName,
		@index_id = @IndexID,
		@partition_number = @PartitionNumber,
		@data_compression = 'ROW';
	UPDATE O
	SET
		O.EstimatedSizeWithRowCompressionKB = R.SizeWithRequestedCompression
	FROM
		@Objects AS O
		CROSS JOIN @Results AS R
	WHERE
		O.ItemKey = @ItemKey



	DELETE FROM @Results;
	INSERT INTO @Results
	EXECUTE sys.sp_estimate_data_compression_savings
		@schema_name = @SchemaName,
		@object_name = @ObjectName,
		@index_id = @IndexID,
		@partition_number = @PartitionNumber,
		@data_compression = 'PAGE';
	UPDATE O
	SET
		O.EstimatedSizeWithPageCompressionKB = R.SizeWithRequestedCompression
	FROM
		@Objects AS O
		CROSS JOIN @Results AS R
	WHERE
		O.ItemKey = @ItemKey

	SET @ItemKey += 1;
END

SELECT
	ItemKey,
	SchemaName,
	ObjectName,
	IndexID,
	PartitionNumber,
	CurrentCompressionType,
	EstimatedSizeWithNoCompressionKB,
	EstimatedSizeWithRowCompressionKB,
	EstimatedSizeWithPageCompressionKB,
	EstimatedCompressionRatioRowVsNone,
	EstimatedCompressionRatioPageVsNone
FROM
	@Objects
ORDER BY
	ItemKey
;

Thoughts

Remember to evaluate CPU usage when choosing a compression type, and be wary of it if you’re doing big loads into a table as part of an ETL process of some sort.
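
Once you have picked a winner for a given table or index, applying it is a short DDL statement. A hedged example (the object names below are placeholders):

-- apply PAGE compression to a whole table (its heap or clustered index)
ALTER TABLE dbo.MyBigTable REBUILD WITH (DATA_COMPRESSION = PAGE);

-- or target a single index, or even a single partition of it
ALTER INDEX IX_MyBigTable_SomeColumn ON dbo.MyBigTable
	REBUILD PARTITION = 3 WITH (DATA_COMPRESSION = ROW);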

How To Mock Database Objects For (Almost) Isolated Unit Testing In SSDT Database Projects

While writing a Stored Procedure that I wanted to properly unit test in SSDT, I faced this annoying little issue:

  • My Stored Procedure, let’s call it MainProcedure, had to call another Stored Procedure, let’s call it SubProcedure.
  • The SubProcedure had some dodgy and time-consuming side-effects that I didn’t want to see happen during a Unit Test. Plus, the SubProcedure was not the one being tested anyway.
  • Considering this, I wanted to isolate the code in MainProcedure using a Dependency Injection / Inversion Of Control pattern somehow.
  • Unfortunately, this is not supported out of the box in SSDT and, to be fair, IoC is not something the T-SQL language lends itself to easily at all: you can’t just pass a type-safe reference to a Stored Procedure in your parameters.

After some head scratching and searching the web for what wasn’t there, I arrived at a couple of options:

  • Don’t do anything and live with it. It’s always an option.
  • Mimic an IoC pattern by passing in a name for a dependent stored procedure… And then end up writing loads of dynamic SQL, thereby undermining the point of using SSDT in the first place.
  • Drop the SubProcedure before running the test, create a fake and then issue the original CREATE statement again… And end up with loads of code copied all over the place… And what if I wanted to mock tables?
  • Or swap the underlying SubProcedure before the test with a stub/fake and then swap it back after completion. Definitely cheating… Definitely Not IoC… But I wondered if it worked…

So let’s see what happened.

Main Procedure

First of all, I created some sample code to represent the MainProcedure.
Notice how this sample calls the SubProcedure, passing its own received parameters through and returning the SubProcedure’s return value. This will help us evaluate whether the correct code is being called later on.

CREATE PROCEDURE [dbo].[MainProcedure]
	@Param1 INT,
	@Param2 INT
AS

PRINT 'Running MainProcedure...';

PRINT 'Calling SubProcedure...';

DECLARE @ReturnValue INT;
EXECUTE @ReturnValue = dbo.SubProcedure
	@Param1 = @Param1,
	@Param2 = @Param2;

PRINT 'Done Calling SubProcedure...';

PRINT 'Done Running MainProcedure...';

RETURN @ReturnValue;

Sub Procedure

Then I created some sample code for the SubProcedure.
The WAITFOR is just there to make it really obvious when the sluggish procedure is being called; you can take it out if you wish.

CREATE PROCEDURE [dbo].[SubProcedure]
	@Param1 INT,
	@Param2 INT
AS

	PRINT 'Running SubProcedure...';

	/* do something really heavy here */
	WAITFOR DELAY '00:00:05';
	/* done */

	PRINT 'Done Running SubProcedure...';

RETURN 0

Test

Now it was time to see if I could swap the SubProcedure with a mock.
I created an SSDT Unit Test for the MainProcedure, and then went on to write some testing code.

Below is what I put in the Test code block, to start with.
I also added a Scalar Value Condition to the Test, to verify that @ResultValue would return the value 30. I would then make the mock SubProcedure return this, as opposed to the simple zero of the original code, providing evidence that the mock code was being called.

SET XACT_ABORT ON
BEGIN TRANSACTION

BEGIN TRY

	-- ARRANGE
	DECLARE @Param1 INT = 10;
	DECLARE @Param2 INT = 20;
	DECLARE @ResultValue INT;

	-- ACT
	EXECUTE @ResultValue = dbo.MainProcedure
		@Param1 = @Param1,
		@Param2 = @Param2;

	-- ASSERT
	SELECT @ResultValue;

END TRY
BEGIN CATCH

	ROLLBACK;
	THROW;

END CATCH;

Pre-Test

So now it was time for some good old cheating.
I stuffed the code below in the Pre-Test phase of this Unit Test.
This creates a mock at run time, just like you would do in plain .NET Unit Tests, and then swaps it in for the original code, whilst backing the original up too.
Notice we have to resort to Dynamic SQL here (sigh…) because we can’t include multiple batches in a SQL Server Unit Test (i.e. no GO for you).

SET XACT_ABORT ON
BEGIN TRANSACTION

BEGIN TRY

	-- create a mock object
	EXECUTE sp_executesql N'
	CREATE PROCEDURE dbo.SubProcedure_Mock
		@Param1 INT,
		@Param2 INT
	AS
		PRINT ''Mock SubProcedure Called With { @Param1 = '' + CAST(@Param1 AS VARCHAR(10)) + '', @Param2 = '' + CAST(@Param2 AS VARCHAR(10)) + '' }'';
		RETURN @Param1 + @Param2;
	';

	-- swap out the original object
	EXECUTE sp_rename 'SubProcedure', 'SubProcedure_Original';

	-- swap in the mock object
	EXECUTE sp_rename 'SubProcedure_Mock', 'SubProcedure';

END TRY
BEGIN CATCH

	ROLLBACK;
	THROW;

END CATCH

COMMIT;

Post-Test

To undo the blatant cheating of the Pre-Test phase, I also added the code below to the Post-Test phase.
This swaps the original code back in and then drops the mocked procedure that was created earlier.
I couldn’t help recalling a known issue with the Post-Test phase though: it won’t run if the Unit Test code fails in the first place.
But I would get back to that in a minute.

SET XACT_ABORT ON
BEGIN TRANSACTION

BEGIN TRY

	-- swap out the mock object
	EXECUTE sp_rename 'SubProcedure', 'SubProcedure_Mock';

	-- swap in the original object
	EXECUTE sp_rename 'SubProcedure_Original', 'SubProcedure';

	-- remove the mock object
	EXECUTE sp_executesql N'DROP PROCEDURE dbo.SubProcedure_Mock';

END TRY
BEGIN CATCH

	ROLLBACK;
	THROW;

END CATCH;

COMMIT;

Results

This actually worked better than expected.
When I ran the MainProcedure on its own in SSMS, like so:

-- ARRANGE
DECLARE @Param1 INT = 10;
DECLARE @Param2 INT = 20;
DECLARE @ResultValue INT;

-- ACT
EXECUTE @ResultValue = dbo.MainProcedure
	@Param1 = @Param1,
	@Param2 = @Param2;

I got the results I was expecting:

Running MainProcedure...
Calling SubProcedure...
Running SubProcedure...
Done Running SubProcedure...
Done Calling SubProcedure...
Done Running MainProcedure...

However, when running all the test phases together (by copy/pasting from the Unit Test Editor), these were the results:

Running MainProcedure...
Calling SubProcedure...
Mock SubProcedure Called With { @Param1 = 10, @Param2 = 20 }
Done Calling SubProcedure...
Done Running MainProcedure...

It was working! This unclean cheating was actually working!
Running the Test itself in Visual Studio also proved to work, with the Scalar Value Condition being satisfied too.

Test Cleanup

Now there was just one issue to resolve. It is a fact that the code in the Post-Test phase will only run if your test does not throw errors or otherwise fail execution. I tested this to be true by changing the code of MainProcedure to THROW a fake error:

CREATE PROCEDURE [dbo].[MainProcedure]
	@Param1 INT,
	@Param2 INT
AS

PRINT 'Running MainProcedure...';

PRINT 'Calling SubProcedure...';

DECLARE @ReturnValue INT;
EXECUTE @ReturnValue = dbo.SubProcedure
	@Param1 = @Param1,
	@Param2 = @Param2;

THROW 50000, 'Something went wrong...', 1;

PRINT 'Done Calling SubProcedure...';

PRINT 'Done Running MainProcedure...';

RETURN @ReturnValue;

Running the Unit Test now left me with a left-over SubProcedure_Original object in the database, which also caused subsequent test runs to fail.
However, I didn’t want to move the Pre-Test and Post-Test code into the Test block, as these phases are meant to run under different user credentials.
Fortunately, this is exactly what the Test Cleanup phase is for.
This phase will always run after all the Unit Tests in a given file, regardless of whether they fail or not. That’s the point of it: to clean up the mess!

So I added the code below to the Test Cleanup phase:

SET XACT_ABORT ON
BEGIN TRANSACTION

BEGIN TRY

	-- check if the original object exists as a backup
	IF OBJECT_ID('SubProcedure_Original') IS NOT NULL
	BEGIN

		-- check if the current object is present and drop it
		IF OBJECT_ID('SubProcedure') IS NOT NULL
		BEGIN
			EXECUTE sp_executesql N'DROP PROCEDURE SubProcedure';
		END

		-- swap in the original object
		EXECUTE sp_rename 'SubProcedure_Original', 'SubProcedure';

	END

END TRY
BEGIN CATCH

	ROLLBACK;
	THROW;

END CATCH;

COMMIT;

And voilà! Everything now ran perfectly!

Thoughts

SSDT Unit Testing is getting there. It doesn’t yet go head-to-head with the full functionality of tSQLt, like its dependency isolation capabilities (which, technically, are glorified cheating mechanisms), but it’s definitely getting there, and I hope some easier mocking abilities will get added soon.
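
For comparison, the equivalent isolation in tSQLt is a one-liner in the arrange step of a test. This is a rough sketch, assuming tSQLt is installed and that SpyProcedure behaves as documented, with @CommandToExecute able to reference the original parameters:

-- replace dbo.SubProcedure with a spy that just returns the sum of its parameters
EXECUTE tSQLt.SpyProcedure
	@ProcedureName = 'dbo.SubProcedure',
	@CommandToExecute = 'RETURN @Param1 + @Param2;';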

However, it looks likely that we’ll never see true IoC development in T-SQL unless some major restructuring is done to the language itself. And why break something that works? That doesn’t mean we can’t get stuff done though.