Wednesday, October 1, 2008

Trick Questions

You receive some data from UI and need to populate table Customer if the name received from UI does not exist in database. Which of options below should be used for this purpose:

1.

IF NOT EXISTS ( SELECT * FROM Customer
WHERE Name = @CustomerName )
INSERT INTO [Customer]
( [Name])
VALUES
( @CustomerName)

2.

INSERT INTO [Customer]
( [Name])
SELECT @CustomerName
WHERE NOT EXISTS ( SELECT * FROM Customer
WHERE Name = @CustomerName )



Correct answer:
Option 2 is the one to use
Explanation:
Option 1 has 2 separate statements. The first one applies locks relevant table resources in order to perform check for existence. As soon as it's completed SQL Server releases all applied locks. At the moment parallel SPID may insert a record into the table BEFORE following INSERT statement has applied its locks.

Therefore initial check may be irrelevant at the moment when following INSERT starts. IF there is a unique constraint on Customer.Name INSERT would fail despite you've done the check for existence.

The second option does everything in a single transaction. It applies the locks and holds it until INSERT transaction has finished its job. Another SPID cannot insert another row until all locks on the object applied by INSERT statement are released.

So, the first option is unacceptable, it relies on a user's luck not to have another user doing same thing at the same time. Of course, probability of the failure is quite low but it's a possible event. In active transactional systems with hundreds of transactions per second it does not look so improbable.

Script to run the test:

USE pubs
GO
CREATE TABLE Customer (
ID int IDENTITY(1,1) NOT NULL,
Name nvarchar(100) NOT NULL,
UNIQUE (Name)
)
GO
DECLARE @CustomerName nvarchar(100)
SET @CustomerName = 'Customer2'

IF NOT EXISTS ( SELECT * FROM Customer
WHERE Name = @CustomerName )
BEGIN
WAITFOR DELAY '00:00:05'
/* this 5 sec pause lets you insert same row from another QA window:
DECLARE @CustomerName nvarchar(100)
SET @CustomerName = 'Customer2'
INSERT INTO [Customer]
( [Name])
VALUES
( @CustomerName)
*/

INSERT INTO [Customer]
( [Name])
VALUES
( @CustomerName)

END

SET @CustomerName = 'Customer3'
INSERT INTO [Customer]
( [Name])
SELECT @CustomerName
WHERE NOT EXISTS ( SELECT * FROM Customer
WHERE Name = @CustomerName )

GO
select Object_Id('Customer')

DROP TABLE Customer




What will be the output?

Declare @var int
Select @var = isnull(@var,1) + Value1
From (Select 1 Value1 Union All Select 1 Union All Select 2) as a
Select @var

Answer: 5, the sql behave like v=v+1




What will be the count against each category? Each row in the result is shown as a pair in the answers.

CREATE TABLE #CATEGORY
(CATID INT
,VAL1 INT)

INSERT INTO #CATEGORY VALUES(1,NULL)
INSERT INTO #CATEGORY VALUES(2,1)
INSERT INTO #CATEGORY VALUES(3,2)

CREATE TABLE #DATA
(VAL1 INT)

INSERT INTO #DATA VALUES(1)
INSERT INTO #DATA VALUES(1)
INSERT INTO #DATA VALUES(1)
INSERT INTO #DATA VALUES(2)
INSERT INTO #DATA VALUES(2)
INSERT INTO #DATA VALUES(3)

SELECT C.CATID, COUNT(*)
FROM #DATA D
INNER JOIN #CATEGORY C ON C.VAL1 = D.VAL1 OR C.VAL1 IS NULL
GROUP BY C.CATID


Answer: 1 6, 2 3, 3 2




Simulating memory pressure

You have been asked to optimize a stored procedure that runs against a terabyte-sized database. The stored procedure executes several steps consecutively. The performance problems appear to be mainly I/O related.

You install a severely trimmed down test version of the database (1 GB in size) on your desktop computer running SQL Server Developer Edition. Before you start optimizing, you want to establish a baseline by timing the stored procedure on your development machine, so that you can later compare performance after adding indexes and tweaking code.

However, your desktop has 2 GB of memory installed, and you are concerned that the performance test results may be skewed because the test version of the database fits entirely in cache. What is the best way to simulate the production circumstances as closely as possible?



Correct answer:
Execute sp_configure and RECONFIGURE to limit server memory to 512 MB, then restart the SQL Server service.
Explanation:
Executing "EXEC sys.sp_configure N'max server memory (MB)', N'512'" followed by "RECONFIGURE WITH OVERRIDE" will limit the memory used by your server to 512 MB. Since SQL Server might have already allocated more memory, which will not be released immediately, a reboot will force this memory setting to be respected.

Creating a bigger test database will also work, but involves a lot more work, will cause your tests to run longer, and might require you to clean up your collection of holiday pictures. Removing some RAM from your computer works as well, but will cause all other processes in your computer to slow down as well. Not recommended (especially if you are as clumsy in handling electronic equipment as I am)

Executing DBCC FREEPROCCACHE before starting the tests will cause the FIRST query to run with no data in cache, but the remaining steps of the stored procedure will once more be able to read data from cache. This is not a good simulation of the production circumstances. Starting memory hungry applcations might indeed cause available memory for SQL Server to be reduced. But it will also cause extra strain for the CPU, and you might get SQL Server and the other applications "fighting" for memory or Windows starting to page some applications in and out of memory. Again, not a good simulation of the production circumstances.