Wednesday, February 6, 2008

SQL Server performance tips

Does your SQL statement have a WHERE clause?
I know this sounds obvious, but don't retrieve more data than you need. However, less obvious is that even if your SELECT statement retrieves the same quantity of data without a WHERE clause, it may run faster with one.

Is SELECT DISTINCT being used properly?
Again, pretty obvious, but using SELECT DISTINCT where no duplicate records are being returned is an unnecessary performance hit. If you are getting duplicate records, first double check your table joins as this is often the cause and only use the DISTINCT clause if you really need it.

Are you using UNION instead of UNION ALL?
A UNION statement effectively does a SELECT DISTINCT on the results set. If you know that all the records returned are unique from your union, use UNION ALL instead, it is much quicker.

Are your stored procedures prefixed with 'sp_'?
Any stored procedures prefixed with 'sp_' are first searched for in the Master database rather than the one it is created in. This will cause a delay in the stored procedure being executed.

Are all stored procedures referred to as dbo.sprocname?
When calling a stored procedure you should include the owner name in the call, i.e. use EXEC dbo.spMyStoredProc instead of EXEC spMyStoredProc.

Prefixing the stored procedure with the owner when executing it will stop SQL Server from placing a COMPILE lock on the procedure while it determines if all objects referenced in the code have the same owners as the objects in the current cached procedure plan.

Are you using temporary tables when you don't need to?
Although there is sometimes a benefit of using temporary tables, generally they are best eliminated from your stored procedure. Don't assume that retrieving data multiple times is always less efficient than getting the data once and storing it in temporary table as often it isn't. Consider using a sub-query or derived table instead of a temporary table (see examples below). If you are using a temporary table in lots of JOINS in you stored procedure and it contains loads of data, it might be beneficial to add an index to your temporary table as this may also improve performance.

An example of a derived table instead of a temporary table
SELECT COLUMN1, COLUMN2, COUNTOFCOL3
FROM A_TABLE A
INNER JOIN (SELECT COUNT(COLUMN3) AS COUNTOFCOL3, COLUMN2
FROM B_TABLE B
INNER JOIN C_TABLE C ON B.ID = C.ID) ON A.ID = B.ID

Are you using Cursors when you don't need to?
Cursors of any kind slow down SQL Server's performance. While in some cases they are unavoidable, often there are ways to remove them from your code.

Consider using any of these options instead of using a cursor as they are all faster:
•Derived tables
•Sub-queries
•CASE statements
•Multiple queries
•Temporary tables

Are your Transactions being kept as short as possible?
If you are use SQL transactions, try to keep them as short as possible. This will help db performance by reducing the number of locks. Remove anything that doesn't specifically need to be within the transaction like setting variables, select statements etc.

Is SET NO COUNT ON being used?
By default, every time a stored procedure is executed, a message is sent from the server to the client indicating the number of rows that were affected by the stored procedure. You can reduce network traffic between the server and the client if you don't need this feature by adding SET NO COUNT ON at the beginning of your stored procedure.

Are you using IN or NOT IN when you should be using EXISTS or NOT EXISTS?
If you are using IN or NOT IN in a WHERE clause that contains a sub-query you should re-write it to use either EXISTS, NOT EXISTS or perform a LEFT OUTER JOIN. This is because particularly the NOT IN statement offers really poor performance. The example below probably better explains what I mean:

e.g. This SQL statement:

SELECT A_TABLE.COLUMN1
FROM A_TABLE
WHERE A_TABLE.COLUMN2 NOT IN (SELECT A_TABLE2.COLUMN2
FROM A_TABLE2)

Could be re-written like this:
SELECT A_TABLE.COLUMN1
FROM A_TABLE
WHERE NOT EXISTS (SELECT A_TABLE2.COLUMN2
FROM A_TABLE2
WHERE A_TABLE.COLUMN2 = A_TABLE2.COLUMN2)

Do you have a function that acts directly on a column used in a WHERE clause?
If you apply a function to a column used in the WHERE clause of your SQL statement, it is unlikely that the SQL statement will be able to make use of any indexes applied to that column.

e.g.
SELECT A_TABLE.LASTNAME
FROM A_TABLE
WHERE SUBSTRING (FIRSTNAME,1,1) = 'm'

Could be re-written:
SELECT A_TABLE.LASTNAME
FROM A_TABLE
WHERE FIRSTNAME LIKE = 'm%'

Where you have a choice of using the IN or BETWEEN clauses
Use the BETWEEN clause as it is much more efficient

e.g. This SQL statement:

SELECT A_TABLE.NAME
FROM A_TABLE
WHERE A_TABLE.NUMBER IN (100, 101, 102, 103)

Should be re-written like this:
SELECT A_TABLE.NAME
FROM A_TABLE
WHERE A_TABLE.NUMBER BETWEEN 100 AND 103



SQL Server performance tuning can consume a considerable amount of time and effort. The following list is a quick guideline that you should keep in mind when designing and developing SQL Server database applications:

User Defined Functions (UDF)
Refrain from using user defined functions (UDF) in a select statement that may potentially return many records. UDFs are executed as many times as there are rows in a returned result. A query that returns 100,000 rows calls the UDF 100,000 times.

SQL Server table indexes
Create SQL statements that utilize defined table indexes. Using indexes minimizes the amount of table scan which in most cases will be much slower than an index scan.

Multiple disks
The single best performance increase on a SQL Server computer comes from spreading I/O among multiple drives. Adding memory is a close second. Having many smaller drives is better than having one large drive for SQL Server machines. Even though the seek time is faster in larger drives, you will still get a tremendous performance improvement by spreading files, tables, and logs among more than one drive.

Disk controllers
Different disk controllers and drivers use different amounts of CPU time to perform disk I/O. Efficient controllers and drivers use less time, leaving more processing time available for user applications and increasing overall throughput.

SQL Server foreign keys
Ensure that all your tables are linked with foreign keys. foreign keys enhance the performance of queries with joins. Database tables inside each application are naturally related. Islands of tables are rarely needed if your application's business logic is well defined.

SQL Server primary keys
Ensure that every table has a primary key. if you can't find a natural set of columns to serve as a primary key, create a new column and make it a primary key on the table.

Processor (CPU)
When you examine processor usage, consider the type of work the instance of SQL Server is performing. If SQL Server is performing a lot of calculations, such as queries involving aggregates or memory-bound queries that require no disk I/O, 100 percent of the processor's time can be used. If this causes the performance of other applications to suffer, try changing the workload of the queries with aggregates.

Are you doing excessive string concatenation in your stored procedure?
Where possible, avoid doing loads of string concatenation as it is not a fast process in SQL Server.

Have you checked the order of WHERE clauses when using AND?
If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them from left to right in the order they are written (assuming that no parenthesis have been used to change the order of execution). You may want to consider one of the following when using AND:
•Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.
•If both parts of an AND expression are equally likely being false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.

Have you checked that you are using the most efficient operators?
Often you don't have much of a choice of which operator you use in your SQL statement. However, sometimes there is an alternative way to re-write your SQL statement to use a more efficient operator. Below is a list of operators in their order of performance (with the most efficient first).
•=
•>, >=, <, <=
•LIKE
•<>