Data Transformation Services Performance Tuning Tips

Tech Stuff

1/24/2005

SQL Server Data Transformation Services Performance Tuning Tips

Data Transformation Services Performance Tuning Tips


When using DTS, try to use the native OLE DB provider instead of the ODBC provider when importing and exporting data, it is faster.

*****


ActiveX scripts slow down data importation and transformation when using DTS packages, so if you can avoid ActiveX scripts when creating DTS packages, do so.

*****


If you choose to use ActiveX scripts in your DTS packages, choose from the various scripting options in this order, from fastest to slowest: VBScript, JScript and then PerlScript.

*****


To speed up your ActiveX scripts in DTS packages, refer to columns using their ordinal number instead of their name.

*****


The DTS data pump is slowed down when mapping a separate transformation function for each column. If possible, avoid mapping columns when moving data from one table to another.

*****


DTS Lookups slow down performance. Instead, try to use a Transact-SQL statement instead to perform the same function within your DTS package. In addition, avoid using global variables or COM objects for performing lookup type functions, as they are even slower than using a DTS lookup.

*****


To speed the performance of DTS packages, try to create your steps so that they can run in parallel, instead of sequentially. This allows the various steps to run simultaneously and complete their work faster. To ensure DTS steps run in parallel, don't configure them with any precedence constraints.

*****


If you will be transforming data within your DTS package, one of the options available to you is the "Fetch Buffer Size". This option is available from the "Advanced" tab of the "Data Transformation Properties" screen in SQL Server 7.0, or from the "Option" tab of the "Transform Data Task Properties" screen in SQL Server 2000. This option is used to specify the number of rows to fetch in a single operation from the specified OLE DB data source. The default value is 100.
Generally, you do not need to change this value, but in some cases it can be beneficial. For example, if the data you are retrieving from the data source is sequential, such as in primary key or clustered index order, then increasing this value can reduce disk I/O, helping to boost performance. While I/O is decreased, SQL Server will have to use more memory to hold the larger chunks of data. Assuming I/O is more of a bottleneck on your server than memory, this can be a beneficial tradeoff. Unfortunately, there is no easy way to identify what the ideal value should be for the "Fetch Buffer Size". You will have to test yourself to see if changing this value is beneficial to you.
If you run a job often, and the job is long because of the amount of data that has to be moved and transformed, then experimenting with this option may be worthwhile. But if the job runs rarely, or the amount of data involved is minimal, then leaving this option set to the default value is your best bet.

*****


Be default, whenever a DTS package is opened, DTS has to read the registry to see if there are any new OLE DB data sources or tasks. This process can be "relatively" time consuming. If you know that you are not adding or changing OLE DB data sources or tasks, then you can tell DTS not to read the registry each time. Instead, what happens is that the information is read once, then this information is cached and reused each time the DTS package is opened. To turn on this feature (which is turned off be default), right-click on the "Data Transaction Services" group and select "Properties." Then in the "Package Properties" windows, select the "Turn on cache" option. If you find that any OLE DB data sources or tasks have changes after you have turned caching on, you can click on the "Refresh Cache" option on this same window to update the cache.

*****


Assuming you don't need to transform data during a DTS import in SQL Server 2000, the Bulk Insert task provides the fastest data loads into SQL Server. Not only does the Bulk Insert task not allow any transformation, it also does not permit changes in column mappings, or to import data into any other database other than SQL Server, or to create an exception log file for bad records.

*****


By default, when you use the Transform Data task in SQL Server 2000 DTS, each column to be transformed from one data source to another is assigned its own COM object. Because of this, the transformation process can be slowed as the COM objects are opened and closed. A more efficient, faster method would be to use only one COM object, not multiple COM objects. Fortunately, this is relatively easy to do.
To modify the Transform Data task from multiple COM objects into one, follow these steps:

• From the "Transformations" tab of a Transform Data task object that has already been assigned data sources, click on the "Delete All" button to remove all of the current transformations. This is the step that deletes all of the COM objects.
• Now click the "New" button, and select "Copy Column" from the "Create New Transformation" dialog box, then click on "OK." This is the step that create a single, new COM object that can be used for your transformation.
You will find that performing these two simple steps can boost the performance of the transformation in your DTS package.

*****


When creating new transformations using the SQL Server 2000 DTS Transform Data task, you are given nine different ways in order to transform data. For best performance, assuming it will do the job you need, select the "Copy Column" option. If what you need to do cannot be done using "Copy Column", the next fastest option is "ActiveX script".

*****


The Data Pump Task is faster than a Data Driven Query within a DTS package if there is a one-to-one mapping of the columns and no transformations are involved when moving data between tables. But if there are transformations involved, then a Data Driven Query will offer better performance.


Miles To Go