DTS Packages Deployment Issues

Tech Stuff

1/24/2005

DTS Packages Deployment Issues




SQL SERVER 2000 DATA TRANSFORMATION SERVICES DTS
1.     CONNECTION PROPERTIES
2.     DATA PUMP TASKS
3.     EXECUTE SQL TASK
4.     DYNAMIC PROPERTIES TASK
5.     SCOPE FOR MINIMIZING THE CHANGES
6.     ONE PROBLEM STILL REMAINS ...
7.     SUMMARY


SQL Server 2000 Data Transformation Services DTS

The DTS Package has severe limitations with respect to moving the packages to another system. The changes that need to be carried out for successful deployment are enormous. It encompasses almost all the components of the package.
The typical changes required for the purpose are discussed below. These are categorized according to the nature of changes as well as the components.

1. Connection Properties

The connections used in the package either point to the SQL server database (Microsoft OLE DB Provider for SQL Server, Microsoft Access etc) or a file path (Text File, Microsoft Excel etc). These connections cause dependencies vis-à-vis server name, database or file path. So a classic case will require editing the connection properties according to the production environment. These may include editing almost all the connection properties of a connection, which in turn should be carried out on all the connections available in each package.

Click to Enlarge: Connection Properties

Fig. 1.1



Back to Top



2. Data pump Tasks:

These are the most vital components of a DTS Package. To add to the deployment woes, these are the tasks that are affected the most. It is a must to refresh all the parts of the task. This will typically involve opening all the tasks and changing each database object to point to the new database/server. It will mean almost total rework for the packages which have simple column-mapping rules/logic and data transformations.

Click to Enlarge: Data Pump Task   Click to Enlarge: Data Pump Task   Click to Enlarge: Data Pump Task

Fig. 2.1

Fig. 2.2

Fig. 2.3



Back to Top



3. Execute SQL Task:

The execute SQL Task only requires the connection name to perform the task. For this reason, you may not need to change anything in the task if the connection properties are updated without any change in the connection names. Exceptions may arise if the SQL query contains any hard coded parameter which has changed during deployment. e.g. If your SQL query contains hard coded server name, it has to be changed to the new server name in the production environment. Similarly for file name or file path.

Click to Enlarge: Execute SQL Task

Fig. 3.1



Back to Top



4. Dynamic Properties Task:

Though these are the developers’ real assets for minimizing deployment efforts, they themselves have critical dependency on the development environment. The dynamic properties task feed data to the package at run time thus minimizing the environment dependencies. But the data source they usually use are pretty much dependent on the system environment. Hence they need changes with respect to the location of the data source which usually is an INI or a udl file. At least they can boast of reducing the count of required changes to almost one or two.

Click to enlarge: Dynamic Properties Task   Click to enlarge: Dynamic Properties Task   Click to enlarge: Dynamic Properties Task

Fig. 4.1

Fig. 4.2

Fig. 4.3



Back to Top



5. Scope for minimizing the changes:

The efforts towards minimizing the transition constraints should basically be concentrated towards using the dynamic properties task wherever possible or feasible. To begin with, the dynamic properties task can help eliminate the constraints discussed in the point 1 i.e. Connection properties.
Let us see how this is achieved.

Create an INI file with connection properties details such as server name, database, user id and password etc. Now we will force the package to read the connection properties from this file. So whenever there is a change in the environment, all we need to do is to edit the INI file key values accordingly. This will ensure that all the package connections read the correct connection properties at run time.
That is, we are going to avoid the tedious job of editing connection properties for each package connections.
The dynamic properties task should be executed as the first step inside the package. This will ensure that all the properties are set correctly before attempting any task.

Back to Top



6. One Problem Still Remains ...

A data pump task is the one which needs addressing various complexities.
Even after all connection properties are updated the task will still be trying to locate the last executed source and destination objects. So if the new server does not have the same database name as used in development environment the package will fail to execute. This dependency is true for both source and destination objects.
This can be addressed again using the INI file. You have to ensure that the dynamic properties task also assigns source and destination object values for the data pump task correctly. For this purpose you can pass on the object names with the format: [DataBaseName].dbo.[TableName]. The data base name is already there in the iNI file and usually the table name is the same in any environment you are shifting your packages to. If it is not the case, you can read the new table name again from the INI file and construct the source and destination objects accordingly. This leaves you with only one limitation of the data pump task namely, transformations. These are very sensitive to the column names used in the development environment. If there is any discrepancy in the column names used in the mapping, the package will seize to work.

But this is one thing that is highly debatable.

The core of a DTS Package is the data transformation. If you are not going to retain this while moving between environment, it is as good as creating a new package.
If we remove the dependencies on the column mappings or the column mapping itself, we have just ended up creating a package template i.e. a skeleton of the package. Our intentions were just not that!

If you agree with this, we have come to the bugging superstar of DTS Package deployment. Yes it is no other than the dynamic properties task that has given us the enormous flexibility of assigning almost all properties at run time thus making our package environment independent. But the last and hence the most crucial dependency is created by this task itself. The values the package is going to read are stored in an INI file that is available at a given file path. This value is still hard coded and will fail the package if the new production system does not have exactly the same path as in development environment.


You can argue that it is the most trivial limitation and we can happily edit the INI file location details in exchange for freedom from editing almost every component of the package.
But hold on, the way we are using INI files, it is going to be present in almost all the packages we have built. Again editing the INI file location involves editing inside the package, a tendency we aim to avoid.
This problem is solved by using a VB application for installing the packages to the server. During installation you can specify the ini file that should be used to execute the dynamic properties task. This file path will be saved in the package for any further processing.

Back to Top



7. Summary

To summarise, we have eliminated the dependency on an environment in the above four steps.

Firstly, we moved connection critical values to the INI file which was used by the dynamic properties task to assign them to the available connections. This helped us eliminate the need to change the properties inside the packages. Also this reduced the effort required for this purpose to almost nil. Using this all the new connection properties will be made available in the INI file. That is, the new values need to be edited inside the INI file only. All the connections that are available with each package will automatically read new values from the INI. All this, in place of editing each package connection property individually! On an average, you can expect 2-3 connections in 1 package. If the whole ETL has, say, 10 packages, you will need to edit one set of connection properties as compared to the typical 20-25 connection sets. That too from outside the package! No one will be stumped!

In the second part, we shifted the dependencies again to the ini file. This step may require a few string manipulations to arrive at the required property values. Typically, appending the table name to the database name in the ini file to arrive at the source and destination object names the task should look for. Similarly for any other property except for the column mappings.
As discussed earlier, since the mappings and other transformation tasks form the core of the dts package and must be retained to preserve business or process flow logic. This will lead to only one, but most reasonable, constraint of keeping the structure of the involved tables or objects the same as the development database. This strictly means persisting with the same requirements (structure) for which the package was developed. You can have the liberty of adding extra columns to the table but should not alter the existing columns.
** We can still look at eliminating this constraint some time later, but this elimination would effectively mean developing an ETL template rather than a complete working package.

In a nutshell, you now have a DTS Package which will work without any modification in any environment. The only limitation being the existence of the same table structure for which it was developed.

In the third step, we were relieved to find the unlikelihood of any constraint as such. As the only thing that this task is looking for is the connection property name and not the server/database it is pointing to. So better try to avoid any unforced blunders like hard coding any thing in the SQL Query. A typical example can be one where you are generating a source file name based on the SQL Query. You may be tempted to hard code the file path at the beginning of the result parameter. Though it sounds easy, it will need an updation every time you change the server name or the location of the file. This must be avoided at all cost. Or else the whole exercise will appear futile if you are forced to edit SQL query from inside the package.

Lastly we saw the constraints dictated by the dynamic properties task. After this much success, we found that we still have to edit the INI file location based on the system path it is stored.
This problem cannot be solved using an INI again (for a change!).
One way out is to force the user to store the ini file at a specified location, without fail.
There’s no arguing the virtue of this way of enforcing static file paths. But we do have a more programmatic solution to this.

During installation of the packages the user will have to specify the path at which he is going to supply the ini file. This information will be saved with the packages on the server. Only if the user doesn’t specify the new path will the default path be saved.

Back to Top




Miles To Go