I know that it is hard to find proper definition of SSIS (SQL Server Integration Services), but let’s simply say that it’s the tool that enables you to move data from point A to point B. Usually, point A is some kind of data source, be it SQL Server or other database, flat file, etc.). Point B refers to data destination (again, it can be table in the database, Excel or flat file, etc.).
On its way from source to destination, data can be (and most often is) transformed. For data transformation, you have whole bunch of different tasks at your disposal. Some of them are considered as blocking (such as Sort, Aggregate…) and you should be aware of that. Simply said, these components are executed in asynchronous way, which means that your data flow has to wait for them to complete before proceeding further.
Using Sort component
When you are dealing with Merge or Merge Join components, data that comes from data source(s) must be sorted. It’s easy to pull the trigger and drag and drop Sort task to your data flow. But…You should be aware that Sort task is fully blocking, and it requires that all other tasks in your package wait for this one to complete. It’s not a big issue when you are dealing with few thousands of rows, but consider moving millions of rows! Your data flow and your whole package consequentially, could be deadlocked for hours.
What should I do?!
Solution is pretty straightforward: avoid Sort task! You should sort data on the source side, using ORDER BY clause. Of course, you must “inform” SSIS that your source data comes already sorted, and you are doing this in following way:
Right click on the source task in your data flow and select Show Advanced Editor.
Go to Input and Output Properties tab.
Click on OLE DB Source Output and make sure that IsSorted property is set to True.
Finally, you need to “tell” SSIS about your sorted columns (which columns are sorted and in which sequence).
That’s it! This should speed up your data movement from point A to point B, especially if you are dealing with significant volume of records.
Last Updated on January 10, 2020 by Nikola