I’m kicking off a new series of blog posts discussing the topic of parent-child architectures in SQL Server Integration Services. The links to the posts in this series are below.
- Parent-Child SSIS Architecture
- SSIS Parent-Child Architecture in Package Deployment Mode
- SSIS Parent-Child Architecture in Catalog Deployment Mode
I still remember the first SSIS package I ever deployed to a production environment. It was the mid-2000s, and I was working on a large data migration project which would take my then-employer, an acute care hospital, from an old UNIX-based system into a modern SQL Server-based OLTP back end. The entire solution, which pushed around a few hundred million rows of data, was completely contained in a single SSIS package. And this thing was HUGE. When I say huge, I mean that the package metadata alone was 5mb in size. I had a bunch of duplicate code in there, and when I opened or modified the package, it took sometimes a minute or more to go through the validation for the dozens of different tasks and data flows. In hindsight, I can admit that it was not a well-designed architecture.
Fast forward about a decade. Having learned some lessons – the hard way – about ETL architecture, I’ve relied on a completely different way of thinking. Rather than relying on a few, do-everything SSIS packages, I prefer to break out tasks into smaller units of work. In using more packages that each do just one thing, I’ve discovered that this architecture is:
- Easier to understand
- Simpler to debug
- Much easier to distribute tasks to multiple developers
- In some cases, better performing
As part of my role as an independent consultant, I also do quite a bit of training, and in those training sessions the topic of parent-child ETL architecture comes up quite often. How many packages should I have? Should we have lots of small SSIS packages, or fewer, larger packages? This is also a topic on which I find a lot of questions on SQL Server discussion forums as well.
To share my experience on this topic, I’m starting a new series of post discussing parent-child architectures in SSIS. As part of this series, I will cover:
- Overview of parent-child architecture in SSIS
- Parent-child architecture in package deployment mode
- Parent-child architecture in catalog deployments
I’m looking forward to writing this series over the next few months. As always I look forward to your feedback.
Look forward to reading this article TIm. I have similar experience of creating a HUGE SSIS package (in my case for calculating sales commission payments) that was a nightmare to maintain. I ending up re-writing the whole thing and breaking into smaller bite-sized packages. Each package was controlled by a parent control package that would send and receive parameters from its child packages. Coupled with the auditing that I also added, it proved far easier to work out what was happening at any point in time,
Mike, that’s a common evolution from what I’ve learned in talking to others. Instinct is to drop everything into just one package, but as one matures as an SSIS developer, the benefits of a multilayer architecture become more clear.
Tim, I’m interested in your series of articles and can’t wait till some of the later postings. I have already built a framework using the Parent-Child structure in 2008 R2. I’ll be interested in the differences with 2012 as we will be migrating to that soon.
One of the things I have not been able to figure out is how to run several packages in parallel while running them in a managed fashion from my framework.
Any ideas you have in that space will be gratefully received.
Hi Nick – that’s a great question. In previous posts, I’ve talked briefly about a work pile pattern, in which I create a list of packages to be executed and then programmatically retrieve packages from that list. I’ll touch briefly on this again later in the Parent-Child series.
Is this blog series still going? I’d love to read the rest!!
Hi Andrew,
Absolutely! My travel schedule got the better of me this summer, but I’m working on another post that discusses the project deployment mode as well. Hope to have it out soon!