So you’ve got some packages that regularly extract data to one or more text files, but you know that from time to time some of the queries will not return any data. However, you find in SSIS that, in a flat file export package, the output file is created regardless of whether any rows are written to the file, and in cases where there are no rows returned from the source, you’ll end up with empty data files in your output directory. Although not a critical failure, having empty output files can be a nuisance, and unfortunately, there aren’t any native settings in the Flat File Destination or the Flat File Connection Manager objects that will clean up empty files. Fortunately, SSIS does provide a combination of other tools that you can use to emulate this behavior.
In this post, I will demonstrate how to use native Integration Services tools for eliminating empty output files in SSIS.
Eliminating Empty Output Files in SSIS
In our example, I’m going to create a package to extract data from a table using a query for which I know that no rows would be returned. When I connect this data source to a flat file destination and execute the package, I’ll see an empty data file in my export directory. Next, to demonstrate the intended behavior, I’ll add a Row Count transformation to store the number of affected rows in a variable, and create a File System Task object to delete the output file. Finally, I’ll use an expression constraint to only delete the file if the variable attached to the row count is zero.
First, let’s create a test table to query:
Now, I’ll create a basic package that will export to text the invoices for the past 30 days, a reasonable business requirement. Since we don’t currently have any invoices matching that criteria, we’ll end up with an empty output file. The original package is shown below:
Now for the new-and-improved version, I’m going to drop a Row Count transformation into the data flow to save the number of affected rows to a variable:
Finally, I’ll create a File System Task and configure it to delete the output file. To ensure that a valid data file is not deleted, I’ll create a precedence constraint using an expression to only execute the delete if the row count variable is equal to 0:
When you execute this package, you’ll see that the File System Task object is executed because there are no rows matching our query. You can test the package by inserting another row into the database that will be returned by the query, and you’ll see that the data file is exported but not deleted.
Conclusion
In this brief post, I have demonstrated eliminating empty output files in SSIS can be a relatively easy operation. The code for this example can be downloaded here.
This helped me much when setting up process for mailing sql-based reports 🙂
Excellent! Glad it was useful for you.
Another instance of a ‘how to’ on this site that is exactly what I need. Tim Mitchell provides such valuable information, best SSIS resource there is…
Thanks Bill! I’m glad you found it useful.
Hi Tim, I’m interested to see the detail behind this but the doc you posted is resulting in a ‘page not found’ message. Can you point me in the right direction please?
Hi Jan – I corrected the bad link. You should be able to download now. Thanks!
The link to the zip file is no longer valid. Can you update or provide a new link?!?! Thank you!
Hi Chazz – I corrected the bad link. You should be able to download now. Thanks!
Hey, Tim.
I’m trying to do this with a file that has datetime appended to the end, e.g. “output 2016-09-14 09-10-49.txt”. The problem being that the expression used to generate that string is evaluated once when the file is created and again upon the file system task – meaning if there is any delay, then the files could have different timestamps: file created “output 2016-09-14 09-10-49.txt” then the file system task looks for “output 2016-09-14 09-10-51.txt” and it will not be found to be deleted. This is probably out of the scope of the post and has more to do with assigning variables at run-time, but figured I’d ask.
Hi Jared, I’ve run into the same issue before. The way I solve this problem is to capture the date/time stamp at the outset of the load (either with a script task or using a T-SQL statement) and write that value into an SSIS variable. That variable can then be used as the basis for creating the date/time string.
Great info! I tried to apply this trick but the problem in my case is that the data flow task is inside a foreach loop. This means that somehow the RowCount variable gets overwritten right until the last foreach input variable and only works with the last loop variable. I searched for how to use row count inside a foreach loop but there is very little information. Do you have an idea?
Roland, you could try changing the scope of the variable, or use an Expression Task (if you’re using SSIS 2012 or newer) to manually reset the row count for each iteration of the loop.