A Better Way to Clean Up the SSIS Catalog Database

Earlier this week, I blogged about the automatic cleanup process that purges the SSIS catalog logging tables. This nightly process removes data for operations that are older than 365 days. While this is useful, many SSIS admins have complained that this process is very slow and causes contention on large or busy SSISDB databases.

In this post, I’ll show you one of the main reasons this purge process is slow, and share a more efficient way of performing this delete operation.

Investigating the delete operation

When the catalog purge takes place, it removes the old operational and execution data from most of the log tables, including:

  • internal.operations
  • internal.executions
  • internal.executable_statistics
  • internal.event_messages
  • internal.event_message_context

… and many others. Some of these tables can grow quite large, but since the process runs each night, the percent of data being deleted is relatively small. So why does it take so long to run?

To investigate more deeply, we’ll open up the stored procedure [internal].[cleanup_server_retention_window], which is invoked each night by the SSIS Server Maintenance Job in SQL Server Agent. Looking through this stored procedure, one would expect to see a score of delete operations, one for each of these tables. What you’ll find, however, is substantially simpler.

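Below is a minimal sketch of the centerpiece of the delete operation in the purge stored procedure. This is not Microsoft’s exact code (the real procedure derives its cutoff from the catalog’s retention setting and batches its work), but it captures the shape:

    -- Simplified sketch of the purge's core statement. The created_time
    -- column is part of the documented internal.operations schema.
    DECLARE @cutoff datetime = DATEADD(day, -365, SYSDATETIME());

    DELETE FROM [internal].[operations]
    WHERE created_time < @cutoff;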

Rather than twenty or more DELETE statements, we find just one, which deletes from the [internal].[operations] table.

Cascading deletes

As it turns out, deleting from the [internal].[operations] table handles all of the delete operations, for one reason: all of the foreign keys on these logging tables are built with the ON DELETE CASCADE setting. With this setting in place, deleting a row from the referenced (parent) table automatically deletes any rows in the referencing (child) table that point to it. In theory, this is a good idea: it helps to ensure referential integrity, and reduces the amount of code you have to write to delete data. However, cascading deletes can lead to lots of blocking, particularly when those deletes are chained together across multiple tables.
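To make the behavior concrete, here is a toy example (this is not SSISDB’s actual DDL):

    CREATE TABLE dbo.parent (
        parent_id int NOT NULL PRIMARY KEY
    );
    CREATE TABLE dbo.child (
        child_id  int NOT NULL PRIMARY KEY,
        parent_id int NOT NULL
            REFERENCES dbo.parent (parent_id) ON DELETE CASCADE
    );

    INSERT INTO dbo.parent VALUES (1);
    INSERT INTO dbo.child VALUES (10, 1), (11, 1);

    -- Deleting the parent row silently deletes both child rows as well.
    DELETE FROM dbo.parent WHERE parent_id = 1;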

One example of this chaining involves the [internal].[execution_parameter_values] table. To clear out this table, the SSIS purge process deletes from the [internal].[operations] table. That table is referenced by a foreign key on the [internal].[executions] table, so the same transaction is now deleting from two tables by way of the ON DELETE CASCADE foreign key. Finally, the [internal].[execution_parameter_values] table has a foreign key referencing [internal].[executions], which means that we now have three different tables involved in the delete of a single row on the parent table. While this may not present an issue on smaller, less busy SSISDB databases, it is not uncommon to encounter contention issues with the purge process on very busy SSIS catalog databases.
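If you want to see the full extent of this chaining for yourself, the sys.foreign_keys catalog view lists every cascading foreign key in SSISDB:

    -- Run in SSISDB: list every foreign key that cascades deletes.
    SELECT fk.name                              AS foreign_key_name,
           OBJECT_NAME(fk.parent_object_id)     AS child_table,
           OBJECT_NAME(fk.referenced_object_id) AS parent_table
    FROM sys.foreign_keys AS fk
    WHERE fk.delete_referential_action_desc = N'CASCADE'
    ORDER BY parent_table, child_table;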

A Better Way to Clean Up the SSIS Catalog Database

While I am a fan of writing less code when possible, I’ve learned that relying on the ON DELETE CASCADE setting of the foreign keys is not the most efficient way to handle purge operations in the SSIS catalog. To that end, I’ve created my own deletion script that touches each table individually to purge old data. The script first builds a working list of operation IDs in a temp table, which is then used to identify the rows to be deleted in each log table.

Using that list of IDs, the script then processes the essential logging tables from the most granular child tables up to the parent tables, an order dictated by the foreign keys between them. Deleting in this order is what allows each table to be purged explicitly, rather than relying on the ON DELETE CASCADE setting of the foreign keys.

What follows is a sample of how I perform the cleanup operation on the SSISDB catalog database. It handles each table individually, using a list of execution_id values built by querying for old executions based on the retention settings in the SSIS catalog. A few of the tables get special handling that deletes data in batches of one million rows, due to the amount of data that can accumulate in those tables.
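In condensed form, the approach looks like the sketch below. This is an abbreviated illustration rather than the full script: it covers only a representative subset of the logging tables, and it assumes SSISDB’s documented schema (the RETENTION_WINDOW property in catalog.catalog_properties, and execution_id values that line up with operations.operation_id, which is how the catalog’s foreign keys are defined).

    -- Condensed sketch of the table-by-table purge (run in the SSISDB
    -- database). The full script covers every logging table; only a
    -- representative subset is shown here.

    -- Derive the cutoff date from the catalog's own retention setting.
    DECLARE @retention_days int =
        (SELECT CAST(property_value AS int)
         FROM [catalog].[catalog_properties]
         WHERE property_name = N'RETENTION_WINDOW');

    DECLARE @cutoff datetime = DATEADD(day, -@retention_days, SYSDATETIME());

    -- Working list of expired execution IDs. In SSISDB, execution_id lines
    -- up with operations.operation_id by way of the catalog's foreign keys.
    SELECT e.execution_id
    INTO #executions
    FROM [internal].[executions] AS e
        INNER JOIN [internal].[operations] AS o
            ON o.operation_id = e.execution_id
    WHERE o.created_time < @cutoff;

    -- The largest logging tables are deleted in batches of 1 million rows.
    SET ROWCOUNT 1000000;

    WHILE 1 = 1
    BEGIN
        DELETE FROM [internal].[event_message_context]
        WHERE operation_id IN (SELECT execution_id FROM #executions);
        IF @@ROWCOUNT = 0 BREAK;
    END;

    WHILE 1 = 1
    BEGIN
        DELETE FROM [internal].[event_messages]
        WHERE operation_id IN (SELECT execution_id FROM #executions);
        IF @@ROWCOUNT = 0 BREAK;
    END;

    SET ROWCOUNT 0;  -- restore unlimited row counts

    -- Remaining child tables, most granular first...
    DELETE FROM [internal].[executable_statistics]
    WHERE execution_id IN (SELECT execution_id FROM #executions);

    DELETE FROM [internal].[execution_parameter_values]
    WHERE execution_id IN (SELECT execution_id FROM #executions);

    -- ...then the parent tables last, so nothing relies on cascades.
    DELETE FROM [internal].[executions]
    WHERE execution_id IN (SELECT execution_id FROM #executions);

    DELETE FROM [internal].[operations]
    WHERE operation_id IN (SELECT execution_id FROM #executions);

    DROP TABLE #executions;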

How to use this script

To use this script, I recommend that you create a separate user-defined stored procedure in the SSISDB database. Although you could simply replace the code in the internal.cleanup_server_retention_window stored procedure, it is a best practice to avoid modifying any of the existing DDL in the SSISDB database. It is much safer to create your own stored procedure for this, preferably in a new schema that you also create to build some logical and functional separation from the built-in SSIS catalog objects. You can then create your own SQL Server Agent job that executes this stored procedure.
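As a sketch of that setup (the schema and procedure names here are placeholders, not anything the catalog defines):

    USE SSISDB;
    GO
    -- A dedicated schema keeps custom objects clearly separated from the
    -- built-in catalog objects.
    CREATE SCHEMA [mnt] AUTHORIZATION [dbo];
    GO
    CREATE PROCEDURE [mnt].[cleanup_catalog_logging]
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Body: the table-by-table purge logic shown earlier.
    END;
    GO

A SQL Server Agent job with a single Transact-SQL step that runs EXEC [mnt].[cleanup_catalog_logging] against SSISDB completes the setup.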

Conclusion

While the built-in automated purge process in SSISDB is fine for average workloads, it can run into performance and contention issues on large or busy SSISDB databases. A more direct, table-by-table approach requires more code, but this custom process to clean up the SSIS catalog database performs better and provides more flexibility.

About the Author

Tim Mitchell
Tim Mitchell is a data architect and consultant who specializes in getting rid of data pain points. Need help with data warehousing, ETL, reporting, or training? If so, contact Tim for a no-obligation 30-minute chat.

10 Comments on "A Better Way to Clean Up the SSIS Catalog Database"

  1. Very cool. I have often fought with the SSISDB catalog cleanup process, but have never dug deep enough to work it out this way.

    One thing to note … I may be mistaken, but in the sections of your script where you want to delete in batches, it appears that you left out the TOP (#####) from the DELETE statement.

    Otherwise, well done.

    -JeffRo

  2. Ah.. I see. SET ROWCOUNT 1000000. My bad. 🙂

    Very elegant. 🙂

  3. thesmilingdba | January 3, 2019 at 3:38 pm

    Wow, that is faster than the other method. Thanks Tim!!!

  4. Nice script, thanks

    I have a system with 100 packages executing every 5 minutes; my SSISDB was 300 GB. After the delete, I had to DROP keys and certificates (like cleanup_server_retention_window does) to free up space.

    This action is slow: we had 33,000,000 rows in sys.symmetric_keys and sys.certificates, and one drop takes about 30 sec. It would take 31 years to clean it all up 🙂

    Adding this to the script helps prevent the issue:

    DECLARE @execution_id bigint = (SELECT MIN(execution_id) FROM #executions)
    DECLARE @execution_id_max bigint = (SELECT MAX(execution_id) FROM #executions)
    DECLARE @sqlString nvarchar(1024)
    DECLARE @key_name [internal].[adt_name]
    DECLARE @certificate_name [internal].[adt_name]
    DECLARE @message varchar(100)

    WHILE @execution_id <= @execution_id_max
    BEGIN
        -- Emit progress as we walk the execution ID range.
        SET @message = CAST(@execution_id AS varchar(100))
        RAISERROR(@message, 10, 1) WITH NOWAIT

        -- The catalog names each per-execution key and certificate by pattern.
        SET @key_name = 'MS_Enckey_Exec_' + CONVERT(varchar, @execution_id)
        SET @certificate_name = 'MS_Cert_Exec_' + CONVERT(varchar, @execution_id)

        SET @sqlString = 'IF EXISTS (SELECT name FROM sys.symmetric_keys WHERE name = ''' + @key_name + ''') '
            + 'DROP SYMMETRIC KEY ' + @key_name
        EXECUTE sp_executesql @sqlString

        SET @sqlString = 'IF EXISTS (SELECT name FROM sys.certificates WHERE name = ''' + @certificate_name + ''') '
            + 'DROP CERTIFICATE ' + @certificate_name
        EXECUTE sp_executesql @sqlString

        SET @execution_id += 1
    END

  5. Thanks Tim. Your script got us out of a hole with our SSISDB that the regular stored procedure could not fix (it was generating so many log entries that it was unable to complete).

    We’ll be adopting it as a normal overnight routine.

    Top work!

  6. David Rawheiser | June 4, 2021 at 4:31 pm

    Thank You, Thank You, Thank You. I did make a little change: since the SELECT INTO holds a schema lock, I explicitly create the table before inserting from catalog.executions. Which brings me to a question: how does that table get cleaned up? Nowhere is there a delete for catalog.executions. Are we relying on the cascading deletes to remove it?

    • David, that’s good feedback on explicitly creating that table. Regarding catalog.executions, this is actually a view, not a table. The script deletes data from the underlying table (internal.executions) directly.

  7. Marty Scherr | June 1, 2023 at 3:51 pm

    Hey Tim, partially based on your post here (which, by the way, is awesome), I came up with my own solution to handle pruning data from my SSISDB database.

    Here’s the git link to my solution: https://github.com/martyscherr/SSISDB_Cleanup. I have included a README.md file explaining it. The main reason I don’t use the built-in data pruning options that Microsoft provides is that I needed to control the row counts when deleting data from the SSISDB tables. I also wanted some logging of the process, so I used a TRY/CATCH exception block and explicit transactions to control it.

    As a disclaimer: this process is by no means 100% bug free, but I feel I made a decent effort to create a solution that has minimal impact on my very busy SSIS environment while pruning the SSISDB tables on a daily basis. I chose to keep only 3 days of data, based on the main table in the SSISDB data structure, SSISDB.internal.operations; I prune data older than 3 days based on the created_time column of that table.

    I have documented the referential integrity of the SSISDB tables, as that played a factor in the performance of the pruning process. When I analyzed the SSISDB table structures, I was reminded that SSISDB has foreign key constraints on many of the tables and uses the ON DELETE CASCADE clause on the foreign key statements. I found that if I didn’t drop the foreign keys before running my pruning process, it was not very efficient, so I chose to drop the foreign keys during the process and recreate them when it completes. I also have SQL scripts that validate referential integrity after the pruning process, to confirm it was maintained.

    If you have time, please check out my solution and send me any feedback. Thanks for your time.

  8. Lifesaver Tim – thanks so much for this script !
