Temporal data storage, which was defined in the ANSI SQL:2011 standard, describes an architecture in which relational databases maintain row-level versioned copies of data. When using temporal data, the database does automatically what many DBAs and application developers have had to do manually for years to track point-in-time versioning for individual tables. Starting in 2016, Microsoft added SQL Server temporal tables to their data platform, which moved time-based row versioning into the database engine as a native feature.
In this post, I’ll share an introduction to SQL Server temporal tables.
Introduction to SQL Server Temporal Tables
Manually creating a time-based row versioning system is challenging. Designing such a solution that is clear to both technologists and data consumers is difficult, and until recently there has been no universally accepted design pattern for this. With the inclusion of temporal tables in SQL Server, those of us who create data solutions now have an easy-to-use native option for storing point-in-time data.
For those who have never had to store temporal data, think of this design as a time machine to allow you to interrogate the table as it existed at a previous time. Here’s an example: Let’s say you’re in the paper business and you have an extensive product catalog. The list of products you sell will always be a moving target; items will be added to and removed from that list constantly, with lots of updates along the way. For operational and reporting purposes, you’ll need to store the current product catalog as well as its historical changes. Your data solution must easily answer the question, “What did our list of products look like on a given date?” Temporal tables are an ideal solution for this.
How It Works
SQL Server temporal tables make this process fairly easy and mostly transparent. This feature uses two tables – one for live data, the other for historical versions – to store current and prior row versions. Temporal tables are created on a table-by-table basis, which means that you can choose only the tables you want to include in this versioning architecture without impacting conventional (non-versioned) tables.
When designing a table that will be used as a temporal table, three things are required:
- A primary key
- A non-nullable DATETIME2 column to store the starting effective date
- A non-nullable DATETIME2 column to store the ending effective date
When creating the temporal table, it is recommended to create the effective date columns as HIDDEN. This is a new feature specific to temporal tables which allows the table to be hidden from conventional use (such as SELECT *) but allow them to be directly accessible by name. I’ll demonstrate how to do this below.
Creating a Temporal Table
Temporal tables can be created within SQL Server Management Studio through the right-click menu. As shown below using SSMS 17.2 (the latest version as of this writing), there are new context menu extensions for creating temporal tables.
However, don’t expect this to look like the typical table designer UI. The process to create a new temporal table is done purely in T-SQL, and a table created as a temporal table cannot be edited using the table designer. The script generated from this series of clicks can be a bit confusing, so I’ve broken down the steps needed to create the temporal table below.
As shown above, the T-SQL statement to create this table is like that of any other table with a couple of exceptions. First, you’ll notice that the ValidFrom and ValidTo columns are created with the GENERATED ALWAYS AS ROW {START | END} HIDDEN syntax. This directive does three things:
- It sets these columns up as placeholders to store the start and end date of each row
- It marks the columns as read-only
- It suppresses these columns from SELECT * FROM <table name> queries
The PERIOD FOR SYSTEM TIME directive sets up the ValidFrom and ValidTo columns as the timekeepers for each row. Using this constraint, the ValidFrom and ValidTo columns will automatically be set whenever a DML operation takes place on that table.
In the above example, the table has been created with the necessary metadata, but it is not yet set up as a system-versioned table. I’ll do that now.
Set up System Versioning
Once the candidate table has been set up, you can enable system versioning on it as shown below.
The ALTER TABLE command above enables system versioning on the dbo.Customers table, specifying the name of the underlying history table as well. The name of the history table is optional, but I recommend defining your own table name to prevent SQL Server from creating an arbitrary name for you. Also worth noting is that you can specify an existing table as a history table (though I rarely do this – it’s just easier to set the name and let the ALTER TABLE statement create it).
Once you’ve run this ALTER TABLE, the table looks differently in SQL Server Management Studio. The table list UI clearly indicates that this is a system-versioned table.
Querying the Temporal Table
Writing to or deleting from the system-versioned temporal table is relatively straightforward. INSERT, UPDATE, and DELETE statements can be written just like any other DML operation, without any special syntax for the row versioning. Because SQL Server transparently manages the versioning, any queries to modify or retrieve data will be written against the original base table (in this case, [dbo].[Customers]). The only limitation is that you cannot update the system time columns (ValidFrom and ValidTo).
Querying data using SELECT statements is straightforward as well. To select the data that is current as of right now, a simple SELECT FROM query will suffice. When performing such a query, the table will show no evidence of being a system-versioned table, instead just showing the rows that are current as of now.
To retrieve the data that was current as of a particular time in the past requires a tiny bit more complexity. To select the table for a time in the past, use the syntax as shown below.
The syntax FOR SYSTEM_TIME AS OF {Datetime2} retrieves the rows that were valid as of that point in time. There are other options, including specifying a time range or a list of DATETIME2 values, but using FOR SYSTEM TIME AS OF {Datetime2} is the most common use of this data.
Note that the time used for row versioning is always stored in UTC. To query for a specific point in time based on your local time zone, you’ll have to do the offset calculation in code.
Using the query above, you can retrieve the state of the table as it existed at the specified date and time.
Final Thoughts
Both the base table and the history table (in this example, [dbo].[Customers] and [dbo].[CustomersHistory], respectively) are queryable. You can even access the SYSTEM_TIME columns directly by name even though they don’t appear when you query the base table using SELECT *. Remember that the history table is read-only, so you won’t be able to modify that data (by design!).
For situations when time-based row versioning is required, using SQL Server temporal tables can help avoid a lot of manual work. If you’re using SQL Server 2016 or newer, consider temporal tables as a good option for row versioning. In my next post on this topic, I’ll discuss the use of SQL Server temporal tables as slowly changing dimension tables in data warehouse architecture.
Conclusion
In this introduction to SQL Server temporal tables, I’ve described what temporal tables are, how to set them up, and how to retrieve data from them. For more information on temporal tables in SQL Server, visit my landing page on this topic.
Leave a Reply