Use Cases for MariaDB Data Versioning

Working in software development, versioning of code is something that we’ve often taken for granted. Task definitions and bug descriptions are preferably also managed by a system that versions every change. On top of this, we use a lot of documents for designing, documenting and managing our development cycles. For example, some of the tools we use are Jira, Google Docs and our own Knowledge Base to accomplish these things, which all provide versioning support. In MariaDB Server 10.3, we’ve introduced an elegant and easy way for data versioning, called System-Versioned Tables. Let’s look at what it can be used for.

A Look Back at Data for GDPR and PCI DSS Compliance

The General Data Protection Regulation (GDPR) is now enforced by the European Union (EU). All companies collecting user data in the EU have to comply with the GDPR rules. In addition to the daily questions and statements coming over email asking you to agree to new terms because of GDPR, the companies also have to store personal and private data in a way that fulfills the criteria of GDPR.

Card payments also have their own rules. There are standards like the Payment Card Industry Data Security Standard (PCI DSS), which are followed by banks and other online businesses. 1) What happened when, 2) by whom and 3) what did the data look like before and after? In MariaDB Server, the MariaDB Audit plugin is there for dealing with 1) and 2). It can also be used for 3) by looking in the audit logs on changes made, but it doesn’t give you the full data for how it looked before and after. With the newly released System-Versioned Tables this is possible. Let’s say that payment card information is stored in a database table. By turning on versioning for that table, all changes will create a new version of the row(s) affected by the change. The rows will also be time stamped, which means that you can query the row to see what it looked like before.

Handling Personal Data in Registries

When you think about versioning, one thing that comes to mind are registries of all sorts, which is the domain of GDPR when it comes to handling personal data. There are many types of personal data and one important to all of us is healthcare data, for example the patient registries of hospitals. In these registries versioning is of great importance to keep track of patients’ health history and related information such as medication. Other personal data registers are civil registers, tax registers, school and student registers and employee registers. The list is endless.

Rapidly Changing Data Sets

All of the above mentioned examples with data versioning applied in one way or another can be seen as slowly changing data. What I mean is that, although the systems can be huge and the total amount of transactions happening enormous, each piece of data doesn’t change that often. For example, my information in the civil register doesn’t change every second. But what if we have rapidly changing data such as the share rates at a stock exchange or tracking vehicle data for a shipping company. In these cases, we can make use of MariaDB’s data versioning.

Creating applications or software for the above purposes and having a database that provides data versioning out-of-the-box will lead to easier design, less customization and a more secure solutions.

Step-by-step Example

I’ll end with a GDPR example. I have a newsletter with subscribers and want to make sure that I always know when and what has happened to the subscriber data. I create a table in the database for the purpose and turn on versioning for the table.

CREATE TABLE Subscriber (
SubscriberId int(11) NOT NULL AUTO_INCREMENT,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL,
Newsletter bit NOT NULL,
PRIMARY KEY (SubscriberId)
) ENGINE=InnoDB WITH SYSTEM VERSIONING;

I insert myself as subscriber.

INSERT INTO Subscriber (FirstName, LastName, Newsletter) VALUES ('Rasmus', 'Johansson', 1);

I then try to add a column to the table.

ALTER TABLE Subscriber ADD COLUMN Gender char(1) NULL;
ERROR 4119 (HY000): Not allowed for system-versioned `Company`.`Subscriber`. Change @@system_versioning_alter_history to proceed with ALTER.

It results in the above error, because changing a versioned table is not permitted by default. I turn on the possibility to change the table and then the ALTER succeeds.

SET @@system_versioning_alter_history = 1;
ALTER TABLE Subscriber ADD COLUMN Gender char(1) NULL;
Query OK, 1 row affected (0.17 sec)

I also want a constraint on the new column.

ALTER TABLE Subscriber ADD CONSTRAINT con_gender CHECK (Gender in ('f','m'));

Then I do a couple of updates in the table.

UPDATE Subscriber SET Newsletter = 0 WHERE SubscriberId = 1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0

UPDATE Subscriber SET Gender = 'm' WHERE SubscriberId = 1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0

Finally, I delete the row in the table:

DELETE FROM Subscriber WHERE SubscriberId = 1;

If we ask for the rows in the table, including the old versions we get the following.

SELECT *, ROW_START, ROW_END FROM Subscriber FOR SYSTEM_TIME ALL;
+--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+
| SubscriberId | FirstName | LastName | Newsletter | Gender | ROW_START | ROW_END |
+--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | # | NULL | 2018-06-08 10:57:36.982721 | 2018-06-08 11:14:07.654996 |
| 1 | Rasmus | Johansson | | NULL | 2018-06-08 11:14:07.654996 | 2018-06-08 11:15:05.971761 |
| 1 | Rasmus | Johansson | | NULL | 2018-06-08 11:15:05.971761 | 2018-06-08 11:15:28.459109 |
| 1 | Rasmus | Johansson | | m | 2018-06-08 11:15:28.459109 | 2038-01-19 03:14:07.999999 |
+--------------+-----------+-----------+------------+--------+----------------------------+----------------------------+

 

Even though I deleted the row, I get four old versions of the row. All this was handled by the database. The only thing I had to do was to turn on versioning for the table. What will you do with MariaDB’s System Versioned-Tables? We’d love to hear from you!