MySQL Needs to Validate Replicated Databases

by Charles Iliya Krempeaux, published on Wed Jun 7th, 2006

Since replication has become available in MySQL, it has become an essential tool in creating scalable systems.

However, it does have one shortcoming (that may or may not be shared with other database servers). At no time that I am aware of does MySQL check if each copy of the database is the same.

What MySQL needs is a way of validating each copy of the database. What MySQL needs is a way to check that they are actually all the same, and have not gotten corrupted, damaged, or out of sync somehow.

What is MySQL

MySQL is an extemely popular Free and Open Source relations database.

Much of the early rapid development of the Web can be (partially) accredited to the existance of MySQL. (Although if MySQL had not have existed, then something else, such as PostgreSQL, would have likely took its place.)

Before MySQL the monetary cost of relational databases were out of the range of most people. This hindered the learning of and development with relational databases. MySQL made relational databases accessible to everyone. No longer did high cost stop people from learning or using relational databases.

MySQL's history is closely tied to that of Linux, PHP, and Apache.

Today, MySQL is an enterprise grade database server. Powering much of the infrastructure of today information-based world.

Replication

Replication is a technique used in creating a distributed database. A distributed database is essentially one in which there are mutiple copies of the same database.

People often want multiple copies of the same database for scalability reasons that come from needs for high performance and high availability.

Today, with MySQL database engines such as MyISAM and InnoDB, replication is accomplished by setting up one copy of the database as the master. This master is then the authoritative copy of the database. The original.

All other copies are then called slaves and directly or indirectly get the database from this single master.

One of the ramifications of the nature of this technique is that all mutable SQL commands MUST be issued to the master.

And under no circumstance SHOULD a mutable SQL command be issued to a slave.

Of course, immutable SQL commands CAN and SHOULD be issued to slaves too. (Immutable SQL commands include SELECTs, DESCRIBEs, SHOWs, etc.) This is one of the ways high performance needs are met with a distributed database such as the one I have been describing.

Need for Validation

One of the problems with the current replication technology in MySQL is that all it really does is pass on the mutable SQL commands from the master to the slaves. It passes on INSERTs, UPDATEs, CREATE TABLEs, and other mutable SQL commands.

The problem with this is at no time does it check that what is on the slaves is in sync with the master. This technique for repication implicitly assumes that none of the slaves will suffer from corruption or damage and get out of sync. This implicitly assumes that nothing will ever go wrong.

This is a costly and naive assumption. In practice things do go wrong.

All too often I've seen MySQL replication slaves get out of sync.

Sometimes these database problems seem to be caused by hardware problems. Sometimes these database problems seem to be caused by scalability problems. And something these database problems seem to be caused by a developer accidentally connecting to a slave (instead of the master) and issuing mutable SQL commands.

How to Validate

Really, validation shouldn't be that difficult to implement.

Perhaps, on the master and on each slave, a message digest or cryptographic fingerprint could be calculated. Perhaps it could be calculated on each table. (Or on some other quantum.) Possibly using the MD5 or SHA-1 algorithms. Or a similar algorithm that hasn't been found to have critical security flaws. (Or an algorithm that supports rolling caculations.) This message digest or cryptographic fingerprint could then be checked. If a slave has a message digest or cryptographic fingerprint that does not match the master, then it is out of sync.

The database server could then try to fix the problem somehow. Although it could be configured not to attempt such a fix, and instead just alert the appropriate people (like the development team) using some method (like e-mail, VoIP, or some other messaging system).

If this were to get implemented into MySQL, I'm sure many many MySQL users would be grateful.

--


Comments

No known comments. (There may be some out there though.)


New Comments

Want to write a comment to this post on your own blog? Then use the HTML code below to link to this article....

Or better yet, use the quote-o-matic below by "selecting" the part of the text (in the article) that you want to quote, and then use the HTML code that will get generated below to link to this article....


Preview: