How to break the monolithic database

Monolithic applications are increasingly being replaced with smaller, more easily maintainable, components or services that communicate with each other to accomplish the same task in a more efficient manner.

As services are separated from each other, it becomes easier to identify distinct services that are highly utilized and scale those services independently of the whole application.

Storing data from different domains in a single database has an additional side effect that should not be underestimated. It creates dependencies between different services/domains on a DB schema level. A breaking change on the schema level – e.g. changing the data format of a column — will have a ripple effect through the whole system, ultimately requiring a synchronous change and production deployment of all dependent subsystems. Finding these subsystems, however, is almost impossible as the database often doesn’t provide a reliable way to track those dependencies. Again, this significantly impacts the new feature development as it makes broader and high quality refactoring impossible.

Data isolation within the existing database

There is actually a way to approach the split safely using only master-slave replication. The idea is to split/isolate the data within the originating database itself (Main DB). This can be achieved by restricting the MySQL database users.

For example, the Services “A” data had to be split out of Main DB. So a Services “A” DB user was created and this DB user was restricted to read/write only to the Services “A” data tables. Similarly, all other DB users (one by one) had their access to the Services “A” data tables denied.

This is a long process that includes extensive monitoring for system abnormalities, because when a DB user (e.g. Services “B” user or Services “C” user) has the access cut, then the service that is using this user might suffer from lack of privileges in case it directly or indirectly (though joins) uses Services “A” tables.

If abnormalities are detected the user should have their access immediately restored. The process, however, helps with identifying and fixing of rogue services which directly use the Services “A” data tables. Needless to say, the service that is going to have its user access cut from Services “A” tables should have it’s code base thoroughly examined prior to the exercise. In summary, the process will be:

  1. List all database users.
  2. Identify the user that with high confidence doesn’t use directly the Services “A” tables
  3. Cut the access to Services “A” tables to that user.
  4. Carefully monitor the system for abnormalities.
  5. If an abnormality is found immediately restore the DB user access, investigate and fix the reason and go to step 3.
  6. Successfully eliminated the access of the selected user to the Services “A” tables. Go to step 2. for the next user until all users except the
  7. Services “A” user has their access denied.

In the end, other users have their access to Services “A” tables cut, we can safely say that the Services “A” data is now private to the Services “A”. In other words, even though the data is still residing in the Main DB we have the process of extracting a microservice completed. The next logical step, of course, is to migrate this data to a new database i.e. reap the benefits of having a microservice by changing its data storage. This is now easy, the process is the following:

1, Create a Main DB read replica (this will be the new master database for microservices).
2. Wait for zero replica lag i.e. all data has been replicated.
3. Create read/write user in the replica exclusive for the new service
4. Reconfigure all DB reads to go to the new read replica.
5. Stop the new Services “A” i.e. all writes to the old DB.
6. Reconfigure all writes to go to the new read replica.
7. Wait for zero DB lag. Promote the read replica as a new master node.
8. Start the new service.

At a later stage, we can remove the irrelevant tables from the newly promoted database and same we can do to the old master database remove tables which we have migrated to the newly promoted database.

This is a simple two-step process, where first we restrict services on a set of tables after defining table ownership by using database privileges model and at a later stage, we just create a slave database for data migration.

This article was written by Khurram Naseem, a database architect at Careem, with contributions from Mohsin Kerai and Lamak Qaizar.