Introduction
High Availability (HA) means that the database is always available: 24 hours a day, 7 days a week. Whether we run planned maintenance, handle peak load, or serve a growing number of clients, the database must stay on. SQL high availability consists of specific technologies, custom solutions, and the processes built around them.
This may sound surprising: how can we take the system down for maintenance and still keep it running 24/7? There is no magic here. If one system needs to go down, some other system must be up and running. In practice, we keep multiple copies of the database online, so that when one copy goes down, another can take over the load.
However, things get much harder when we scale out. We would like to keep the data storage as close to the users as possible: a database in the US for US-based users, and the same for Europe or Australia. If we regionalize our storage like this, we may face issues when the same entity is modified on two continents, and we need to decide how to resolve such conflicts.
Let’s see how to achieve all of that.
Glossary
Before moving on, let us define some terms.
- Master node
Primary node that controls the communication and serves as the source of truth. Also called primary, source, sender, coordinator.
- Replica
Secondary node that follows the master but doesn’t decide on anything. Also called secondary, follower, slave, receiver.
- Failover
Process in which a new master node is selected.
- WAL (Write-Ahead Logging)
Technique in which we first log the changes to the log file, and then we apply them to the real data.
- Read-only replica
Node that can serve read-only queries but cannot modify the data.
- Synchronous operation
Operation that reports its status (success or failure) only after it completes at all nodes.
- Asynchronous operation
Operation that returns before all internal tasks are complete.
Techniques
HA can be guaranteed with multiple approaches, like:
- WAL replication
- Data storage share
- 2-Phase Commit (2PC)
- 3-Phase Commit (3PC)
We’ll cover some of them in this blog post.
WAL replication
The WAL replication technique is based on the idea that we can send the Write-Ahead Log from one server to another. The sender decides whether a given transaction should be executed, then the WAL is sent to the receiver, and the receiver applies logically the same changes to its own data storage.
There are multiple ways to replicate WAL. We can stream the logs over a TCP/IP connection. In this approach logs are sent as they appear, which minimizes the latency between applying changes on the primary and the secondary nodes. It requires a direct network connection between the nodes and may become unreliable if the connection breaks.
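For illustration, a minimal streaming setup in PostgreSQL could look roughly like this; the host name, user, and slot name are placeholders, and the exact steps vary between versions:

```sql
-- On the primary: allow WAL to be streamed to standbys
-- (these settings require a server restart).
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 10;

-- Optionally create a replication slot so WAL is retained
-- until the standby has consumed it.
SELECT pg_create_physical_replication_slot('standby_slot');

-- The standby is then typically initialized with pg_basebackup,
-- which copies the data and writes the replication settings:
--   pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/data -R
```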
Another way is to use the so-called log shipping. In this approach log files are sent from the primary to the secondary periodically over any suitable channel. This could be SCP, FTP, shared drive, cloud blob storage, etc. This introduces higher latency, but allows the replica to catch up over a longer period of time.
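In PostgreSQL, for example, log shipping is usually built on WAL archiving. A sketch of the idea, assuming a shared directory mounted on both servers at a placeholder path:

```sql
-- On the primary: copy each completed WAL segment to the shared location
-- (%p is the path of the segment, %f its file name).
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'cp %p /mnt/wal_archive/%f';

-- On the standby: replay segments from the same location.
ALTER SYSTEM SET restore_command = 'cp /mnt/wal_archive/%f %p';

-- Some of these settings only take effect after a server restart.
SELECT pg_reload_conf();
```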
Yet another approach is to mirror the storage holding the log files so that it stays consistent between the nodes. This allows us to delegate the log shipping to other infrastructure, so the database can simply “assume” the files are up to date. This can be achieved by mirroring or sharing network storage.
It’s possible to use all of these approaches at the same time. This way we can minimize the latency (thanks to log streaming) and provide reliability in case of failures (thanks to log shipping). If we can use only one of them, then log shipping generally provides higher reliability.
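If we do combine streaming with log shipping, in PostgreSQL this usually comes down to configuring both mechanisms on the standby; a sketch, using the same placeholder host and archive path as above:

```sql
-- On the standby: stream from the primary when possible, and fall back
-- to the WAL archive when streaming lags or the connection breaks.
ALTER SYSTEM SET primary_conninfo = 'host=primary.example.com user=replicator';
ALTER SYSTEM SET restore_command = 'cp /mnt/wal_archive/%f %p';
-- Depending on the version, a reload or restart is needed for these to take effect.
```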
Depending on the database engine we use, there may be many more flavors of WAL replication. They can use NAS, shared drives, network connections, middleware agents, etc. For example, see the PostgreSQL documentation.
By default the replication works in an asynchronous mode, which means that the transaction is committed by the primary server and only then are the logs sent to the replica. WAL replication can also be configured in a synchronous mode, in which the log is sent to the replicas and the transaction is confirmed only after the replicas receive the changes.
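In PostgreSQL, for instance, synchronous replication is controlled on the primary with settings along these lines; the standby name is a placeholder:

```sql
-- Wait for at least one of the named standbys to confirm it received
-- (and flushed) the WAL before reporting the commit as successful.
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (replica1)';
ALTER SYSTEM SET synchronous_commit = 'on';
SELECT pg_reload_conf();
```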
Data storage share
The data storage share approach works by sharing the same data storage between database instances. This can be as simple as physically connecting to the same storage (over a wire or a drive interface), or much more complex, involving a shared DNS name, IP address, operating system, and so on.
Not all databases support this approach. Some engines need additional configuration at the operating system or infrastructure level, for instance a Windows Server Failover Cluster.
This approach is highly configurable and allows for easy and fast failover. However, it may require that only one host is a master, and all other hosts are read-only replicas.
2-Phase Commit and 3-Phase Commit
2-Phase Commit (2PC) and 3-Phase Commit (3PC) are protocols for running distributed transactions. They are built on the idea that changes are committed atomically across all the databases involved. The general idea is:
- There is a coordinator that knows what changes to introduce
- The coordinator opens up a transaction on each of the nodes
- The coordinator sends the changes
- Each node prepares the changes and tells the coordinator whether the transaction can be committed or not
- The coordinator makes the final decision - either to commit the transaction or roll it back
This protocol ensures that the changes are applied atomically to all the databases. 3PC differs in how a catastrophic outage, such as a coordinator crash, is handled, which improves reliability.
However, both 2PC and 3PC introduce additional round trips over the wire that may reduce performance significantly. In exchange, they provide ACID guarantees across distributed databases.
2PC can be used in PostgreSQL with the PREPARE TRANSACTION statement. A minimal sketch of the typical flow, with an illustrative table and transaction identifier, is:
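```sql
-- Executed by the coordinator against each participating database;
-- the table and transaction identifier are illustrative.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Phase 1: persist the transaction in a prepared state.
-- It survives crashes but its changes are not visible yet.
PREPARE TRANSACTION 'transfer_42';

-- Phase 2: once every participant has prepared successfully,
-- the coordinator commits it everywhere...
COMMIT PREPARED 'transfer_42';

-- ...or, if any participant failed to prepare, rolls it back everywhere:
-- ROLLBACK PREPARED 'transfer_42';

-- Note: this requires max_prepared_transactions to be greater than zero.
```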
How can Metis help?
Metis provides an observability dashboard that tracks your database performance. It monitors your configuration, extensions, active queries, indexes, tables, daemon processes, and much more.
Metis can keep your databases always on and notify you when issues emerge. You can use it to validate your configuration and observe it over time to make sure that things work as expected.
Summary
Data storage is the most crucial component of our systems. We can’t let it go down; we need to make sure it works as expected and is always available. There are multiple ways to guarantee high availability, based on log shipping, data sharing, or even distributed transactions. Metis provides the ultimate monitoring solution to make sure your databases are always on. Sign up for free to test Metis now or book a demo with one of our reps for a walkthrough.