High availability models

You can implement highly available systems in various ways. Each standard model determines how the system behaves when a component failure occurs.

The following definitions summarize the attributes of the four high availability models. The system recovery time is based on the optimum cluster configuration and varies with the products that are used. A minimal failover sketch for the warm standby model follows the definitions.

Load-balanced
Secondary node behavior: Both the primary node and the secondary node are active, and they process system requests in parallel.
Data protection: Data replication is bidirectional and is performed based on the capabilities of the software.
Failover time: Zero.

Hot standby
Secondary node behavior: The software component is installed and available on both the primary node and the secondary node. The secondary system is up and running, but it does not process data until the primary node fails.
Data protection: Data is replicated, and both systems contain identical data. Data replication is performed based on the capabilities of the software.
Failover time: A few seconds.

Warm standby
Secondary node behavior: The software component is installed and available on the secondary server, which is up and running. If a failure occurs on the primary node, the software components are started on the secondary node. This process is automated by using a cluster manager.
Data protection: Data is regularly replicated to the secondary system or stored on a shared disk.
Failover time: A few minutes.

Cold standby
Secondary node behavior: A secondary node acts as the backup for an identical primary system. The secondary node is installed and configured only when the primary node breaks down for the first time. Later, in the event of a primary node failure, the secondary node is powered on and the data is restored while the failed component is restarted.
Data protection: Data from the primary system can be backed up on a storage system and restored on the secondary system when it is required.
Failover time: A few hours.
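The following sketch illustrates the warm standby model: a simple monitor checks the health of the primary node and, after several consecutive missed checks, starts the software components on the standby node. This is a minimal illustration only; the host names, port, intervals, and the start_services placeholder are assumptions, and production clusters use a dedicated cluster manager rather than a script like this.

```python
import socket
import time

HEARTBEAT_INTERVAL = 5      # seconds between health checks (illustrative value)
MAX_MISSED_HEARTBEATS = 3   # consecutive missed checks tolerated before failover

def is_alive(host: str, port: int = 8080, timeout: float = 2.0) -> bool:
    """Health check: try to open a TCP connection to the service port.
    Real cluster managers use their own heartbeat protocols instead."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def start_services(host: str) -> None:
    """Placeholder for starting the software components on the standby node.
    A real cluster manager would invoke resource agents or service scripts."""
    print(f"Starting services on warm standby node {host}")

def monitor(primary: str, standby: str) -> None:
    """Warm standby: the standby node is already running, but its services are
    started only after the primary misses several consecutive health checks."""
    missed = 0
    while True:
        if is_alive(primary):
            missed = 0
        else:
            missed += 1
            if missed >= MAX_MISSED_HEARTBEATS:
                start_services(standby)  # fail over to the warm standby
                return
        time.sleep(HEARTBEAT_INTERVAL)

if __name__ == "__main__":
    # Hypothetical host names for the two nodes.
    monitor("primary.example.com", "standby.example.com")
```

Because the services must be started on the standby node after the failure is detected, recovery takes minutes rather than the seconds of a hot standby, where the services are already running.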

Cluster topologies

Cluster topologies are classified by the level of high availability that they provide. You can configure the cluster to achieve the level of redundancy that you need in case of software or hardware failures, and you can introduce cluster management software to reduce system recovery time.

N+1
A single secondary node is activated to take over the role of the failed node. If heterogeneous software is configured on each primary node, the secondary node must be able to assume any of the roles of the primary nodes. This solution can be used for clusters that run multiple services simultaneously. Single-service clusters can use a simple active-passive configuration.
N+M
If a single cluster manages multiple services, a single, dedicated failover node cannot always provide sufficient redundancy. In such cases, more than one standby node must be available: M standby nodes back up the N active nodes. An organization must weigh the cost of implementation against the need for system reliability. As the number of standby nodes increases, so does the cost of maintenance.
N-to-1
A secondary node becomes an active, temporary replacement until the primary node can be restored. When the primary node is restored, the services or instances must be reactivated on it to restore high availability.
N-to-N
N-to-N clusters redistribute the services or instances from the failed node across the remaining active nodes. The need for a standby node is eliminated, but extra capacity must be available on all of the active nodes. The N-to-N model is a combination of active-active and N+M cluster configurations.
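As a minimal illustration of the N-to-N model, the following sketch redistributes the services of a failed node across the remaining active nodes, always choosing the node with the most spare capacity. The node names, service names, and capacity figures are hypothetical; real cluster managers apply their own placement policies.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    capacity: int                           # service slots the node can host
    services: list[str] = field(default_factory=list)

    @property
    def spare(self) -> int:
        return self.capacity - len(self.services)

def redistribute(failed: Node, survivors: list[Node]) -> None:
    """N-to-N failover: spread the failed node's services across the remaining
    active nodes, always picking the node with the most spare capacity."""
    for service in failed.services:
        target = max(survivors, key=lambda n: n.spare)
        if target.spare <= 0:
            raise RuntimeError("Insufficient spare capacity in the cluster")
        target.services.append(service)
    failed.services.clear()

# Hypothetical three-node cluster; every node is active and has spare capacity.
nodes = [
    Node("node-a", capacity=4, services=["db"]),
    Node("node-b", capacity=4, services=["web", "cache"]),
    Node("node-c", capacity=4, services=["queue"]),
]

failed = nodes[0]
redistribute(failed, [n for n in nodes if n is not failed])
for node in nodes:
    print(node.name, node.services)
```

In this sketch, spare capacity is simply a count of service slots; a real deployment would also weigh CPU, memory, and service affinity when choosing a target node.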

