Cluster: Basic Information and Availability Levels
The options on the Keyfactor Next Generation Hardware Appliance Cluster page allow you to add cluster nodes, monitor an existing cluster, and manage its nodes. Here you will find detailed information about the cluster members and their current status. In addition, an easy-to-use locking function prevents editing conflicts.
Definition of Availability
Availability is defined as the ability to keep the service running with full data integrity for the applications running on the Next Generation Hardware Appliance.
Levels of Availability
A distinction is made between three levels of availability:
Stand-Alone Instance
This is a basic single-node installation of the Next Generation Hardware Appliance.
In the event of a node failure, a replacement appliance needs to be installed and restored from a backup.
All data between the time of the last backup and the failure will be lost.
If no cold standby (spare) appliance is available, the time required for re-provisioning must be taken into account when calculating the acceptable downtime.
Hot Standby with Manual Fail-Over
In this configuration, two nodes are connected to a cluster. The first installed node has a higher quorum vote than the second node.
If the second node fails, the first node continues to operate. The second node is set to the maintenance state.
If the first node fails, the second node stops operating and is set into the maintenance state.
To bring the second node back into operation, manual interaction via the Next Generation Hardware Appliance administrative interface (WebConf) is required.
Manual intervention is also required to avoid data loss: the second node should only be Forced into Primary if the first node is truly dead and cannot be recovered.
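The asymmetry between the two nodes can be illustrated with a short calculation. The following Python sketch is illustrative only; the function and variable names are not part of the appliance software, and it assumes the per-node voting weights described under High Availability below (Weight = 128 − NodeNumber) together with a strict-majority quorum rule:

    # Illustrative sketch, not appliance code: weighted quorum with two nodes.
    # Assumed weight rule (see below): Weight = 128 - NodeNumber.

    def weight(node_number: int) -> int:
        """Quorum voting weight of a node (assumed rule: 128 - node number)."""
        return 128 - node_number

    def has_quorum(surviving_nodes, all_nodes) -> bool:
        """A group of nodes keeps quorum if its combined weight is a strict
        majority of the total weight of all configured nodes (assumed rule)."""
        total = sum(weight(n) for n in all_nodes)
        return sum(weight(n) for n in surviving_nodes) * 2 > total

    nodes = [1, 2]                 # two-node hot standby cluster
    print(has_quorum([1], nodes))  # True:  node 1 (weight 127) holds a majority of 253
    print(has_quorum([2], nodes))  # False: node 2 (weight 126) does not, so it stops
                                   #        and waits for manual fail-over

This is why the second node cannot take over automatically and must instead be manually Forced into Primary.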
High Availability with Automatic Fail-Over
This is a setup with three or more nodes. If one node fails, the remaining nodes can still form a cluster by a majority quorum vote and continue operation. If the failed appliance is still switched on, it will be set into the maintenance state.
To ensure that quorum votes never result in a tie, all nodes are assigned a unique quorum voting weight according to their assigned node number (Weight = 128 − NodeNumber).
In a setup where an even number of nodes N is distributed evenly between two sites, the site that is to remain Active when connectivity between the sites fails should have a larger sum of quorum vote weights than the other site.
Since cluster nodes with lower node numbers have a higher weighting, you should deploy nodes 1 to N/2 at the primary site.
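As a worked illustration of this rule, the following Python sketch (illustrative only; the function names are hypothetical and not part of the appliance software) computes the per-site weight sums for an even number of nodes split evenly between two sites and checks which site would retain a strict quorum majority if the inter-site connection fails:

    # Illustrative sketch, not appliance code: which site keeps quorum when the
    # inter-site link fails, assuming Weight = 128 - NodeNumber and a
    # strict-majority quorum rule.

    def weight(node_number: int) -> int:
        return 128 - node_number

    def site_weight(node_numbers) -> int:
        return sum(weight(n) for n in node_numbers)

    N = 4                                        # even total number of nodes
    primary = list(range(1, N // 2 + 1))         # nodes 1..N/2 at the primary site
    secondary = list(range(N // 2 + 1, N + 1))   # nodes N/2+1..N at the other site

    total = site_weight(primary) + site_weight(secondary)
    print(site_weight(primary), site_weight(secondary))  # 253 251 for N = 4
    print(site_weight(primary) * 2 > total)              # True:  primary site keeps quorum
    print(site_weight(secondary) * 2 > total)            # False: secondary site stops

Because the lower-numbered nodes carry the higher weights, placing nodes 1 to N/2 at the primary site always gives that site the strictly larger weight sum.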