High Availability and Clustering

To meet availability requirements, build in redundancy so that the failure of a single instance does not result in downtime. Running multiple nodes can also be desirable for performance reasons.

The following are two of the redundancy models that SignServer supports:

Option 1: Completely Independent Nodes
Option 2: Shared Database (Clustered Nodes)

The choice between the two approaches is essentially a trade-off: independent nodes offer isolation and resilience, while shared database clustering offers operational convenience and performance.

Load Balancing Considerations

A load balancer can be placed in front of the client APIs to distribute the load across the nodes. Alternatively, clients such as SignClient can be given the hostnames of multiple nodes and choose which one to send each request to.
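The client-side alternative can be illustrated with a minimal Python sketch. It is not how SignClient is implemented; it only shows the general idea of trying a list of node hostnames in order and using the first one that answers. The function name `send_with_failover`, the `send` callable, and the example hostnames are all hypothetical.

```python
from __future__ import annotations

from typing import Callable, Sequence


def send_with_failover(
    hosts: Sequence[str],
    send: Callable[[str], bytes],
) -> tuple[str, bytes]:
    """Try each host in order; return (host, response) from the first that succeeds.

    `send` is a hypothetical transport function that submits one request to
    the given host and raises OSError (or a subclass) on connection problems.
    """
    errors: list[str] = []
    for host in hosts:
        try:
            return host, send(host)
        except OSError as exc:  # connection refused, timeout, DNS failure, ...
            errors.append(f"{host}: {exc}")
    raise ConnectionError("all nodes failed: " + "; ".join(errors))


# Stand-in transport for illustration: the first node is down, the second answers.
def fake_send(host: str) -> bytes:
    if host == "sign1.example.com":
        raise ConnectionRefusedError("connection refused")
    return b"signed-by-" + host.encode()


used_host, response = send_with_failover(
    ["sign1.example.com", "sign2.example.com"], fake_send
)
```

Because the client signing APIs are stateless, a request that fails on one node can be retried on another without any session state being lost.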

When a load balancer is placed in front of multiple SignServer nodes:

Sticky web connections (session affinity) must be used, as Admin Web sessions can experience issues if routed to a different node mid-session. This does not apply to client signing APIs, which are stateless.
All nodes must have access to the same Workers, ensuring any node can handle any incoming client request.
If nodes share key material, HSM clustering may also be required.

Health monitoring, such as the SignServer Health Check, can be used to monitor the system. It is particularly useful for clusters, because load balancers can query it to determine whether a node should remain active in the cluster (healthy) or be taken out of the cluster (unhealthy).
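The decision a load balancer makes from the health check can be sketched as follows. This is an illustration only: the URL path in `HEALTH_PATH` and the expected response body in `HEALTHY_BODY` are assumptions that should be verified against the Health Check configuration of your installation, and `partition_nodes` is a hypothetical helper name.

```python
from __future__ import annotations

import urllib.request
from typing import Callable, Iterable

# Assumptions: verify both values against your own Health Check configuration.
HEALTH_PATH = "/signserver/healthcheck/signserverhealth"
HEALTHY_BODY = "ALLOK"


def fetch_health(base_url: str, timeout: float = 3.0) -> str:
    """Fetch the health check response body from one node."""
    with urllib.request.urlopen(base_url + HEALTH_PATH, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def partition_nodes(
    base_urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_health,
) -> tuple[list[str], list[str]]:
    """Split nodes into (healthy, unhealthy) based on the health check response."""
    healthy: list[str] = []
    unhealthy: list[str] = []
    for url in base_urls:
        try:
            ok = fetch(url).strip() == HEALTHY_BODY
        except OSError:  # unreachable node or HTTP error status
            ok = False
        (healthy if ok else unhealthy).append(url)
    return healthy, unhealthy
```

A node that cannot be reached is treated the same way as a node that reports an error, so it is taken out of rotation in both cases.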

Option 1: Completely Independent Nodes

Redundancy can be achieved by setting up multiple, completely independent SignServer nodes. That means each node has its own SignServer instance, database instance, and HSM.

Pros

Highest reliability and lowest risk of a problem on one node affecting the others.
Simplest failure isolation.

Cons

Administrative overhead as the configuration must be applied individually on each node.
Each node must have identical worker configurations available to clients.

Key Considerations

For small deployments, manual Admin Web access per node is possible. For larger ones, applying the same Worker configurations across instances is preferred.
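Because independent nodes must expose identical worker configurations, it can help to check for drift between them. The sketch below assumes that the worker properties of each node have already been exported into plain dictionaries by some means of your choosing; the helper name `find_drift`, the node names, and the property values are illustrative only.

```python
from __future__ import annotations


def find_drift(
    node_configs: dict[str, dict[str, str]],
) -> dict[str, dict[str, str | None]]:
    """Return the properties whose values differ between nodes.

    `node_configs` maps a node name to that node's worker properties.
    The result maps each differing property to its value on every node
    (None where the property is missing).
    """
    all_keys: set[str] = set()
    for cfg in node_configs.values():
        all_keys.update(cfg)

    drift: dict[str, dict[str, str | None]] = {}
    for key in sorted(all_keys):
        values = {node: cfg.get(key) for node, cfg in node_configs.items()}
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


# Illustrative data: node-b points at a different crypto token.
configs = {
    "node-a": {"NAME": "PDFSigner", "CRYPTOTOKEN": "HsmToken1"},
    "node-b": {"NAME": "PDFSigner", "CRYPTOTOKEN": "HsmToken2"},
}
drift = find_drift(configs)
```

An empty result means every node exposes the same properties with the same values for the exported worker.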

Option 2: Shared Database (Clustered Nodes)

Multiple SignServer nodes connect to the same highly available shared database and HSM.

Pros

Key management operations only need to be performed once.
Configuration is centrally stored and accessible from all nodes.
Higher performance potential.

Cons

A problem with the shared database or HSM can affect all nodes simultaneously.
Configuration changes stored in the database are not applied automatically; a manual Reload from Database must be triggered on each node.
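Since the reload has to happen on every node, a small wrapper can make sure that no node is skipped and that one failing node does not stop the rest. The following is a sketch under stated assumptions: `reload` is a hypothetical callable standing in for however Reload from Database is triggered in your environment, and `reload_all_nodes` is not part of SignServer.

```python
from __future__ import annotations

from typing import Callable, Iterable


def reload_all_nodes(
    nodes: Iterable[str],
    reload: Callable[[str], None],
) -> dict[str, str]:
    """Trigger a reload on every node and return {node: error} for failures.

    `reload` is a hypothetical function that triggers Reload from Database
    on one node and raises an exception if the reload fails.
    """
    failed: dict[str, str] = {}
    for node in nodes:
        try:
            reload(node)
        except Exception as exc:  # keep going so the other nodes still reload
            failed[node] = str(exc)
    return failed
```

An empty return value indicates that every node accepted the reload; any entries list the nodes that still serve the old configuration and need attention.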

Key Considerations

Each instance connects to the same HSM slot/token, either through a remote Network HSM or a locally clustered/replicated HSM.
Sticky web connections are recommended when load balancing, as Admin Web sessions can experience issues if switched between nodes.

For special considerations that apply to a shared database, see Supported Databases | Multiple SignServer Nodes Sharing the Same Database.

Comparison Summary

	Independent Nodes	Shared Database Clustering
Performance	Standard	Scalable (higher performance potential)
Operational Overhead	Higher (configuration applied per node)	Lower (centralized configuration)
Configuration Propagation	Manual per node	Manual Reload from Database per node
HSM Setup	One HSM per node	Shared or clustered HSM
Fault Isolation	Higher	Lower (shared components are a common failure point)