The cluster module enables two stand-alone HSMX gateways to form a highly redundant cluster. One HSMX is set as the active node and regulates all traffic; its twin HSMX is the passive node and remains on stand-by. The machines communicate constantly with each other to operate as one.
In a healthy cluster, when the active node becomes unreachable the passive node performs a smooth fail-over: the passive HSMX takes over the active node role and starts processing traffic. When two machines are clustered and the passive node fails, the active node logs this event but takes no further action. HSMX only fails back onto the original active node if that node comes back online and is the healthier node.
During the role switch, most subscriber-generated network traffic experiences little to no connection interruption, and most services used by subscribers transition without the user noticing.
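The fail-over and fail-back rules described here can be sketched as a small decision function. This is only an illustration of the rules as stated, not the HSMX implementation: the `NodeState` class, the numeric `health` score and the `original_active` flag are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    reachable: bool
    health: int                    # hypothetical score, higher means healthier
    original_active: bool = False  # hypothetical flag: node was configured as active


def next_active(active: NodeState, passive: NodeState) -> NodeState:
    """Return the node that should hold the active role next."""
    # Fail-over: the active node is unreachable, so the passive node takes over.
    if not active.reachable and passive.reachable:
        return passive
    # Fail-back: only onto the original active node, and only if it is
    # back online and strictly healthier than the current active node.
    if (passive.original_active and passive.reachable
            and passive.health > active.health):
        return passive
    # Otherwise nothing changes (a failing passive node is only logged).
    return active


node_a = NodeState("A", reachable=True, health=10, original_active=True)
node_b = NodeState("B", reachable=True, health=10)
print(next_active(node_a, node_b).name)
```

With both nodes healthy the function keeps the current active node; a switch happens only in the two cases named in the text.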
Typically NAT is configured to masquerade traffic from subscriber networks towards upstream. The problem is that the traffic source then depends on which node is active: it will be one node or the other. Using a shared IP solves this issue because the active node always uses this virtual IP address to send and receive traffic. If traffic has been sent upstream and a failure occurs, the new active node listens on the shared IP, receives the response from upstream and relays it to the correct destination.
The HSMX cluster nodes need to communicate with each other to behave as a single entity. Two IP addresses can be configured for this communication. It is recommended to dedicate a network interface solely to cluster communication and to use the WAN interface as the fallback cluster interface.
Tip: The cluster link may be routed, although this is not best practice.
Both nodes should be connected to the same subscriber network (segment/subnet). The HSMX only activates the subscriber network IP address on the active node and, upon a node switch, issues a notification (gratuitous ARP, GARP) that the new machine is active.
Before setting up a cluster, make sure both machines are in a healthy state. Network configuration and firewall settings must be set on both machines; the cluster settings only have to be applied on one of the nodes. Two clustered HSMX nodes share the entire configuration except for the following items:
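To make the GARP notification more concrete, the sketch below builds the bytes of a gratuitous ARP request as it is generally defined: an Ethernet broadcast in which the sender and target protocol address are both the announced IP. How HSMX generates its notification internally is not documented here; the MAC and IP values are made-up examples, and the frame is only constructed, not sent.

```python
import socket
import struct


def build_garp(mac: str, ip: str) -> bytes:
    """Build a gratuitous ARP request frame announcing `ip` at `mac`."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    ethernet = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), address lengths 6 and 4, request (1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_bytes + ip_bytes   # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_bytes  # target hardware unset, target IP = sender IP

    return ethernet + arp


# Hypothetical values for illustration only.
frame = build_garp("02:00:00:00:00:01", "10.0.0.1")
print(len(frame), frame.hex())
```

Switches and hosts on the segment that see such a frame update their tables so that the shared subscriber IP address points at the new active node.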
Throughout this recipe the same network configuration is assumed. WAN upstream connectivity goes through 172.20.0.1/22 on Port WAN (rename-interfaces) of both machines. The cluster link is set up on Port Cluster Node <N>, and the highly available subscriber network must be configured on both machines.
The default firewall configuration is a good start for a single-node HSMX setup. We will adjust the firewall settings for the cluster-link communication. Two clustered HSMX machines have the following minimum cluster-link network requirements:
|873/tcp||Cluster File Synchronization||RSYNC|
Browse to Security → Firewall Settings → Add a new port and implement firewall rules to enable cluster communication. Once every rule is added, click Apply to restart the firewall.
Cluster Node <N>) or All for a stateless firewall configuration.
Tip: We recommend filling in the Source-IP. Assuming directionality is set to Incoming packet, we configure the neighbor's (the other HSMX node's) cluster-link IP.
Tip: Predefined rules are available when you add a port. They preset the State and the correct Port for you, which speeds up the process.
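Once the rules are applied, it can be useful to confirm that a required TCP port, such as the 873/tcp used for cluster file synchronization, is actually reachable over the cluster link. The following is a generic check using plain TCP sockets, meant to be run from any machine with Python on that link; it is not an HSMX tool, and the peer address in the usage line is a placeholder.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    peer = "192.0.2.2"  # placeholder: cluster-link IP of the other node
    print("873/tcp reachable:", tcp_port_open(peer, 873))
```

A `False` result only says that the connection could not be established; it does not distinguish between a blocking firewall rule and a service that is not listening.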
Now we are ready to set up our cluster network. This configuration only takes place on the node that is to become active. Browse to Network → Cluster Settings.
Configure the IP addresses and test the cluster status by clicking the Test connection button. The cluster status shows whether the two participating gateways can communicate properly using the configured communication IP addresses. When everything is set and the connection is successful, you can activate the cluster.
Tip: You can simply press the icon on the right to retrieve the data of the other gateway automatically.
Note: PPPoE interfaces cannot be used for cluster configuration.
The cluster link employs a four-state cluster status based on contact with the other node and its health status. These settings govern when automatic fail-over is triggered. Setting these values too low can result in an unstable cluster; setting them too high delays fail-over after a failure and thereby causes downtime.
Note: All time-related variables are in seconds; use a comma to specify up to microsecond precision.
Note: Make sure to keep these settings identical on both nodes.
Note: Advanced cluster settings are applied near-immediately when the cluster is activated.
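As an illustration of the time format mentioned in the notes above, the sketch below parses a value in seconds and rejects anything finer than microsecond precision. It assumes that the comma acts as the decimal separator (for example `1,5` for one and a half seconds); the function name is hypothetical and this is not the parser HSMX itself uses.

```python
from decimal import Decimal


def parse_seconds(value: str) -> Decimal:
    """Parse a time value in seconds, with a comma as decimal separator."""
    # Assumption: "1,5" means 1.5 seconds.
    seconds = Decimal(value.strip().replace(",", "."))
    if seconds < 0:
        raise ValueError("time values must not be negative")
    # Microsecond precision means at most six decimal places.
    if seconds != seconds.quantize(Decimal("0.000001")):
        raise ValueError("at most microsecond precision is supported")
    return seconds


print(parse_seconds("1,5"))
print(parse_seconds("0,000250"))
```

`Decimal` is used instead of `float` so that small fractions such as a few microseconds are compared exactly.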
Everything is shared between two clustered HSMX machines except:
The cluster employs two methods to synchronize the system state. Database synchronization can easily be tested by creating a subscriber voucher on the active node and checking that it appears on the passive node. File synchronization can be tested by changing the admin user-account password in System → Access Control → Users. If you can successfully sign in on the passive node using the new credentials and the database is synchronized, your cluster is up and running.
directionality of firewall rule (