vNetC and SDLC Configuration for High Availability Deployments🔗
Verity supports an automatic failover solution that provides high availability (HA) for the orchestration platform.
The solution requires two host systems. As these are operated in active/standby mode, each system must be sized to handle the full system load according to the resource calculator referenced in the installation instructions.
Initially, support is provided for the KVM hypervisor running on Ubuntu 24.04 LTS, but the solution has few host-system dependencies and can be configured with other hypervisors and host operating systems. The instructions that follow assume Ubuntu KVM; adapt them to your hypervisor as required.
This section includes installation information as well as considerations for Verity upgrades while systems are in HA mode.
Installation Overview🔗
Each system is installed per the standard instructions in the Quickstart section of this documentation. The Configuration Worksheet below should be filled out ahead of time with the overall network plan and address assignments for the environment containing the two host machines.
The process contains the following steps:
- Review the contents of this complete section
- Complete the Configuration Worksheet
- Install each system per the standard instructions. It is important that vNetC and SDLC are upgraded to the latest release per those instructions before proceeding to the next step.
Important
The sections below titled "Creating the vNETCs" and "Creating the SDLCs" have specific information to guide the standard installation process.
- Configure vNetC HA mode
System Block Diagrams🔗
Verity supports both single external (server) NIC and dual external NIC configurations for installation. Refer to the KVM installation diagrams in the Quickstart section as your base starting point for HA.
When using HA, the two installation options become two external NIC and three external NIC installations, respectively. The additional external NIC carries the cross-connect communications between the two servers. The user may choose to use bond groups, which increases the number of external NICs in the servers, but bonding is transparent to the Verity vNetC.
In normal operating mode, the active vNetC periodically transfers important state to the inactive vNetC in addition to continuously replicating Postgres database updates.
The additional state includes:
- System configuration
- Master key vault
- Statistics data
The management switch architecture is up to the user (e.g. single switch, MCLAG etc.) and is also transparent to the vNETCs.
The HA system-level diagrams for both installation options are shown below:
Two External NIC Block Diagram🔗
Three External NIC Block Diagram🔗
Configuration Worksheet🔗
| Token | Description | Example | Your Value |
|---|---|---|---|
| MGMT_NET | The management network address | 10.0.0.0/24 | |
| WAN_NET | The WAN network address (can be the same as MGMT_NET) | 10.0.0.0/24 | |
| WAN_GW | The gateway address for the WAN network | 10.0.0.1 | |
| FLOATING_IP | The IP address that will access the Web GUI and float between redundant nodes | 10.0.0.10/24 | |
| FQDN | The FQDN for the floating IP (needed if the GUI will be accessed using a domain name) | vnetc.domain.com | |
| BACK_NET | The backdoor network used to manage the SDLC | 10.12.99.0/30 | |
| XCONN_NET | The cross-connect network address | 169.254.0.0/30 | |
| VERIFICATION_IP | The IP address of a switch management port that responds to ICMP echo-request | 10.0.0.99/24 | |
Common SDLC elements🔗
| Token | Description | Example | Your Value |
|---|---|---|---|
| SDLC_IP | SDLC IP address on the management network (same for both SDLCs) | 10.0.0.9 | |
| SDLC_SN | SDLC serial number (same for both SDLCs) | SDLC-A | |
| SDLC_HOSTNAME | SDLC hostname (same for both SDLCs, can be the same as SDLC_SN) | SDLC-A | |
| ACS_IP | ACS IP address on the management network (same for both SDLCs) | 10.0.0.3 | |
KVM host elements🔗
| Token | Description | Example | Your Value |
|---|---|---|---|
| HOST1_IP | The IP address for accessing the first KVM server host on the management or WAN network | 10.0.0.11/24 | |
| HOST2_IP | The IP address for accessing the second KVM server host on the management or WAN network | 10.0.0.12/24 | |
Node 1 specific elements🔗
| Token | Description | Example | Your Value |
|---|---|---|---|
| NAME1 | The hostname for the first vNetC | vnetc1.domain.com | |
| VNETC1_IP | The IP address of the first vNetC in the WAN_NET or MGMT_NET network | 10.0.0.21/24 | |
| VNETC1_BACK_IP | The IP address of the first vNetC in BACK_NET | 10.12.99.1/30 | |
| SDLC1_BACK_IP | The IP address of the first SDLC in BACK_NET | 10.12.99.2/30 | |
| VNETC1_XCONN_IP | The IP address of the first vNetC in XCONN_NET | 169.254.0.1/30 | |
Node 2 specific elements🔗
| Token | Description | Example | Your Value |
|---|---|---|---|
| NAME2 | The hostname for the second vNetC | vnetc2.domain.com | |
| VNETC2_IP | The IP address of the second vNetC in the WAN_NET or MGMT_NET network | 10.0.0.22/24 | |
| VNETC2_BACK_IP | The IP address of the second vNetC in BACK_NET (can be the same as VNETC1_BACK_IP) | 10.12.99.1/30 | |
| SDLC2_BACK_IP | The IP address of the second SDLC in BACK_NET (can be the same as SDLC1_BACK_IP) | 10.12.99.2/30 | |
| VNETC2_XCONN_IP | The IP address of the second vNetC in XCONN_NET | 169.254.0.2/30 | |
Host O/S installation🔗
The host should be installed with the Ubuntu 24.04 LTS server edition, with sshd enabled. Then update the O/S and install the required packages:
```sh
apt update; apt upgrade -y; apt autoremove -y
apt install -y net-tools iputils-ping rsync curl wget traceroute htop
apt install -y qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils smartmontools
```
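A quick sanity pass after the install confirms the virtualization stack is usable. These are standard Ubuntu/libvirt commands, not part of the Verity instructions themselves:

```sh
grep -Ec '(vmx|svm)' /proc/cpuinfo   # > 0 means CPU virtualization is enabled
systemctl is-active libvirtd          # should print "active"
virsh list --all                      # confirms virsh can reach the hypervisor
```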
Time synchronization🔗
The High Availability system requires that the clocks on the two vNetCs are synchronized. This is set up automatically, but it works only if the systems have NTP access to the internet. If not, NTP must be configured on the vNetCs to use an accessible NTP server on the local network.
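It is the vNetC clocks, not the host clocks, that must agree, but a host-side check is a quick way to confirm that NTP traffic can reach the internet at all (this assumes the default systemd-timesyncd on Ubuntu server):

```sh
timedatectl status             # "System clock synchronized: yes" expected
timedatectl timesync-status    # shows the NTP server in use and the last poll
```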
Addressing and bridges🔗
The host should have at least two physical NICs which we'll call "eno1" and "eno2".
"eno1" should be connected to the WAN network, and "eno2" should be connected
to the "eno2" interface on the redundant pair.
If there is a management network separate from the WAN network, then it should be connected to a third physical NIC which we'll call "eno3".
The WAN network should be accessible to administrators who will use the vNetC GUI. This will have:
- two fixed addresses, one for each vNetC VM
- one floating address, used to access the active vNetC GUI
The management network should be configured with an IP space large enough for all managed
devices to have an IPv4 address, plus:
- one fixed address shared by the two SDLC VMs
- one fixed address shared by the two ACS containers
The "eno2" interface is presented to the vNetC VM via a bridge "xconn". It should
have a /30 interface (or larger). As this is a point-to-point connection, we
typically use a link-local address like "169.254.0.1/30".
Each vNetC needs to be able to manage the SDLC running on the local host. This
is done using a bridge called "back". This should be configured with an IP space
large enough for a vNetC and an SDLC, so a /30 is the minimum size, but it makes
sense to use a larger block to allow for future growth. The worksheet examples
use the minimum, "10.12.99.0/30".
In these examples, the netplan format is used, but it is also possible to use KVM
network configurations. The components (vNetCs and SDLCs) are all configured
with static addresses, so there is no need to provide DHCP support on any of the
bridges.
Netplan configuration examples🔗
Netplan example with the management network and WAN combined. In this case, the mgmt and back bridges have no member interfaces but are still defined so that the vNetC sees a consistent interface order.
```yaml
network:
  ethernets:
    eno1:
      dhcp4: false
      dhcp6: false
    eno2:
      dhcp4: false
      dhcp6: false
  bridges:
    wan:
      interfaces: [eno1]
      addresses: [HOST1_IP]
      dhcp4: false
      dhcp6: false
      routes:
        - to: default
          via: WAN_GW
    xconn:
      interfaces: [eno2]
      dhcp4: false
      dhcp6: false
    mgmt:
      interfaces: []
      dhcp4: false
      dhcp6: false
    back:
      interfaces: []
      dhcp4: false
      dhcp6: false
  version: 2
```
Netplan example with separate management and WAN networks. In this case, the third interface (eno3) is connected to the management port; the only differences from the previous example are the eno3 definition and its membership in the mgmt bridge.
```yaml
network:
  ethernets:
    eno1:
      dhcp4: false
      dhcp6: false
    eno2:
      dhcp4: false
      dhcp6: false
    eno3:
      dhcp4: false
      dhcp6: false
  bridges:
    wan:
      interfaces: [eno1]
      addresses: [HOST1_IP]
      dhcp4: false
      dhcp6: false
      routes:
        - to: default
          via: WAN_GW
    xconn:
      interfaces: [eno2]
      dhcp4: false
      dhcp6: false
    mgmt:
      interfaces: [eno3]
      dhcp4: false
      dhcp6: false
    back:
      interfaces: []
      dhcp4: false
      dhcp6: false
  version: 2
```
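To activate either configuration, `netplan try` is the cautious route, since it rolls back automatically if the change breaks connectivity; the follow-up commands confirm the four bridges exist:

```sh
netplan try                     # applies with an automatic rollback timer
netplan apply                   # makes the configuration permanent
ip -br link show type bridge    # wan, xconn, mgmt and back should appear
brctl show                      # bridge membership (bridge-utils installed earlier)
```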
Creating the vNetCs🔗
Follow the Verity Quickstart instructions to install the vNetC under KVM.
In the vNetC XML:

- change the name to 'vnetc1' or some appropriate name that indicates this is node 1
- replace the three <interface> definitions with the four definitions below (newer XML files may already be in this format):
```xml
<interface type='bridge'>
  <source bridge='wan'/>
  <model type='virtio'/>
  <link state='up'/>
</interface>
<interface type='bridge'>
  <source bridge='xconn'/>
  <model type='virtio'/>
  <link state='up'/>
</interface>
<interface type='bridge'>
  <source bridge='mgmt'/>
  <model type='virtio'/>
  <link state='up'/>
</interface>
<interface type='bridge'>
  <source bridge='back'/>
  <model type='virtio'/>
  <link state='up'/>
</interface>
```
Continue to "virsh define" the vnetc and start the vnetc per the standard instructions.
If the management and WAN networks are combined, clear the Management Address; otherwise set:

- Management Address: MGMT_NET
The same procedure is used to create the second vNetC, with appropriate variable changes (e.g. VNETC2_IP instead of VNETC1_IP).
Creating the SDLCs🔗
For all configurations, in the SDLC XML:

- set the first interface to type 'bridge' with source bridge='back'
- set the second interface to source bridge='mgmt'
- remove the third interface, which is not used
Follow the instructions to install the first SDLC for KVM.

- Use admin/wizard to set SDLC_HOSTNAME, SDLC_IP, and ACS_IP; the NTP address will be VNETC1_IP
The same procedure is used to create the second SDLC, with appropriate variable changes (e.g. VNETC2_IP instead of VNETC1_IP).
Important
Do not proceed to the next steps until both systems are installed and running the latest provided core and firmware loads for Verity Release 6.4.
Configure vNetCs for HA mode🔗
On the console, edit /etc/rc.conf:

- Change the vtnet1 IP to VNETC1_XCONN_IP (vtnet1 is the vNIC attached to the xconn bridge)
- Add an entry: ifconfig_vtnet3="inet VNETC1_BACK_IP" (vtnet3 is the vNIC attached to the back bridge)
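With the worksheet example values substituted, the two lines on vNetC 1 would read as follows (a sketch only; the rest of /etc/rc.conf is unchanged):

```sh
# Worksheet example values for node 1; adjust to your own plan
ifconfig_vtnet1="inet 169.254.0.1/30"   # VNETC1_XCONN_IP
ifconfig_vtnet3="inet 10.12.99.1/30"    # VNETC1_BACK_IP
```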
Then run ns_admin to set NAME1, VNETC1_IP, WAN_GW, and DNS servers (if any).
Create the file /var/ns/arg_defaults/ns_httpd.conf and add:

```
append_allowed = 'FQDN'
```

If the GUI will be accessed via an IP address rather than an FQDN, use:

```
append_allowed = 'FQDN,FLOATING_IP'
```
It is recommended to replicate the host keys from vNetC1 to vNetC2, e.g. from vNetC1:

```sh
rsync -ai /etc/ssh/ssh_host_* VNETC2_IP:/etc/ssh/
```
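Replicated host keys take effect only when sshd reloads them. Assuming the vNetC uses a FreeBSD-style rc system (its /etc/rc.conf suggests so), restarting the daemon on vNetC2 would look like:

```sh
# Run on vNetC2 (assumption: FreeBSD-style rc system with service(8))
service sshd restart
```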
Create an ssh key to make connections between vNetCs easier:

```sh
ssh-keygen -t ed25519
```
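For the key to be useful, the matching public key must also land in the peer's authorized_keys; a minimal sketch, assuming root SSH between the vNetCs is permitted:

```sh
# Sketch: append the new public key to the peer's authorized_keys
cat ~/.ssh/id_ed25519.pub | ssh VNETC2_IP 'cat >> ~/.ssh/authorized_keys'
```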
On vNetC 1, create /var/ns/ha/failover.conf with:

```
:include "/usr/ns/etc/failover.conf"
enabled = 1
floating_address = FLOATING_IP
xconnect_address = VNETC1_XCONN_IP
peer_xconnect_address = VNETC2_XCONN_IP
peer_mgmt_address = VNETC2_IP
verification_address = VERIFICATION_IP
sdlc_mgmt_address = SDLC1_BACK_IP
```
On vNetC 2, create /var/ns/ha/failover.conf with:

```
:include "/usr/ns/etc/failover.conf"
enabled = 1
floating_address = FLOATING_IP
xconnect_address = VNETC2_XCONN_IP
peer_xconnect_address = VNETC1_XCONN_IP
peer_mgmt_address = VNETC1_IP
verification_address = VERIFICATION_IP
sdlc_mgmt_address = SDLC2_BACK_IP
```
Other variables that alter the default behavior are listed below. Where values are shown, they are already the defaults, and the variables do not need to be included in /var/ns/ha/failover.conf.
Settle Timing
This specifies the period in seconds before state change actions are taken after decisions have been reached.
A rule of thumb is to make the settle period the reboot time of the server hardware plus 60 seconds (e.g. a server that takes four minutes to reboot gives 240 + 60 = 300, the default).
settle = 300
Probe Cycle
This is the cycle period for probing the peer. Probing is low-overhead, so the cycle can be quite short, but it should be more than 3 seconds.
probe_cycle = 10
Replication Period
How frequently, in seconds, to replicate low-priority data from the active to the inactive node. This data includes device statistics, non-database configurations such as NTP and resolver settings, and forensic application data for ns_bizd, etc.
replication_period = 1800
Replication Delay
Delay after a failover before replication will be performed.
replication_delay = 300
Manual Override
manual_override should be empty for normal operations. If true, this system will become active; if false, this system will become inactive. If this setting is made on just one node, the other node will transition to the appropriate state, assuming communications with this node are operational. (This variable can also be administered via the ns_admin menu.)
manual_override =
Slave Autoconversion
If slave_auto_conversion is true, a node will be automatically converted to a postgres slave once both the run state and postgres state have been stable for slave_transition_delay seconds.
slave_auto_conversion = true
Slave Transition Delay
slave_transition_delay sets how long to wait after a node becomes inactive before attempting a slave conversion.
slave_transition_delay = 360
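To override one of these defaults, simply append the non-default value to the node's existing file. For example, a vNetC 1 configuration that shortens the probe cycle would look like this (probe_cycle = 5 is an illustrative choice, not a recommendation):

```
:include "/usr/ns/etc/failover.conf"
enabled = 1
floating_address = FLOATING_IP
xconnect_address = VNETC1_XCONN_IP
peer_xconnect_address = VNETC2_XCONN_IP
peer_mgmt_address = VNETC2_IP
verification_address = VERIFICATION_IP
sdlc_mgmt_address = SDLC1_BACK_IP
probe_cycle = 5
```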
Reboot both vNetCs.
Configure SDLCs for HA mode🔗
Use these steps to configure the SDLCs via the admin CLI:
- Use admin/network/backdoor to set SDLC1_BACK_IP mask 255.255.255.x
- Use admin/security/keys/get - to set an authorized key, taken from /var/ns/ha/sdlc_ha_key.pub:
```
Main/ administration/ Security Base Menu# keys
Main/ administration/ Security Base Menu/ SSH Keys Base Menu# get -
Enter file (Press <CTRL-D> when done:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHHrFWYS5snSDRNV6mN6rZkWMn1Xp3CIXgn4x5HvSJyP root@verity.be-net.com
Authorized keys 'authorized_keys' copied to '/home/admin/.ssh/authorized_keys'
Main/ administration/ Security Base Menu/ SSH Keys Base Menu#
```
- Use admin/redundancy/serial to set SDLC_SN
- Use admin/redundancy/enable to enable redundancy mode
Reboot the SDLC. When it comes back up, it will be in Standby mode.
Both SDLCs remain in standby, waiting for the vNetC to bring the active SDLC online.
Upgrading an HA site🔗
HA sites must be upgraded in a specific sequence, which should be completed without long delays between steps.
This is because during the upgrade, schema changes to the Postgres database may be involved, and these are immediately replicated to the inactive node, which means the BE software on the inactive node may be inconsistent with the Postgres schema.
In addition, it is not possible to do an offline upgrade to the inactive node because upgrades require write-access to Postgres.
Upgrade Steps🔗
- Upgrade the active node via the Administration -> Software Packages page, per the standard upgrade instructions
- After the upgrade has completed, confirm the system has been left in read-only mode. The topology page should display a tan pin-wheel icon. Leave the system in read-only mode.
- Log into the vNetC console of the active node and run ns_admin
- Choose "High Availability Configuration" and "Replicate Config and Statistics"
- When that completes, choose "Set Manual Override", set it to False, and "Save Settings". This will trigger a failover to the inactive node.
- Once the GUI is accessible on this node, perform the upgrade in the normal way via Administration -> Software Packages. If necessary, delete the current package, re-upload it, and re-deploy it.
- Once the upgrade has completed, back on the console of the now-inactive node, use ns_admin to:
  - Choose "Set Postgres to Slave Mode"
  - Clear the "Set Manual Override" and "Save Settings" to enable failover
- After verifying any configuration changes, the system can be returned to read-write mode