After deploying a single FTD at the remote site, my next task was to configure two FTDs in a redundant High-Availability (HA) setup with Active/Standby fail-over. Again, I had to make sure the following requirements were met:
- configure FTD at the staging site with limited onsite availability
- ssh remote-access after shipping to the remote site
- recovery in case of FTD/FMC communication failure
In addition, I had to consider the standby unit: what happens when a fail-over event takes place, and how I would manage that FTD from FMC.
FTD deployment and the initial setup are similar to what I posted earlier. Here I’ll just add some additional notes.
Step 1 – Design overview.
From the physical cabling perspective, connectivity is as follows: two FTDs (FTD1 – Active and FTD2 – Standby) and two network switches. Each FTD connects to its own switch for redundancy. FTD Port 1 is the primary ISP; FTD Port 3 is the secondary ISP. The assumption is a /29 public subnet on both ISPs to support outside interface HA. FTD Port 2 is the inside. FTD Ports 5 and 6 are linked to each other for HA fail-over and state link sync. I used IPs from the 198.18.0.0/15 range so I would not get an overlapping-IP notification. FTD Management (Mgmt) ports are also connected to the switches in this setup.
Step 2 – Staging setup and FMC registration.
To validate HA functionality I preferred a full staged setup. This requires three console connections and a staging ISP. Cabling is done once, and VLAN manipulation achieves all other tasks afterward.
To start, Mgmt and the staging Internet need to be on the same VLAN. Assign a public IP to both FTD Mgmt interfaces and join the FTDs to FMC with a NAT ID over the public IP. From FMC, upgrade/downgrade/patch so the OS versions match on both FTDs.
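For the staging join, the CLI on each FTD looks roughly like this. The IP addresses, registration key, and NAT ID below are placeholders, not values from my deployment:

```
> configure network ipv4 manual 198.51.100.10 255.255.255.248 198.51.100.9
> configure manager add fmc.example.com MySharedKey MyNatId
```

On the FMC side, add the device with the same registration key and NAT ID so the sftunnel can establish even though the FTD sits behind a different address.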
I had major issues with pre-7.0.1 code and fail-over functionality after a power loss/hard reboot. Upgrading to 7.0.1 resolved it.
Step 3 – FTD HA IP configuration.
Assign IP addresses to the corresponding data interfaces, internal/external routing, DHCP scopes, etc., on FTD1. Since the Mgmt interface connects to the network switch, there is no need to set up a dedicated FTD interface for it (E1/4 in the original post). Follow Cisco documentation to create the FTD HA pair.
Use debug fover sync to troubleshoot any HA sync issues.
Use configure high-availability suspend (and resume) to disable/re-enable the HA state from the CLI for any troubleshooting purposes.
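Put together, a quick verification/troubleshooting pass from the FTD CLI looks like this (the trailing comments are my annotations, not part of the commands):

```
> show failover state                   # role and last failure reason per unit
> show failover                         # detailed failover and monitored-interface status
> debug fover sync                      # watch config/state replication in real time
> configure high-availability suspend   # temporarily take this unit out of HA
> configure high-availability resume    # put it back into HA
```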
Additional helpful information on fail-over testing/troubleshooting can be found here.
Keep an eye out for bugs. This one was a bad one: if the FTD crashed or lost power, it would come up in an unstable state, causing Active/Active behavior.
Step 4 – FTD HA NAT configuration.
Since there are two firewalls, most likely there will be two ISPs for redundancy. Mgmt interfaces are routed through the network switch to the FTD pair's inside interface IP (see diagram). This simplifies the NAT configuration in case of an ISP or FTD fail-over. The FTD Mgmt subnet is routed on the FTD back to the switch IP (.3 in this example).
Routing Mgmt interfaces through the inside makes communication traffic flow the same path for both FTDs.
(i – inside, m – Mgmt, o – outside)
Based on that, the NAT configuration is below. FTD1 has Mgmt IP x.x.x.251. It will be PATed to the Outside or Outside2 interface IP depending on the routing preference or a fail-over event. When failing over to the secondary ISP, you MAY need to run manage_procs.pl to reset sftunnel on FTD1. With this setup we should always have the FMC-to-FTD management tunnel as long as there is ISP connectivity.
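I build the NAT rules in FMC, but as an illustration, the FTD1 Mgmt PAT would look roughly like this in the resulting FTD running config. Object names and the 198.18.x address are mine, not from the original screenshots:

```
object network obj-ftd1-mgmt
 host 198.18.1.251
!
! FTD1 Mgmt PATed behind whichever outside interface routing prefers
nat (inside,outside) source dynamic obj-ftd1-mgmt interface
nat (inside,outside2) source dynamic obj-ftd1-mgmt interface
```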
Keep in mind both firewalls, FTD1 and FTD2, are registered to FMC at the same time and need sftunnel to be active. While the public IP is assigned directly to the Mgmt interface this is not an issue, but once it switches to the Mgmt VLAN behind NAT/PAT, FTD2 needs to communicate with FMC somehow. There are a couple of ways to solve this:
- VPN tunnel to the data-center (FMC location). Once FTD2 can communicate privately over VPN it will reestablish sftunnel.
- One-to-one NAT for the FTD2 Mgmt IP (x.x.x.252 in my case). Take into consideration NAT to the secondary ISP.
- In case of a complete FTD1 failure and no VPN tunnel, connect to the outside of FTD2 and re-IP Mgmt to .251 to reuse the configured PAT.
Worst case, based on the above configuration, FTD2's sftunnel over the VPN tunnel will still be active even after losing FTD1 and the primary ISP.
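For reference, resetting sftunnel is done from FTD expert mode. manage_procs.pl is menu-driven and the option numbers vary by version, so pick the option that restarts the communication channel:

```
> expert
admin@FTD:~$ sudo su -
root@FTD:~# manage_procs.pl
# from the menu, choose the option that restarts the Comm's channel (sftunnel)
```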
With a redundant ISP setup, it is recommended to adjust the Floating Connection Timeout. When a better route becomes available, this timeout closes existing connections so they are reestablished over the better route. The default is 0 (connections never time out).
Select Devices > Platform Settings > Edit an FTD policy > Select Timeouts.
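After deploying, the change can be verified from the FTD CLI; it shows up as the timeout floating-conn line (0:00:30 is the smallest non-zero value the setting accepts):

```
> show running-config all timeout | include floating-conn
```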
Steps 5, 6, and 7 are no different from the original post.
At this point, the FTD pair should have a complete configuration with staging public IP information to perform NAT/VPN/Access Policy testing. I have not covered VPN tunnel setup with redundant ISPs yet; FMC 7.0 has built-in features to make that process easier and more intuitive. Once I have validated and tested the configuration, I'll share it in a separate post.
Failed FTD recovery.
I actually had to perform this remotely with a single console into the failed FTD2. If FTD1 were to fail, the same steps would still apply, since sftunnel will function over the VPN tunnel.
These are the steps that worked for me.
- Shut down the switch port interface connecting the outside interface of the failed unit to avoid IP conflicts
- Replace failed unit and get a console connection to the new unit
- Update the switch port VLAN to extend the ISP VLAN to the Mgmt interface of the new unit
- Do a factory reset on the new unit, run Setup, and assign the outside interface public IP to the Mgmt interface in the CLI
- Upgrade the new unit to the same major version as the Active unit
- Take notes on the existing HA setup, then break HA in FMC and delete the failed unit
- Join new unit to FMC, patch/update new unit to match Active one from FMC
- Set all interfaces on the new unit to match the Active unit and deploy
- Create the HA pair and deploy. Keep Mgmt on the public IP.
- Add standby IPs to HA pair and deploy.
- Change the Mgmt interface back to the original private IP in the CLI, change the Mgmt switch port to the Mgmt VLAN, un-shut the switch port going to the new unit's outside interface, reset sftunnel on the new unit, and do a test deploy to confirm connectivity
- HA state should clear, and sftunnel should be up on both units (assuming you have a VPN tunnel to the FMC location).
- Test fail-over functionality
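Condensed, the CLI portion of the recovery on the replacement unit boils down to something like this. All addresses, the key, and the NAT ID are placeholders, and the # lines are annotations, not commands:

```
# temporary: put the outside public IP on Mgmt so FMC is reachable over the ISP
> configure network ipv4 manual 203.0.113.252 255.255.255.248 203.0.113.249
> configure manager add fmc.example.com MySharedKey MyNatId

# after HA is rebuilt and deploys succeed: move Mgmt back to the private IP
> configure network ipv4 manual 198.18.1.252 255.255.255.0 198.18.1.3
```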