NFS high availability
Objective
Ensure high availability of an NFS service.
Constraints
This note does not explain how to synchronize data between both NFS servers.
To preserve file system integrity, the NFS servers are used in Active/Passive mode only.
Complexity
5
Versions
v4.2.10 and later & v5.5 and later
ALOHA load balancer
Before starting
Please read the documents listed below:
Synopsis
Web servers use an NFS share to access the data they deliver to clients. The data hosted by the NFS servers is accessed through the ALOHA load balancer.
To avoid limiting the performance of the NFS service, we use layer 4 load balancing in gateway mode (also known as DSR: Direct Server Return).
In this mode, the return traffic from the NFS server to the web server does not pass through the ALOHA.
Diagram
The diagram below shows the flows for this architecture:
The client (here, the web server) passes through the ALOHA to reach the NFS service. The NFS server replies directly to the client, bypassing the ALOHA load balancer.
Configuration
NFS service
Since NFS can use random ports, the configuration must be done in three steps:
- Matching and routing NFS network flows
- Load-Balancing
- LVS service tuning to speed up failover
Flow manager
- Click on the GUI Flow tab
- Add the lines below:
flow nfs director nfs
    match iface eth0 dst 192.168.10.50
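The flow rule above matches on the incoming interface and the destination address only, which is why it copes with NFS using random ports. A minimal Python sketch of that matching logic follows; the Packet type and the matches_nfs_flow function are hypothetical names used for illustration, and this is a model of the rule, not the ALOHA's implementation.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of an incoming packet."""
    iface: str   # interface the packet arrived on
    dst: str     # destination IP address
    dport: int   # destination port (deliberately ignored by the match)


def matches_nfs_flow(pkt: Packet, iface: str = "eth0",
                     dst: str = "192.168.10.50") -> bool:
    """Return True when the packet arrives on iface for dst, whatever its port."""
    return pkt.iface == iface and ip_address(pkt.dst) == ip_address(dst)
```

With this model, a packet to 192.168.10.50 on eth0 matches whether it targets port 2049 or any other port, while traffic to another address or arriving on another interface does not.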
Layer 4 load-balancing
- Click on the GUI LB layer 4 tab
- Add the lines below:
director nfs
    balance roundrobin
    mode gateway
    check interval 10 port 2049 timeout 2
    option tcpcheck
    server nfs-01 192.168.10.100:2049 weight 10 check
    server nfs-02 192.168.10.101:2049 sorry
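To make the Active/Passive behaviour of this director concrete, here is a minimal Python sketch of a TCP check on port 2049 with a 2-second timeout, followed by the choice between the checked server and the sorry server. The function names tcp_check and pick_server are hypothetical, and the sketch only approximates what the configuration describes; it is not the ALOHA's health-check code.

```python
import socket
from typing import Callable


def tcp_check(host: str, port: int = 2049, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return False


def pick_server(primary: str, sorry: str,
                check: Callable[[str], bool] = tcp_check) -> str:
    """Use the primary while its check passes, otherwise fall back to the sorry server.

    As in the director above, only the primary is checked.
    """
    return primary if check(primary) else sorry


# Example call with the addresses from this note (performs a real TCP connect):
# pick_server("192.168.10.100", "192.168.10.101")
```

Running the check every 10 seconds, as set by the check interval in the director, would be left to the caller in this sketch.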
LVS service tuning
When a failover occurs, the ALOHA automatically redirects new connections only.
Established connections are still sent to the server that was handling them, which is now unavailable.
An NFS client can wait up to 15 minutes before opening a new connection, so the web server cannot deliver any data during that time.
To speed up convergence, the ALOHA can be configured to send a TCP RST packet to the client, which makes the client open a new connection. To do this, enable two sysctls in the LVS service configuration.
- Click on the GUI Services tab
- Click on the LVS service Edit icon and add the lines below:
sysctl expire_nodest_conn=1
sysctl expire_quiescent_template=1
- Save and restart the LVS service
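After the restart, it can be useful to confirm that both values are in effect. The Python sketch below reads them from /proc/sys/net/ipv4/vs, the standard Linux location for IPVS sysctls. Whether the ALOHA exposes that path to a shell and can run Python is an assumption, and the function names are hypothetical.

```python
from pathlib import Path

# Standard Linux location of the IPVS sysctls.
# Assumption: this path is reachable on the host where the check runs.
IPVS_SYSCTL_DIR = Path("/proc/sys/net/ipv4/vs")

# Values set in the LVS service configuration of this note.
EXPECTED = {"expire_nodest_conn": "1", "expire_quiescent_template": "1"}


def read_ipvs_sysctls(names, base=IPVS_SYSCTL_DIR):
    """Return {name: value} for each sysctl, or None when it cannot be read."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


def mismatches(expected=EXPECTED, base=IPVS_SYSCTL_DIR):
    """Return the sysctls whose current value differs from the expected one."""
    current = read_ipvs_sysctls(expected, base)
    return {name: value for name, value in current.items()
            if value != expected[name]}
```

An empty dictionary returned by mismatches() means both sysctls hold the expected value; any entry in the result names a sysctl that is unset, unreadable, or different.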