Setting up LVS (Linux Virtual Server) for use with dCap

Newer Linux kernels (as well as our UltraLight kernels) have LVS built in. See http://kb.linuxvirtualserver.org/wiki/Main_Page for LVS knowledge-base information and http://kb.linuxvirtualserver.org/wiki/FAQ for the FAQ.

We currently run dcap doors on every pool node of our dCache installation (three per node, on ports 22136, 22137 and 22125). The selection of a dcap door is controlled by a dcache.conf file stored in the /pnfs namespace (see section 6 of http://trac.dcache.org/projects/dcache/wiki/ChimeraSetup for details).

We would prefer a more intelligent way of selecting a door for dcap/dccp access, so we tried LVS.

Setup Primary LVS Server with Virtual IP

We first need to assign a virtual IP that clients can specify to access the service. I set up dcap0.aglt2.org at 192.41.231.200 for this test.

The mini-howto at http://kb.linuxvirtualserver.org/wiki/Mini_Mini_Howto and man ipvsadm have information on the setup.

Using 192.41.231.200, I first set up head02.aglt2.org as the LVS server. I created a simple shell script to do this (also stored in /afs/atlas.umich.edu/hardware/LVS):

#!/bin/bash
#
# Setup IPVS for dcap0.aglt2.org virtual IP
#
#######################

# Define virtual ip (vip) to use
vip=192.41.231.200

# Enable virtual IP on NIC alias
ifconfig bond0.4001:0 down
ifconfig bond0.4001:0 $vip netmask 255.255.255.255 broadcast $vip up

# Clear existing IPVS config
ipvsadm --clear

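# Setup service for dcap0
#   -A adds the virtual service; -s wlc selects weighted least-connection scheduling
#   -a adds a real server; -g selects direct routing; -w sets the server weight
ipvsadm -A -t $vip:22136 -s wlc
ipvsadm -a -t $vip:22136 -r 192.41.230.24:22136 -g -w 1000
ipvsadm -a -t $vip:22136 -r 192.41.230.27:22136 -g -w 500
ipvsadm -a -t $vip:22136 -r 192.41.230.33:22136 -g -w 800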

This script brings up a new IP alias on our bond0.4001 "NIC". Connections to the virtual IP we specified will be balanced by "weighted least-connection" (wlc) scheduling across 3 real servers (UMFS04, UMFS07 and UMFS13), each with a different weight.
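As a rough guide to what these weights mean, the sketch below prints each real server's share of the total weight. It assumes that, with connections of similar duration, wlc ends up distributing connections roughly in proportion to weight; the function name weight_shares is hypothetical and not part of ipvsadm.

```shell
#!/bin/bash
# Hypothetical helper (not part of ipvsadm): print each real server's
# share of the total weight. Input: "address weight" pairs on stdin.
weight_shares() {
    awk '{ addr[NR] = $1; w[NR] = $2; total += $2 }
         END { if (total == 0) exit 1
               for (i = 1; i <= NR; i++)
                   printf "%s %.1f%%\n", addr[i], 100 * w[i] / total }'
}

# Weights taken from the ipvsadm commands in the script above.
weight_shares <<'EOF'
192.41.230.24 1000
192.41.230.27 500
192.41.230.33 800
EOF
```

For the weights above this gives about 43.5%, 21.7% and 34.8%. These are only nominal shares; the actual distribution depends on how long connections stay open on each server.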

If ipvsadm is not installed, run 'yum install ipvsadm'.

Running this script on head02 results in a new interface:

[root@head02 ~]# ifconfig bond0.4001:0
bond0.4001:0 Link encap:Ethernet  HWaddr 00:15:C5:F2:7C:B5
          inet addr:192.41.231.200  Bcast:192.41.231.200  Mask:255.255.255.255
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

Now we need to set up the real servers to respond to the requests.

Setting up Real Servers

Each real server needs to be able to respond to packets destined for the virtual IP we have chosen. To do this we want a "local" instance of the virtual IP which nodes outside the real server don't see. We use the loopback (lo) device to set this up, along with some ARP behavior changes to make it work correctly. See http://kb.linuxvirtualserver.org/wiki/ARP_Issues_in_LVS/DR_and_LVS/TUN_Clusters for more details.

I set up another script (also in AFS) to handle the real servers:

[root@umfs04 ~]# cat setup_ipvs_dcap0.sh
#!/bin/bash
#
###################

# Pick VIP
vip=192.41.231.200

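# Fix ARP on host
#   arp_ignore=1:   reply only if the target IP is configured on the incoming interface
#   arp_announce=2: always use the best local source address for ARP requests
echo 1 > /proc/sys/net/ipv4/conf/eth3/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/eth3/arp_announce
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce

# Bring up the VIP on a loopback alias
ifconfig lo:0 $vip netmask 255.255.255.255 broadcast $vip up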

This makes sure the real server will not answer ARP requests for the VIP address but will still process the requests forwarded by the LVS server. You will need to determine the correct interface to use (likely not eth3!). See http://kb.linuxvirtualserver.org/wiki/Using_arp_announce/arp_ignore_to_disable_ARP for details on the arp_ignore and arp_announce settings.
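The echo commands in the script do not survive a reboot. One possible way to keep the settings, sketched here and not something the original setup did, is to generate matching sysctl.conf lines for whichever interface applies on a given host. The function name arp_sysctl_lines is hypothetical, and the sketch assumes the interface name contains no dot (VLAN names such as bond0.4001 would need different key handling).

```shell
#!/bin/bash
# Hypothetical helper: print sysctl.conf lines equivalent to the echo
# commands in the real-server script, for the interface given as $1.
# Assumption: the interface name contains no "." character.
arp_sysctl_lines() {
    local iface="$1"
    local scope
    for scope in "$iface" all; do
        echo "net.ipv4.conf.${scope}.arp_ignore = 1"
        echo "net.ipv4.conf.${scope}.arp_announce = 2"
    done
}

# Example: print the lines for eth3 so they can be reviewed first.
arp_sysctl_lines eth3
```

After review, the printed lines could be appended to /etc/sysctl.conf and loaded with 'sysctl -p'.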

The result on one of the real servers is a new interface:
[root@umfs04 ~]# ifconfig lo:0
lo:0      Link encap:Local Loopback
          inet addr:192.41.231.200  Mask:255.255.255.255
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

Testing LVS for dCache dcap

To test this I just ran 'dccp' for a test file already in our /pnfs/aglt2.org/atlashotdisk/ area:

dccp -d19 dcap://192.41.231.200:22136/pnfs/aglt2.org/atlashotdisk/shawn.test2 /tmp/shawn.test7
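To generate enough connections for the balancing to show up, the transfer can be repeated in a loop. The sketch below only prints the commands (a dry run) so they can be inspected first. The function name dccp_test_commands and the /tmp/shawn.lvstest.N output names are hypothetical, and the -d19 debug flag is left out to keep the output short.

```shell
#!/bin/bash
# Hypothetical helper: print N dccp commands against the VIP, each
# writing to a distinct (made-up) file name under /tmp.
dccp_test_commands() {
    local count="$1"
    local src="dcap://192.41.231.200:22136/pnfs/aglt2.org/atlashotdisk/shawn.test2"
    local i
    for (( i = 1; i <= count; i++ )); do
        echo "dccp $src /tmp/shawn.lvstest.$i"
    done
}

# Dry run: show three commands. Pipe the output to "sh" to execute them.
dccp_test_commands 3
```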

You can use the ipvsadm -L --stats command to see the resulting statistics (after running the command above many times):

[root@head02 ~]# ipvsadm -L --stats
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  dcap0.aglt2.org:22136              14      182        0    13916        0
  -> umfs13.aglt2.org:22136              5       65        0     4970        0
  -> umfs07.aglt2.org:22136              3       39        0     2982        0
  -> umfs04.aglt2.org:22136              6       78        0     5964        0

The weights for each real server should be adjusted to match their relative "power" for serving requests.
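When tuning weights it helps to see how connections were actually distributed. The sketch below reads 'ipvsadm -L --stats' output on stdin and prints each real server's connection count and share of the total. The function name conn_shares is hypothetical, and the parsing assumes the column layout shown above with plain integer counts (large counters may be abbreviated unless ipvsadm is asked for exact values).

```shell
#!/bin/bash
# Hypothetical helper: read "ipvsadm -L --stats" output on stdin and print
# "server connections share" for every real-server ("->") line.
# Assumption: the Conns column (field 3) is a plain integer.
conn_shares() {
    awk '$1 == "->" && $3 ~ /^[0-9]+$/ { name[++n] = $2; c[n] = $3; total += $3 }
         END { if (total == 0) exit 1
               for (i = 1; i <= n; i++)
                   printf "%s %d %.1f%%\n", name[i], c[i], 100 * c[i] / total }'
}

# Usage on the LVS server (requires root):
#   ipvsadm -L --stats | conn_shares
```

Applied to the sample output above, this reports 35.7% for umfs13, 21.4% for umfs07 and 42.9% for umfs04, which is close to the 800:500:1000 weight ratio even with only 14 connections.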

NOTE: One thing I noticed is that we have far too many dcap doors set up for AGLT2. For the upcoming dCache upgrade I will reduce this to 1 dcap door per pool node (rather than 3). The doors are only used for the control channel; the pool holding the file responds on the data channel for the client request. I also still need to verify that LVS works with nodes at MSU, which are on a different subnet (192.41.236.0/23 instead of 192.41.230.0/23).

Using an LVS-serviced IP we can replace the dcache.conf contents with a single (VIP) entry like:

dcap0.aglt2.org:22136

This replaces the existing (long) list of all possible dcap doors at AGLT2. We can then use LVS on head02 (or anywhere else we want to put it) to handle dcap requests and load-balance them across the real servers.

-- ShawnMcKee - 02 Dec 2010
Topic revision: r1 - 02 Dec 2010, ShawnMcKee