Deep Dive

By Marek

High Availability L2TP LNS Steering with FreeRADIUS and ExaBGP

We are in the final stages of onboarding with Zen Internet for wholesale L2TP services. Yesterday their technical delivery team moved on to testing our RADIUS integration. We were a little nervous that this would not work first time, as most L2TP wholesale providers simply expect their customers to interoperate perfectly by returning the appropriate RADIUS responses. The tests were successful, thanks to plenty of careful preparation.

Given our extensive experience using FreeRADIUS, we were fairly confident it would be part of our solution, so we started there. FreeRADIUS powers the access control on our own management networks, and authenticates multi-network 3G/4G wholesale SIMs onto the partner mobile carrier networks for Key IoT. We just needed to work out how to make it sing the right tune for L2TP. We didn’t have a large vendor’s stack to fall back on, so we started from the RFCs and other implementations’ documentation, hoping that this would steer us (excuse the pun) towards the right configuration.

Our requirements when designing our L2TP tunnel-steering RADIUS configuration were as follows:

  • respond to RADIUS requests for multiple realms
  • steer realms to be L2TP tunneled to one or more appropriate LNSs
  • per-realm and/or per-LNS tunnel secrets to support multi-tenancy
  • load-balance incoming sessions across the realm’s LNSs
  • automated deployment of nodes’ configurations using SaltStack
  • use BGP to advertise RADIUS server IPs to the wholesale/carrier networks
  • nodes should anycast their RADIUS service IP addresses into our IGP
  • anycast addresses should only be announced if the service is running; withdraw if the RADIUS serving process is unresponsive

Salt Pillar Data

We define all the realms, LNSs and L2TP tunnel secrets in pillar data so that it’s available across all the relevant minions. The structure we use is like this (shown below in Salt’s standard SLS/YAML format):

direct:                     # pillar data for XDSL realms and their LNS steering

  realms:                   # mapping from realm names (user@realm) to realm data
    realm1.example.com:
      lns:                  # mapping of LNS IP address to LNS L2TP shared secret
        192.0.2.1: hunter2

    realm2.example.com:
      lns:                  # steering to randomly-ordered set of multiple LNSs
        192.0.2.2: hunter2
        192.0.2.3: hunter2
        192.0.2.4: hunter2
        192.0.2.5: hunter2

    healthcheck:            # fake realm for automated healthchecks
      lns:
        127.0.0.1: success

  redirects:                                    # user goes to specified realm's LNSs
    test@other.example.com: realm1.example.com

Based on our experience of helping a customer migrate their infrastructure, we added functionality to redirect an individual user’s sessions to another realm’s LNSs. This can help customers during a realm swing, for example, or it can be used to send a problematic session to a debugging LNS to determine why that session is failing to establish.
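The lookup order described above can be sketched in Python. This is only an illustration of the logic, not part of our deployment: the function name resolve_lns and the two dictionaries are assumptions that mirror the shape of the pillar data, and the real decision is made inside FreeRADIUS.

```python
import re

# Illustrative data mirroring the pillar structure (not read from Salt).
REALMS = {
    "realm1.example.com": {"lns": {"192.0.2.1": "hunter2"}},
    "realm2.example.com": {"lns": {"192.0.2.2": "hunter2",
                                   "192.0.2.3": "hunter2"}},
}
REDIRECTS = {"test@other.example.com": "realm1.example.com"}

# Same pattern as the site configuration uses to split user@realm.
USER_RE = re.compile(r"^(.+)@(.+)")


def resolve_lns(username, realms=REALMS, redirects=REDIRECTS):
    """Return the (lns_ip, secret) pairs for a user, or None if unknown."""
    match = USER_RE.match(username)
    realm = match.group(2) if match else None
    # A per-user redirect overrides the realm taken from the username.
    realm = redirects.get(username, realm)
    realmdata = realms.get(realm)
    if realmdata is None:
        return None
    return list(realmdata["lns"].items())
```

A redirected user resolves to the destination realm’s LNSs even though the realm in the username is different, and an unknown realm yields no tunnel attributes at all.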

FreeRADIUS Site

This Jinja template is deployed with a file.managed state by Salt as an available site for FreeRADIUS.

{% set direct = salt['defaults.merge'](salt['pillar.get']('direct',{}), salt['grains.get']('direct',{})) %}

server direct {
    {# elided "listen" and "client" sections #}

    authorize {
        if ("%{request:User-Name}" =~ /^(.+)@(.+)/) {
            update {
                request:Realm := "%{2}"
            }
        }

        {% if direct.get('redirects',{}) %}
            switch &request:User-Name {
                {% for redirect, destination in direct.get('redirects',{}).items() %}
                    case "{{ redirect }}" {
                        update {
                            request:Realm := "{{ destination }}"
                        }
                    }
                {% endfor %}
            }
        {% endif %}

        switch &request:Realm {
            {% for realm, realmdata in direct.get('realms',{}).items() %}
                case "{{ realm }}" {
                    {% set lns = realmdata.get('lns',{}).items()|list %}
                    {% for offset in range(lns|length) %}
                        {% if loop.first and not loop.last %}
                            switch "%{rand: {{ lns|length }}}" {
                        {% endif %}
                        {% if not ( loop.first and loop.last ) %}
                                case "{{ offset }}" {
                        {% endif %}
                                    update {
                                        {% for i in range(lns|length) %}
                                            {% set j = (i+offset) % (lns|length) %}
                                            reply:Tunnel-Server-Endpoint:{{i}} = "{{ lns[j][0] }}"
                                            reply:Tunnel-Password:{{i}} = "{{ lns[j][1] }}"
                                            reply:Tunnel-Type:{{i}} = L2TP
                                            reply:Tunnel-Medium-Type:{{i}} = IP
                                        {% endfor %}
                                        control:Auth-Type = "Accept"
                                    }
                        {% if not ( loop.first and loop.last ) %}
                                }
                        {% endif %}
                        {% if loop.last and not loop.first %}
                            }
                        {% endif %}
                    {% endfor %}
                    ok
                }
            {% endfor %}
        }
    }
}

The above template generates a fairly verbose configuration file (most of the bulk comes from the load-balancing cases), but the result is functionally similar to this:

server direct {
    client healthcheck {
            ipv4addr = 127.0.0.1/32
            shortname = healthcheck
            secret = healthcheck
    }
    # client wholesale { ... etc ... }

    authorize {
        if ("%{request:User-Name}" =~ /^(.+)@(.+)/) {
            update {
                request:Realm := "%{2}"
            }
        }

        switch &request:User-Name {
            case "test@other.example.com" {
                update {
                    request:Realm := "realm1.example.com"
                }
            }
        }

        switch &request:Realm {
            case "healthcheck" {
                update {
                    reply:Tunnel-Server-Endpoint:0 = "127.0.0.1"
                    reply:Tunnel-Password:0 = "success"
                    reply:Tunnel-Type:0 = L2TP
                    reply:Tunnel-Medium-Type:0 = IP
                    control:Auth-Type = "Accept"
                }
                ok
            }

            case "realm1.example.com" {
                update {
                    reply:Tunnel-Server-Endpoint:0 = "192.0.2.1"
                    reply:Tunnel-Password:0 = "hunter2"
                    reply:Tunnel-Type:0 = L2TP
                    reply:Tunnel-Medium-Type:0 = IP
                    control:Auth-Type = "Accept"
                }
                ok
            }

            case "realm2.example.com" {
                update {
                    reply:Tunnel-Server-Endpoint:0 = "192.0.2.2"
                    reply:Tunnel-Password:0 = "hunter2"
                    reply:Tunnel-Type:0 = L2TP
                    reply:Tunnel-Medium-Type:0 = IP

                    reply:Tunnel-Server-Endpoint:1 = "192.0.2.3"
                    reply:Tunnel-Password:1 = "hunter2"
                    reply:Tunnel-Type:1 = L2TP
                    reply:Tunnel-Medium-Type:1 = IP

                    # reply:Tunnel-Server-Endpoint:2 = "192.0.2.4" ... etc ...

                    control:Auth-Type = "Accept"
                }
                ok
            }
        }
    }
}
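The load-balancing in the template works by generating every rotation of a realm’s LNS list and picking one at random per request. A minimal Python sketch of that idea follows; the function names rotations and tunnel_attributes are hypothetical and exist only for this illustration, with the attribute names taken from the configuration above.

```python
def rotations(lns):
    """Return every rotation of lns; rotation k starts at lns[k]."""
    n = len(lns)
    return [[lns[(i + offset) % n] for i in range(n)] for offset in range(n)]


def tunnel_attributes(ordered):
    """Render one rotation as tagged reply attributes.

    The tag groups the attributes that belong to the same tunnel.
    """
    attrs = []
    for tag, (ip, secret) in enumerate(ordered):
        attrs.append((f"Tunnel-Server-Endpoint:{tag}", ip))
        attrs.append((f"Tunnel-Password:{tag}", secret))
        attrs.append((f"Tunnel-Type:{tag}", "L2TP"))
        attrs.append((f"Tunnel-Medium-Type:{tag}", "IP"))
    return attrs
```

Choosing with random.choice(rotations(lns)) plays the role of the %{rand:...} switch: every LNS is still listed in each reply, so the remaining endpoints stay available as alternatives, but the first entry varies between sessions.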

ExaBGP Health Check

To periodically check that FreeRADIUS is running and announce or withdraw the relevant loopback addresses via BGP we created a small shell script. This can be installed as /usr/local/sbin/radius-steering-healthcheck for example.

We set the addrs variable to the list of global-scope (non-loopback) IPv4 addresses configured on the lo interface. These are the node’s /32 address(es) to advertise into BGP while FreeRADIUS is running correctly.

The username can be anything@healthcheck, so long as the realm and the RADIUS client secret match those configured for the health check.

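#!/bin/bash

# Global-scope IPv4 addresses on lo: the service addresses to advertise.
addrs=$(ip -4 addr show dev lo scope global | awk '/inet/ { print $2 }')

while true
do
    # radtest <user> <password> <server> <nas-port> <secret>
    if radtest "healthcheck@healthcheck" "healthcheck" "127.0.0.1" "1" "healthcheck" > /dev/null
    then
        # FreeRADIUS answered: (re-)announce every service address.
        for addr in $addrs
        do
            echo "announce route $addr next-hop self"
        done
    else
        # No successful reply: withdraw so traffic moves to another node.
        for addr in $addrs
        do
            echo "withdraw route $addr next-hop self"
        done
    fi

    sleep 15
done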

To use the healthcheck with ExaBGP 4.0 we added the following section to exabgp.conf:

process healthcheck {
    run "/usr/local/sbin/radius-steering-healthcheck";
    encoder text;
}
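For readers who prefer to see the health check logic without the shell plumbing, here is a small Python sketch of its two pure steps: extracting the addresses from the ip output, and turning a health result into the text commands written to ExaBGP. The function names and the sample address 192.0.2.53/32 are assumptions made for this example; our deployed check is the shell script above.

```python
def global_addrs(ip_output):
    """Extract the address/prefix field from `ip -4 addr show` output."""
    addrs = []
    for line in ip_output.splitlines():
        fields = line.split()
        # Address lines look like: "inet 192.0.2.53/32 scope global lo"
        if len(fields) >= 2 and fields[0] == "inet":
            addrs.append(fields[1])
    return addrs


def exabgp_commands(healthy, addrs):
    """Build the announce or withdraw lines for every service address."""
    verb = "announce" if healthy else "withdraw"
    return [f"{verb} route {addr} next-hop self" for addr in addrs]


# Hypothetical sample of what `ip -4 addr show dev lo scope global` prints.
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 192.0.2.53/32 scope global lo
       valid_lft forever preferred_lft forever
"""

for command in exabgp_commands(True, global_addrs(SAMPLE)):
    print(command)
```

Keeping the decision in a function that takes a boolean makes it easy to swap the probe (radtest in our case) for another check without touching the announce and withdraw handling.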


Bonus Content: Virtual LNSs with VyOS under Xen

While performing our own tests, which included testing with virtual LNSs running as virtualised network functions, we found a problem setting the MTU on interfaces in VyOS 1.3. Another user had posted on the VyOS forum with the same problem eight months earlier. Just like them, our virtualisation stack is running Xen.

What puzzled us is that on a standard Debian Linux virtual machine running on the same NFV cluster we had no problems setting the MTU to support jumbo frames:

root@buster:~# ls -l /sys/class/net/eth0/device/driver
lrwxrwxrwx 1 root root 0 Feb 19 05:43 /sys/class/net/eth0/device/driver -> ../../bus/xen/drivers/vif
root@buster:~# ip link set dev eth0 mtu 2000
root@buster:~# ip link set dev eth0 mtu 9000
root@buster:~# ip link set dev eth0 mtu 1500

And yet on VyOS, itself a Debian-derived distribution, we could not increase MTU beyond 1500:

vyos@equuleus:~$ ls -l /sys/class/net/eth0/device/driver
lrwxrwxrwx 1 root root 0 Feb 19 05:45 /sys/class/net/eth0/device/driver -> ../../bus/xen/drivers/vif

vyos@equuleus:~$ configure
[edit]
vyos@equuleus# set interfaces ethernet eth0 mtu 9000
[edit]
[ interfaces ethernet eth0 ]
VyOS had an issue completing a command.

We are sorry that you encountered a problem while using VyOS.
There are a few things you can do to help us (and yourself):
- Make sure you are running the latest version of the code available at
  https://downloads.vyos.io/rolling/current/amd64/vyos-rolling-latest.iso
- Consult the forum to see how to handle this issue
  https://forum.vyos.io
- Join our community on slack where our users exchange help and advice
  https://vyos.slack.com

When reporting problems, please include as much information as possible:
- do not obfuscate any data (feel free to contact us privately if your
  business policy requires it)
- and include all the information presented below

Report Time:      2021-02-19 05:46:48
Image Version:    VyOS 1.3-rolling-202012251712
Release Train:    equuleus

Built by:         autobuild@vyos.net
Built on:         Fri 25 Dec 2020 17:12 UTC
Build UUID:       debe2b8f-0337-4bbf-8d5a-6bfd80111dd3
Build Commit ID:  6bf09791f97fae

Architecture:     x86_64
Boot via:         installed image
System type:      Xen PV guest

Hardware vendor:  Unknown
Hardware model:   Unknown
Hardware S/N:     Unknown
Hardware UUID:    Unknown

OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 102, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 94, in apply
    e.update(ethernet)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 298, in update
    super().update(config)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/interface.py", line 1264, in update
    self.set_mtu(config.get('mtu'))
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/interface.py", line 361, in set_mtu
    return self.set_interface('mtu', mtu)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 182, in set_interface
    return self._set_sysfs(self.config, name, value)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 166, in _set_sysfs
    self._sysfs_set[name]['location'].format(**config), value)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 132, in _write_sysfs
    f.write(str(value))
OSError: [Errno 22] Invalid argument



[[interfaces ethernet eth0]] failed
Commit failed

The problem did not seem to be one caused directly by VyOS, as we could not use the standard Linux tools to make the same change:

vyos@equuleus:~$ sudo ip link set dev eth0 mtu 9000
RTNETLINK answers: Invalid argument
vyos@equuleus:~$ sudo ifconfig eth0 mtu 9000
SIOCSIFMTU: Invalid argument
vyos@equuleus:~$ echo 9000 | sudo tee /sys/class/net/eth0/mtu > /dev/null
tee: /sys/class/net/eth0/mtu: Invalid argument

Delving into the Linux kernel we found that Xen’s xen-netback driver has a particular quirk: if “scatter-gather” offloading (can_sg) is disabled then xenvif_change_mtu will limit the maximum MTU to 1500. Checking a VyOS virtual LNS we could see that, by default, this offloading is disabled:

vyos@equuleus:~$ sudo ethtool -k eth0
Features for eth0:
rx-checksumming: on [fixed]
tx-checksumming: on
  tx-checksum-ipv4: on [fixed]
  tx-checksum-ip-generic: off [fixed]
  tx-checksum-ipv6: on
  tx-checksum-fcoe-crc: off [fixed]
  tx-checksum-sctp: off [fixed]
scatter-gather: off
  tx-scatter-gather: off
  tx-scatter-gather-fraglist: off [fixed]

This was easily fixed by enabling sg offload in VyOS as follows:

set interfaces ethernet eth0 offload sg
set interfaces ethernet eth0 mtu 9000
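To spot affected guests before attempting an MTU change, the ethtool -k output shown earlier can be parsed for the scatter-gather state. The following Python sketch does that; the function names are hypothetical, and the sample text is an excerpt of the output from our VyOS guest.

```python
def offload_features(ethtool_output):
    """Parse `ethtool -k` output into a {feature_name: enabled} mapping."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, state = line.partition(":")
        if not sep:
            continue
        words = state.split()
        # Skip the "Features for eth0:" header, which has no on/off state.
        if words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


def sg_enabled(ethtool_output):
    """True if scatter-gather offload is reported as on."""
    return offload_features(ethtool_output).get("scatter-gather", False)


SAMPLE = """\
Features for eth0:
rx-checksumming: on [fixed]
tx-checksumming: on
scatter-gather: off
  tx-scatter-gather: off
  tx-scatter-gather-fraglist: off [fixed]
"""

if not sg_enabled(SAMPLE):
    print("scatter-gather is off; enable it before raising the MTU")
```

On a live system the text would come from running ethtool -k against the interface; the parsing itself does not depend on Xen or VyOS.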

We reported our findings to the VyOS project in a bug report, and also followed up on the forum thread from the other user who had hit the same problem in a similar environment.