March 5, 2022

Network juju on the fly: migrating a corporate network to VPN-based connectivity

So, this week mother nature threw South East Queensland a curve-ball like none of us have seen in over a decade: a massive flood. My workplace, VRT Systems / WideSky.Cloud Pty Ltd, resides at 38b Douglas Street, Milton, which is a low-lying area not far from the Brisbane River. Sister company CETA is just two doors down at no. 40. In mid-February, a rain depression developed in the Sunshine Coast Hinterland / Wide Bay area north of Brisbane.

That weather system crawled all the way south, hitting Brisbane with constant heavy rain for 5 days straight… eventually creeping down to the Gold Coast and over the border to the Northern Rivers part of NSW.

The result on our offices was devastating. (Copyright notice: these images are placed here for non-commercial use with permission of the original photographers… contact me if you wish to use these photos and I can forward your request on.)

Some of the stock still worked after the flood: the Siemens UH-40s pictured were still working (bar a small handful) and normally sell for high triple-figures. The WideSky Hubs and CET meters all feature a conformal coating on the PCBs that makes them robust to water ingress, and the Wavenis meters are potted, sealed units. So not all a loss, but there’s a big pile of expensive EDMI meters (Mk7s and Mk10s) that are not economic to salvage due to approval requirements, which is going to hurt!

Le Mans Motors, pictured in those photos, is an automotive workshop, so would have had lots of lubricants, oils and grease in stock for servicing vehicles. Much of that was now across the street, so washing the stuff off the surviving stock was the order of the day for much of Wednesday, before demolition day on Friday.

As for the server equipment, knowing that this was a flood-prone area (which also by the way means insurance is non-existent), we deliberately put our server room on the first floor, well above the known flood marks of 1974 and 2011. This flood didn’t get that high, getting to about chest-height on the ground floor. Thus, aside from some desktops, laptops, a workshop (including a $7000 oscilloscope belonging to an employee), a new coffee machine (that hurt the coffee drinkers), and lots of furniture/fittings, most of the IT equipment came through unscathed. The servers “had the opportunity to run without the burden of electricity”.

We needed our stuff working, so we needed to first rescue the machines from the waterlogged building and set them up elsewhere. Elsewhere wound up being at the homes of some of our staff with beefy NBN Internet connections. Okay, not beefy compared to the 500Mbps symmetric microwave link we had, but 50Mbps uplinks were not to be snorted at in this time of need.

The initial plan was that the machines that once shared an Ethernet switch would now be in physically separate locations, but we still needed everything to look like the old network. We also didn’t want to run more VPN tunnels than necessary. Enter OpenVPN L2 mode.

Establishing the VPN server

By this point, I had deployed a temporary VPN server as a VPS in a Sydney data centre. This was a plain-jane Ubuntu 20.04 box with a modest amount of disk and RAM, but hopefully decent CPU grunt for the amount of cryptographic operations it was about to do.

Most of our customer sites used OpenVPN tunnels, so I migrated those first. I had managed to grab a copy of the running server config as the waters rose, before the power tripped out. I copied that config over to the new server, started up OpenVPN, opened a UDP port to the world, then fiddled DNS to point the clients to the new box. They soon joined.

Connecting staff

Next problem was getting the staff linked. Originally we used a rather aging Cisco router with its VPN client (or vpnc on Linux/BSD), but I didn’t feel like experimenting with an IPsec server to replicate that, so up came a second OpenVPN instance, on a new subnet. I got the Engineering team to run the following command to generate a certificate signing request (CSR):

openssl req -newkey rsa:4096 -nodes -keyout <name>.key -out <name>.req

They sent me their .req files, and I used EasyRSA v3 to manage a quickly-slapped-together CA to sign the requests. Downloading them via Slack required that I fish them out of the place where Slack decided to put them (without asking me) and place them in the correct directory. Sometimes I had to rename a file too (Slack doesn’t ask what you want to call it either) so that it had a .req extension. Having imported the request, I could sign it.

$ mv ~/Downloads/theclient.req pki/reqs/
$ ./easyrsa sign-req client theclient

A new file pki/issued/theclient.crt could then be sent back to the user. I also provided them with pki/ca.crt and a configuration file derived from the example configuration files. (My example came from OpenBSD’s OpenVPN package.)
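The fish-it-out-and-rename step can be scripted. The following is a minimal sketch, not what was actually run: the function name import_request and the example download filename are hypothetical, and it assumes the default EasyRSA 3 layout where sign-req looks for pki/reqs/<name>.req. (EasyRSA 3 also ships an import-req subcommand that does a similar job.)

```python
import shutil
import tempfile
from pathlib import Path


def import_request(downloaded: Path, pki_dir: Path, client_name: str) -> Path:
    """Move a downloaded CSR to pki/reqs/<client_name>.req.

    Hypothetical helper: it only moves and renames the file so that
    `./easyrsa sign-req client <client_name>` can find it.
    """
    reqs_dir = pki_dir / "reqs"
    reqs_dir.mkdir(parents=True, exist_ok=True)
    target = reqs_dir / f"{client_name}.req"
    if target.exists():
        # Never silently clobber a request that is already queued.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(downloaded), str(target))
    return target


if __name__ == "__main__":
    # Demonstration in a throwaway directory, with a made-up download name.
    with tempfile.TemporaryDirectory() as tmp:
        download = Path(tmp) / "theclient (1).txt"
        download.write_text("dummy CSR contents\n")
        print(import_request(download, Path(tmp) / "pki", "theclient"))
```

After this, ./easyrsa sign-req client theclient proceeds as shown above.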

They were then able to connect, and see all the customer site VPNs, so could do remote support. Great. So far so good. Now the servers.

Server connection VPN

For this, a third OpenVPN daemon was deployed on another port, but this time in L2 mode (dev tap), not L3 mode. In addition, I had servers on two different VLANs and didn’t want to deploy yet more VPN servers and clients, so I decided to try tunnelling 802.1Q. This required boosting the MTU from the default of 1500 to 1518 to accommodate the 802.1Q VLAN tag.
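For reference, here is the frame-size arithmetic as a small sketch. The 802.1Q tag itself is 4 bytes; 1518 is the size of a complete tagged Ethernet frame carrying a 1500-byte payload, not counting the frame check sequence, which is one way to arrive at the figure used here. How OpenVPN accounts for the Ethernet header on a tap device is not covered in this post, so treat this as an illustration of where the numbers come from rather than a statement about OpenVPN internals.

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
DOT1Q_TAG = 4    # 802.1Q tag: TPID (2) + TCI (2)
IP_MTU = 1500    # payload size hosts on the LAN expect to be able to send


def frame_size(payload: int, tagged: bool) -> int:
    """Bytes in an Ethernet frame carrying `payload`, excluding the FCS."""
    return ETH_HEADER + (DOT1Q_TAG if tagged else 0) + payload


print(frame_size(IP_MTU, tagged=False))  # 1514
print(frame_size(IP_MTU, tagged=True))   # 1518
```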

The VPN server configuration looked like this:

port 1196
proto udp
dev tap
ca l2-ca.crt
cert l2-server.crt
key l2-server.key
dh data/dh4096.pem
server-bridge
client-to-client
keepalive 10 120
cipher AES-256-CBC
persist-key
persist-tun
status /etc/openvpn/l2-clients.txt
verb 3
explicit-exit-notify 1
tun-mtu 1518

In addition, we had to tell netplan to create some bridges, so we created /etc/netplan/vpn.yaml, which looked like this:

network:
    version: 2
    ethernets:
        # The VPN tunnel itself
        tap0:
            mtu: 1518
            accept-ra: false
            dhcp4: false
            dhcp6: false
    vlans:
        vlan10-phy:
            id: 10
            link: tap0
        vlan11-phy:
            id: 11
            link: tap0
        vlan12-phy:
            id: 12
            link: tap0
        vlan13-phy:
            id: 13
            link: tap0
    bridges:
        vlan10:
            interfaces:
                - vlan10-phy
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.10.1/24
                - 2001:db8:10::1/64
        vlan11:
            interfaces:
                - vlan11-phy
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.11.1/24
                - 2001:db8:11::1/64
        vlan12:
            interfaces:
                - vlan12-phy
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.12.1/24
                - 2001:db8:12::1/64
        vlan13:
            interfaces:
                - vlan13-phy
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.13.1/24
                - 2001:db8:13::1/64

Those aren’t the real VLAN IDs or IP addresses, but you get the idea. A bridge on the cloud end isn’t strictly necessary, but it does mean we can do other forms of tunnelling if needed.
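The per-VLAN stanzas are identical apart from the VLAN ID, so the file can be generated rather than typed. Below is a minimal sketch under that assumption; render_server_netplan is a hypothetical helper, and the 10.0.<id>.1 / 2001:db8:<id>::1 addressing is just the placeholder scheme from the listing above, so the function restricts itself to IDs that fit in one IPv4 octet.

```python
def render_server_netplan(vlan_ids, tap="tap0", mtu=1518):
    """Render the server-side netplan YAML for the given VLAN IDs as text."""
    for vid in vlan_ids:
        # The placeholder addressing puts the VLAN ID in an IPv4 octet.
        if not 1 <= vid <= 255:
            raise ValueError(f"VLAN ID {vid} does not fit the example addressing")
    lines = [
        "network:",
        "    version: 2",
        "    ethernets:",
        "        # The VPN tunnel itself",
        f"        {tap}:",
        f"            mtu: {mtu}",
        "            accept-ra: false",
        "            dhcp4: false",
        "            dhcp6: false",
        "    vlans:",
    ]
    for vid in vlan_ids:
        lines += [
            f"        vlan{vid}-phy:",
            f"            id: {vid}",
            f"            link: {tap}",
        ]
    lines.append("    bridges:")
    for vid in vlan_ids:
        lines += [
            f"        vlan{vid}:",
            "            interfaces:",
            f"                - vlan{vid}-phy",
            "            accept-ra: false",
            "            link-local: [ ipv6 ]",
            "            addresses:",
            f"                - 10.0.{vid}.1/24",
            f"                - 2001:db8:{vid}::1/64",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_server_netplan([10, 11, 12, 13]), end="")
```

Redirecting the output to /etc/netplan/vpn.yaml and running netplan apply would then take the place of hand-editing.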

On the clients, we did something very similar. OpenVPN client config:

client
dev tap
proto udp
remote vpn.example.com 1196
resolv-retry infinite
nobind
persist-key
persist-tun
ca l2-ca.crt
cert l2-client.crt
key l2-client.key
remote-cert-tls server
cipher AES-256-CBC
verb 3
tun-mtu 1518

and for netplan:

network:
    version: 2
    ethernets:
        tap0:
            accept-ra: false
            dhcp4: false
            dhcp6: false
    vlans:
        vlan10-eth:
            id: 10
            link: eth0
        vlan11-eth:
            id: 11
            link: eth0
        vlan12-eth:
            id: 12
            link: eth0
        vlan13-eth:
            id: 13
            link: eth0
        vlan10-vpn:
            id: 10
            link: tap0
        vlan11-vpn:
            id: 11
            link: tap0
        vlan12-vpn:
            id: 12
            link: tap0
        vlan13-vpn:
            id: 13
            link: tap0
    bridges:
        vlan10:
            interfaces:
                - vlan10-vpn
                - vlan10-eth
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.10.2/24
                - 2001:db8:10::2/64
        vlan11:
            interfaces:
                - vlan11-vpn
                - vlan11-eth
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.11.2/24
                - 2001:db8:11::2/64
        vlan12:
            interfaces:
                - vlan12-vpn
                - vlan12-eth
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.12.2/24
                - 2001:db8:12::2/64
        vlan13:
            interfaces:
                - vlan13-vpn
                - vlan13-eth
            accept-ra: false
            link-local: [ ipv6 ]
            addresses:
                - 10.0.13.2/24
                - 2001:db8:13::2/64

I also tried using a Raspberry Pi with Debian; the /etc/network/interfaces config looked like this:

auto eth0
iface eth0 inet dhcp
        mtu 1518

auto tap0
iface tap0 inet manual
        mtu 1518

auto vlan10
iface vlan10 inet static
        address 10.0.10.2
        netmask 255.255.255.0
        bridge_ports tap0.10 eth0.10
iface vlan10 inet6 static
        address 2001:db8:10::2
        netmask 64

auto vlan11
iface vlan11 inet static
        address 10.0.11.2
        netmask 255.255.255.0
        bridge_ports tap0.11 eth0.11
iface vlan11 inet6 static
        address 2001:db8:11::2
        netmask 64

auto vlan12
iface vlan12 inet static
        address 10.0.12.2
        netmask 255.255.255.0
        bridge_ports tap0.12 eth0.12
iface vlan12 inet6 static
        address 2001:db8:12::2
        netmask 64

auto vlan13
iface vlan13 inet static
        address 10.0.13.2
        netmask 255.255.255.0
        bridge_ports tap0.13 eth0.13
iface vlan13 inet6 static
        address 2001:db8:13::2
        netmask 64

Having done this, we had the ability to expand our virtual “L2” network by simply adding more clients on other home Internet connections. The bridges would allow all servers to see each other as if they were connected to the same Ethernet switch.
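One practical detail when adding clients: if each new client keeps an address on every bridge, as in the configs above, those addresses must be unique across the whole stretched network. The sketch below derives them with Python’s ipaddress module. It assumes the placeholder addressing used in this post, with the cloud server at host 1 and the first client at host 2; bridge_addresses is a hypothetical helper, not something that was deployed.

```python
import ipaddress


def bridge_addresses(vlan_id: int, host: int):
    """Return (IPv4, IPv6) interface addresses for one bridge on one machine."""
    if not 1 <= vlan_id <= 255:
        raise ValueError("the example addressing needs a VLAN ID of 1..255")
    if not 1 <= host <= 254:
        raise ValueError("host number must fit inside the /24")
    v4_net = ipaddress.ip_network(f"10.0.{vlan_id}.0/24")
    v6_net = ipaddress.ip_network(f"2001:db8:{vlan_id}::/64")
    # Indexing a network yields its n-th address; the IPv6 host part is
    # rendered in hexadecimal, so host 10 appears as ::a.
    return (
        f"{v4_net[host]}/{v4_net.prefixlen}",
        f"{v6_net[host]}/{v6_net.prefixlen}",
    )


if __name__ == "__main__":
    # Addresses for a second client (host 3) on each example VLAN.
    for vid in (10, 11, 12, 13):
        print(vid, *bridge_addresses(vid, host=3))
```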