Mar 02 2020
 

This is just a short note… today we finally made the jump across to the NBN. We’re on hybrid fibre-coaxial (HFC) here in The Gap, and right now the HFC NTD is running off mains power (getting that onto solar will be a future project).

I already had my OpenBSD 6.6 router terminating the PPPoE link over ADSL. The router itself is a PC Engines APU2, which features three Ethernet ports: em0 faces my DMZ and internal network, em1 is presently plugged into the ADSL modem/router, and em2 was spare.

Thus, my interface configuration looked like this:

# /etc/hostname.em0
inet 10.20.50.254 255.255.255.0
inet6 alias 2001:44b8:21ac:70f9::fe 64
# … and a stack of !route commands for the internal subnets

# /etc/hostname.em1
up

# /etc/hostname.pppoe0
# pppoedev em1: ADSL
# pppoedev vlan2: NBN
inet 0.0.0.0 255.255.255.255 NONE \
        pppoedev em1 authproto chap \
        authname user@example.com \
        authkey mypassword up

… and of course /etc/pf.conf was configured with appropriate rules for my network. For the NBN, I’d read that the PPPoE session had to run over VLAN 2, so I set up the following:

# /etc/hostname.em2
up

# /etc/hostname.vlan2
vnetid 2
parent em2
up

I then changed /etc/hostname.pppoe0 to point at vlan2 instead of em1. When the NBN NTD got installed, I tried this out… no dice: PADI frames were being sent, but nothing came back.
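For anyone chasing the same symptom: OpenBSD’s tcpdump understands the PPPoE discovery stage natively, so something like this (interface name as per my setup) shows whether the PADIs are going out and whether anything answers:

# watch PPPoE discovery (PADI/PADO/PADR/PADS) on the VLAN interface
tcpdump -n -e -i vlan2 pppoed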

Digging around, I found I needed to set the transmit priority on the VLAN interface, so I amended /etc/hostname.vlan2:

# /etc/hostname.vlan2
vnetid 2
parent em2
# ADD THIS ↓
txprio 1
up

Bingo! I was now seeing PPPoE traffic. However, I wasn’t out of the woods yet: nothing behind the router was able to get to the Internet. It turns out pf also needs to be told what transmit priority to use. I amended my /etc/pf.conf:

# Scrub incoming traffic
match in all scrub (no-df)

# Set pppoe0 priority
match out on $external set prio 1  # ← ADD THIS

# Block all traffic by default (paranoia)
block log all
#block all
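After editing, a reload puts the rule in place, and it’s worth confirming it landed:

# reload the ruleset, then check the prio rule is in the active set
pfctl -f /etc/pf.conf
pfctl -sr | grep prio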

That was sufficient to get traffic working. I’m now getting the following out of SpeedTest.

~25Mbps down / ~18Mbps up on Internode HFC NBN

The link is theoretically supposed to be 50Mbps… but whatever. The primary concern is that it doesn’t suddenly drop to 0Mbps when the plug gets pulled in September. I’ll check again when things are “quieter” (it’ll be peak periods now), but as far as I’m concerned, this is a matter of ensuring continued service.

It already outperforms the ADSL2+ link (which was about 15Mbps / 2Mbps). Next stop will be to port the old telephone number over, but that can wait another day!

Aug 20 2017
 

OpenNebula is running now… I ended up re-installing the VM with Ubuntu Linux and throwing OpenNebula on that.  That works… and I can debug the issue with Gentoo later.

I still have to figure out corosync/heartbeat for two VMs, the one running OpenNebula, and the core router.  For now, the VMs are only set up to run on one node, but I can configure them on the other too… it’s then a matter of configuring libvirt to not start the instances at boot, and setting up the Linux-HA tools to figure out which node gets to fire up which VM.
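The libvirt side of that is the easy bit… disabling autostart is one command per domain (domain name here is hypothetical):

# leave the start/stop decision to the HA tools rather than libvirt
virsh autostart --disable core-router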

The VM hosts are still running Gentoo however, and so far I’ve managed to get them to behave with OpenNebula.  A big part was disabling the authentication in libvirt, otherwise polkit generally made a mess of things from OpenNebula’s point of view.
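The knobs in question live in /etc/libvirt/libvirtd.conf; a minimal sketch of the idea:

# /etc/libvirt/libvirtd.conf
# let local socket connections straight through, bypassing polkit
auth_unix_ro = "none"
auth_unix_rw = "none"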

That, and firewalld had to be told to open up ports for VNC/spice… I allocated 5900-6900… I doubt I’ll have that many VMs.
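With firewalld that amounts to (assuming the default zone):

# open the console port range permanently, then apply it
firewall-cmd --permanent --add-port=5900-6900/tcp
firewall-cmd --reload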

Last weekend I replaced the border router… previously this was a function of my aging web server, but now I have an ex-RAAF-base Advantech UNO-1150G industrial PC which is performing the routing function.  I tried to set it up with Gentoo, and while it worked, I found it wasn’t particularly stable due to limited memory (it only has 256MB RAM).  In the end, I managed to get OpenBSD 6.1/i386 running sweetly, so for now, it’s staying that way.

While the AMD Geode LX800 is no speed demon, a nice feature of this machine is it’s happy with any voltage between 9 and 32V.

The border router was also given the responsibility of managing the domain: I did this by installing ISC BIND9 from ports and copying across the config from Linux.  This seemed to be working, and so I left it.  Big mistake: it turned out BIND9 didn’t consider itself authoritative, and so refused to handle AXFRs with my slaves.
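For reference, a zone that BIND will serve authoritatively and let slaves pull wants a stanza along these lines (names and addresses hypothetical)… and if the zone file doesn’t load cleanly, BIND quietly stops being authoritative for it and refuses transfers:

# named.conf: a master zone the slaves may AXFR
zone "example.com" {
        type master;
        file "master/example.com";
        allow-transfer { 203.0.113.53; };       # slave's address
};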

I was using two different slave DNS providers, puck.nether.net and Roller Network, both at the time of subscription being freebies.  Turns out, when your DNS goes offline, puck.nether.net responds by disabling your domain then emailing you about it.  I received that email Friday morning… and so I wound up in a mad rush trying to figure out why BIND9 didn’t consider itself authoritative.

Since I was in a rush, I decided to tell the border router to just port-forward DNS to the old server, which got things going until I could look into it properly.  It took a bit of tinkering with pf.conf, but eventually I got that going, and the crisis was averted.  Re-enabling the domains on puck.nether.net worked, and they stayed enabled.
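The port-forward itself is a single rule in pf.conf; a sketch, with the macros being hypothetical stand-ins for my real addresses:

# temporarily hand inbound DNS to the old server
pass in on $ext_if inet proto { tcp, udp } \
        to ($ext_if) port domain rdr-to $oldserver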

It was at that time I discovered that Roller Network had decided to make their slave DNS a paid offering.  Fair enough, these things do cost money… At first I thought, well, I’ll just pay for an account with them, until I realised their personal plans were US$5/month.  My workplace uses Vultr for hosting instances of their WideSky platform for customers… and aside from the odd hiccup, they’ve been fine.  US$5/month VPS which can run almost anything trumps US$5/month that only does secondary DNS, so out came the debit card for a new instance in their Sydney data centre.

Later I might use it to act as a caching front-end and as a secondary mail exchanger… but for now, it’s a DIY secondary DNS.  I used their ISO library to install an OpenBSD 6.1 server, and managed to nut out nsd to act as a secondary name server.
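On OpenBSD, nsd ships in the base system, and a secondary zone is only a few lines in /var/nsd/etc/nsd.conf (master address hypothetical):

# nsd.conf: pull the zone from the master, accept its NOTIFYs
zone:
        name: "example.com"
        allow-notify: 203.0.113.1 NOKEY
        request-xfr: 203.0.113.1 NOKEY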

Getting that going this morning, I was able to figure out my DNS woes on the border router and get it running properly.  After removing the port-forward entries, I triggered my secondary DNS at Vultr to re-transfer the domain, and debugged it until I got it working.
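Forcing the re-transfer and sanity-checking the result looks something like this (zone name hypothetical):

# ask nsd to re-fetch the zone, then check the SOA it now serves
nsd-control transfer example.com
dig @localhost example.com SOA +short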

With most of the physical stuff worked out, it was time to turn my attention to getting virtual instances working.  Up until now, everything on the VM hosts was hand-crafted using libvirt directly.  This is painful and tedious… but for whatever reason, OpenNebula was not successfully deploying VMs.  It’d get part way, then barf trying to set up the 802.1Q network interfaces.

In the end, I knew OpenNebula worked fine with bridges that were already defined… but I didn’t want to have to hand-configure each VLAN… so I turned to another automation tool in my toolkit… Ansible:

- hosts: compute
  tasks:
  - name: Configure networking
    template: src=compute-net.j2 dest=/etc/conf.d/net
# …
- hosts: compute
  tasks:
# …
  - name: Add symbolic links (instance VLAN interfaces)
    file: src=net.lo dest=/etc/init.d/net.bond0.{{item}} state=link
    with_sequence: start=128 end=193
  - name: Add symbolic links (instance VLAN bridges)
    file: src=net.lo dest=/etc/init.d/net.vlan{{item}} state=link
    with_sequence: start=128 end=193
# …
  - name: Make services start at boot (instance VLAN bridges)
    command: rc-update add net.vlan{{item}} default
    with_sequence: start=128 end=193 

That’s a snippet of the playbook… and it basically creates symbolic links from Gentoo’s net.lo for all the VLAN ports and bridges, then sets them up to start at boot.
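Applying it is the usual invocation (inventory and playbook file names hypothetical):

ansible-playbook -i hosts compute.yml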

In the compute-net.j2 file referenced above, I put in the following to enumerate all the configuration bits.

# Instance VLANs
{% for vlan in range(128,193) %}
config_vlan{{vlan}}="null"
config_bond0_{{vlan}}="null"
rc_net_vlan{{vlan}}_need="net.bond0.{{vlan}}"
{% endfor %}
# …
vlans_bond0="5 8 10{% for vlan in range(128,193) %} {{vlan}} {% endfor %}248 249 250 251 252"
vlans_bond1="253"
# …
# Instance VLANs
{% for vlan in range(128,193) %}
bridge_vlan{{vlan}}="bond0.{{vlan}}"
{% endfor %} 

The start and end ranges are a little off from each other (with_sequence includes its end value, while Jinja2’s range() stops just short of it, so the playbook creates one more set of links than the template configures), but it saved a lot of work.

This naturally took a while for OpenRC to bring up… but it worked. Going back to OpenNebula, I told it what bridges to use, and before long I had my first instance… an OpenBSD router to link my personal VLAN to the DMZ.
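Pointing OpenNebula at a pre-existing bridge is just a virtual network template that names it… a minimal sketch (values hypothetical, and the right VN_MAD depends on the OpenNebula version):

# personal-vlan.net … load with: onevnet create personal-vlan.net
NAME   = "personal-vlan"
VN_MAD = "dummy"        # use the bridge as-is; OpenNebula does no VLAN setup
BRIDGE = "vlan128"
AR     = [ TYPE = "ETHER", SIZE = "32" ]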

I spent a bit of time re-working my routing tables after that… in fact, my network is getting big enough now that I have to write some details down.  I spent a few hours documenting the effort:

That’s page 1 of about 15… yes, my hand is sore… but at least now, should I get run over by a bus, others have a fighting chance of doing anything with the network without my technical input.