Mar 052022

So, this week mother nature threw South East Queensland a curve-ball like none of us have seen in over a decade: a massive flood. My workplace, VRT Systems / WideSky.Cloud Pty Ltd resides at 38b Douglas Street, Milton, which is a low-lying area not far from the Brisbane River. Sister company CETA is just two doors down at no. 40. Mid-February, a rain depression developed in the Sunshine Coast Hinterland / Wide Bay area north of Brisbane.

That weather system crawled all the way south, hitting Brisbane with constant heavy rain for 5 days straight… eventually creeping down to the Gold Coast and over the border to the Northern Rivers part of NSW.

The result on our offices was devastating. (Copyright notice: these images are placed here for non-commercial use with permission of the original photographers… contact me if you wish to use these photos and I can forward your request on.)

Some of the stock still worked after the flood — the Siemens UH-40s pictured were still working (bar a small handful) and normally sell for high triple-figures. The WideSky Hubs and CET meters all feature a conformal coating on the PCBs that will make them robust to water ingress and the Wavenis meters are potted sealed units. So not all a loss — but there’s a big pile of expensive EDMI meters (Mk7s and Mk10s) though that are not economic to salvage due to approval requirements which is going to hurt!

Le Mans Motors, pictured in those photos is an automotive workshop, so would have had lots of lubricants, oils and grease in stock needed to service vehicles — much of those contaminants were now across the street, so washing that stuff off the surviving stock was the order of the day for much of Wednesday, before demolition day Friday.

As for the server equipment, knowing that this was a flood-prone area (which also by the way means insurance is non-existent), we deliberately put our server room on the first floor, well above the known flood marks of 1974 and 2011. This flood didn’t get that high, getting to about chest-height on the ground floor. Thus, aside from some desktops, laptops, a workshop (including a $7000 oscilloscope belonging to an employee), a new coffee machine (that hurt the coffee drinkers), and lots of furniture/fittings, most of the IT equipment came through unscathed. The servers “had the opportunity to run without the burden of electricity“.

We needed our stuff working, so we needed to first rescue the machines from the waterlogged building and set them up elsewhere. Elsewhere wound up being at the homes of some of our staff with beefy NBN Internet connections. Okay, not beefy compared to the 500Mbps symmetric microwave link we had, but 50Mbps uplinks were not to be snorted at in this time of need.

The initial plan was the machines that once shared an Ethernet switch, now would be in physically separate locations — but we still needed everything to look like the old network. We also didn’t want to run more VPN tunnels than necessary. Enter OpenVPN L2 mode.

Establishing the VPN server

Up to this point, I had deployed a temporary VPN server as a VPS in a Sydney data centre. This was a plain-jane Ubuntu 20.04 box with a modest amount of disk and RAM, but hopefully decent CPU grunt for the amount of cryptographic operations it was about to do.

Most of our customer sites used OpenVPN tunnels, so I migrated those first — I managed to grab a copy of the running server config as the waters rose before the power tripped out. Copying that config over to the new server, start up OpenVPN, open a UDP port to the world, then fiddled DNS to point the clients to the new box. They soon joined.

Connecting staff

Next problem was getting the staff linked — originally we used a rather aging Cisco router with its VPN client (or vpnc on Linux/BSD), but I didn’t feel like trying to experiment with an IPSec server to replicate that — so up came a second OpenVPN instance, on a new subnet. I got the Engineering team to run the following command to generate a certificate signing request (CSR):

openssl req -newkey rsa:4096 -nodes -keyout <name>.key -out <name>.req

They sent me their .req files, and I used EasyRSA v3 to manage a quickly-slapped-together CA to sign the requests. Downloading them via Slack required that I fish them out of the place where Slack decided to put them (without asking me) and place it in the correct directory. Sometimes I had to rename the file too (it doesn’t ask you what you want to call it either) so it had a .req extension. Having imported the request, I could sign it.

$ mv ~/Downloads/theclient.req pki/reqs/
$ ./easyrsa sign-req client theclient

A new file pki/issued/theclient.crt could then be sent back to the user. I also provided them with pki/ca.crt and a configuration file derived from the example configuration files. (My example came from OpenBSD’s OpenVPN package.)

They were then able to connect, and see all the customer site VPNs, so could do remote support. Great. So far so good. Now the servers.

Server connection VPN

For this, a third OpenVPN daemon was deployed on another port, but this time in L2 mode (dev tap) not L3 mode. In addition, I had servers on two different VLANs, I didn’t want to have to deploy yet more VPN servers and clients, so I decided to try tunnelling 802.1Q. This required boosting the MTU from the default of 1500 to 1518 to accommodate the 802.1Q VLAN tag.

The VPN server configuration looked like this:

port 1196
proto udp
dev tap
ca l2-ca.crt
cert l2-server.crt
key l2-server.key
dh data/dh4096.pem
keepalive 10 120
cipher AES-256-CBC
status /etc/openvpn/l2-clients.txt
verb 3
explicit-exit-notify 1
tun-mtu 1518

In addition, we had to tell netplan to create some bridges, we created a vpn.conf in /etc/netplan/vpn.yaml that looked like this:

    version: 2
        # The VPN tunnel itself
            mtu: 1518
            accept-ra: false
            dhcp4: false
            dhcp6: false
            id: 10
            link: tap0
            id: 11
            link: tap0
            id: 12
            link: tap0
            id: 13
            link: tap0
                - vlan10-phy
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:10::1/64
                - vlan11-phy
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:11::1/64
                - vlan12-phy
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:12::1/64
                - vlan13-phy
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:13::1/64

Those aren’t the real VLAN IDs or IP addresses, but you get the idea. Bridge up on the cloud end isn’t strictly necessary, but it does mean we can do other forms of tunnelling if needed.

On the clients, we did something very similar. OpenVPN client config:

dev tap
proto udp
remote 1196
resolv-retry infinite
ca l2-ca.crt
cert l2-client.crt
key l2-client.key
remote-cert-tls server
cipher AES-256-CBC
verb 3
tun-mtu 1518

and for netplan:

    version: 2
            accept-ra: false
            dhcp4: false
            dhcp6: false
            id: 10
            link: eth0
            id: 11
            link: eth0
            id: 12
            link: eth0
            id: 13
            link: eth0
            id: 10
            link: tap0
            id: 11
            link: tap0
            id: 12
            link: tap0
            id: 13
            link: tap0
                - vlan10-vpn
                - vlan10-eth
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:10::2/64
                - vlan11-vpn
                - vlan11-eth
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:11::2/64
                - vlan12-vpn
                - vlan12-eth
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:12::2/64
                - vlan13-vpn
                - vlan13-eth
            accept-ra: false
            link-local: [ ipv6 ]
                - 2001:db8:13::2/64

I also tried using a Raspberry Pi with Debian, the /etc/network/interfaces config looked like this:

auto eth0
iface eth0 inet dhcp
        mtu 1518

auto tap0
iface tap0 inet manual
        mtu 1518

auto vlan10
iface vlan10 inet static
        bridge_ports tap0.10 eth0.10
iface vlan10 inet6 static
        address 2001:db8:10::2
        netmask 64

auto vlan11
iface vlan11 inet static
        bridge_ports tap0.11 eth0.11
iface vlan11 inet6 static
        address 2001:db8:11::2
        netmask 64

auto vlan12
iface vlan12 inet static
        bridge_ports tap0.12 eth0.12
iface vlan12 inet6 static
        address 2001:db8:12::2
        netmask 64

auto vlan13
iface vlan13 inet static
        bridge_ports tap0.13 eth0.13
iface vlan13 inet6 static
        address 2001:db8:13::2
        netmask 64

Having done this, we had the ability to expand our virtual “L2” network by simply adding more clients on other home Internet connections, the bridges would allow all servers to see each-other as if they were connected to the same Ethernet switch.

Feb 192022

So, I’ve been wanting to do this for the better part of a decade… but lately, the cost of more capable embedded devices has come right down to make this actually feasible.

It’s taken a number of incarnations, the earliest being the idea of DIYing it myself with a UHF-band analogue transceiver. Then the thought was to pair a I²S audio CODEC with a ESP8266 or ESP32.

I don’t want to rely on technology that might disappear from the market should relations with China suddenly get narky, and of course, time marches on… I learn about protocols like ROC. Bluetooth also isn’t what it was back when I first started down this path — back then A2DP was one-way and sounded terrible, HSP was limited to 8kHz mono audio.

Today, Bluetooth headsets are actually pretty good. I’ve been quite happy with the Logitech Zone Wireless for the most part — the first one I bought had a microphone that failed, but Logitech themselves were good about replacing it under warranty. It does have a limitation though: it will talk to no more than two Bluetooth devices. The USB dongle it’s supplied with, whilst a USB Audio class device, also occupies one of those two slots.

The other day I spent up on a DAB+ radio and a shortwave radio — it’d be nice to listen to these via the same Bluetooth headset I use for calls and the tablet. There are Bluetooth audio devices that I could plug into either of these, then pair with my headset, but I’d have to disconnect either the phone or the tablet to use it.

So, bugger it… the wireless headset interface will get an upgrade. The plan is a small pocket audio swiss-army-knife that can connect to…

  • an analogue device such as a wired headset or radio receiver/transceiver
  • my phone via Bluetooth
  • my tablet via Bluetooth
  • the aforementioned Bluetooth headset
  • a desktop PC or laptop over WiFi

…and route audio between them as needs require.

The device will have a small LCD display for control with a directional joystick button for control, and will be able to connect to a USB host for management.

Proposed parts list

The chip crisis is actually a big limitation, some of the bits aren’t as easily available as I’d like. But, I’ve managed to pull together the following:

The only bit that’s old stock is the LCD, it’s been sitting on my shelf gathering dust for over a decade. Somewhere in one of my junk boxes I’ve got some joystick buttons also bought many years ago.

Proposed software

For the sake of others looking to duplicate my efforts, I’ll stick with Raspberry Pi OS. As my device is an ARMv6 device, I’ll have to stick with the 32-bit release. Not that big a deal, and long-term I’ll probably look at using OpenEmbedded or Gentoo Embedded long-term to make a minimalist image that just does what I need it to do.

The starter kit came with a SD card loaded with NOOBS… I ignored this and just flashed the SD card with a bare minimum Debian Bullseye image. The plan is I’ll get PipeWire up and running on this for its Bluetooth audio interface. Then we’ll try and get the hardware bits going.

Right now, I have the zero booting up, connecting to my local WiFi network, and making itself available via SSH. A good start.

Data sheet for the LCD

The LCD will possibly be one of the more challenging bits. This is from a phone that was new last century! As it happens though, Bergthaller Iulian-Alexandru was kind enough to publish some details on a number of LCD screens. Someone’s since bought and squatted the domain, but The Wayback Machine has an archive of the site.

I’ve mirrored his notes on various Ericsson LCDs here:

The diagrams on that page appear to show the connections as viewed from the front of the LCD panel. I guess if I let magic smoke out, too bad! The alternative is I do have two Nokia 3310s floating around, so harvest the LCDs out of them — in short, I have a fallback plan!

PipeWire on the Pi Zero

This will be the interesting bit. Not sure how well it’ll work, but we’ll give it a shot. The trickiest bit is getting binaries for the device, no one builds for armhf yet. There are these binaries for Ubuntu AMD64, and luckily there are source packages available.

I guess worst case scenario is I put the Pi Zero W aside and get a Pi Zero 2 W instead. Key will be to test PipeWire first before I warm up the soldering iron, let’s at least prove the software side of things, maybe using USB audio devices in place of the AudioInjector board.

I’m going through and building the .debs for armhf myself now, taking notes as I go. I’ll post these when I’m done.

Jan 032022

So, this year I had a new-year’s resolution of sorts… when we first started this “work from home” journey due to China’s “gift”, I just temporarily set up on the dinner table, which was of course, meant to be another few months.

Well, nearly 2 years later, we’re still working from home, and work has expanded to the point that a move to the office, on any permanent basis, is pretty much impossible now unless the business moves to a bigger building. With this in mind, I decided I’d clear off the dinner table, and clean up my room sufficiently to set up my workstation in there.

That meant re-arranging some things, but for the most part, I had the space already. So some stuff I wasn’t using got thrown into boxes to be moved into the garage. My CD collection similarly got moved into the garage (I have it on the computer, but need to retain the physical discs as they represent my “personal use license”), and lo and behold, I could set up my workstation.

The new workspace

One of my colleagues spotted the Indy and commented about the old classic SGI logo. Some might notice there’s also an O2 lurking in the shadows. Those who have known me for a while, will remember I did help maintain a Linux distribution for these classic machines, among others, and had a reasonable collection of my own:

My Indy, O2 and Indigo2 R10000
The Octane, booting up

These machines were all eBay purchases, as is the Sun monitor pictured (it came with the Octane). Sadly, fast forward a number of years, and these machines are mostly door stops and paperweights now.

The Octane’s and Indigo2’s demises

The Octane died when I tried to clean it out with a vacuum cleaner, without realising the effect of static electricity generated by the vacuum cleaner itself. I might add mine was a particularly old unit: it had a 175MHz R10000 CPU, and I remember the Linux kernel didn’t recognise the power management circuitry in the PSU without me manually patching it.

The Indigo2 mysteriously stopped working without any clear reason why, I’ve never actually tried to diagnose the issue.

That left the Indy and the O2 as working machines. I haven’t fired them up in a long time until today. I figured, what the hell, do they still run?

Trying the Indy and O2 out

Plug in the Indy, hit the power button… nothing. Dead as a doornail. Okay, put it aside… what about the O2?

I plug it in, shuffle it closer to the monitor so I can connect it. ‘Lo and behold:

The O2 lives!

Of course, the machine was set up to use a serial console as its primary interface, and boot-up was running very slow.

Booting up… very slowly…

It sat there like that for a while, figuring the action was happening on a serial port, I went to go get my null modem cable, only to find a log-in prompt by the time I got back.

Next was remembering what password I was using when I last ran this machine. We had the OpenSSL heartbleed vulnerability happen since then, and at about that time, I revoked all OpenPGP keys and changed all passwords, so it isn’t what I use today. I couldn’t get in as root, but my regular user account worked, and I was able to change the root password via sudo.

Remembering my old log-in credentials, from 22 years ago it seems

The machine soon crashed after that. I tried rebooting, this time I tweaked some PROM settings (and yes, I was rusty remembering how to do it) to be able to see what was going. (I had the null modem cable in hand, but didn’t feel like trying to blindly find the serial port at the back of my desktop.)

Changing PROM settings
The subsequent boot, and crash

Evidently, I had a dud disk. This did not surprise me in the slightest. I also noticed the PSU fan was not spinning, possibly seized after years of non-use.

Okay, there were two disks installed in this machine, both 80-pin SCA SCSI drives. Which one was it? I took a punt and tried the one furtherest away from the I/O ports.

Success, she boots now

I managed to reset the root password, before the machine powered itself off (possibly because of overheating). I suspect the machine will need the dust blown out of it (safely! — not using the method that killed the Octane!), and the HDDs will need replacements. The guilty culprit was this one (which I guessed correctly first go):

a 4GB HDD was a big drive back in 1998!

The computer I’m typing this on has a HDD that stores 1000 of these drives. Today, there are modern alternatives, such as SCSI2SD that could get this machine running fully if needed. The tricky bit would be handling the 80-pin hot-swap interface. There’d be some hardware hacking needed to connect the two, but AU$145 plus an adaptor seems like a safer bet than trying some random used HDD.

So, replacement for the HDDs, a clean-out, and possibly a new fan or two, and that machine will be back to “working” state. Of course the Linux landscape has moved on since then, Debian no longer support the MIPS4 ISA that the RM5200 CPU understands, Gentoo still could work on this though, and maybe OpenBSD still support this too. In short, this machine is enough of a “go-er” that it should not be sent to land-fill… yet.

Turning my attention back to the Indy

So the Indy was my first SGI machine. Bought to better understand the MIPS processor architecture, and perhaps gain enough understanding to try and breathe life into a Cobalt Qube II server appliance (remember those?), it did teach me a lot about Linux and how things vary between platforms.

I figured I might as well pop the cover and see if there’s anything “obviously” wrong. The procedure I was rusty on, but I recalled there was a little catch on the back of the case that needed to be release before the cover slid off. So I lugged the 20″ CRT off the top of the machine, pulled the non-functioning Indy out, and put it on the table to inspect further.

Upon trying to pop the cover (gently!), the top of the case just exploded. Two pieces of the top cover go flying, and the RF shield parts company with the cover’s underside.

The RF shield parted company with the underside of the lid

I was left with a handful of small plastic fragments that were the heat-set posts holding the RF shield to the inside of the lid.

Some of the fragments that once held the RF shield in place

Clearly, the plastic has become brittle over the years. These machines were released in 1993, I think this might be a 1994-model as it has a slightly upgraded R4600 CPU in it.

As to the machine itself, I had a quick sticky-beak, there didn’t seem to be any immediately obvious things, but to be honest, I didn’t do a very thorough check. Maybe there’s some corrosion under the motherboard I didn’t spot, maybe it’s just a blown fuse in the PSU, who knows?

The inside of the Indy

This particular machine had 256MB RAM (a lot for its day), 8-bit Newport XL graphics, the “Indy Presenter” LCD interface (somewhere, we have the 15″ monitor it came with — sadly the connecting cable has some damaged conductors), and the HDD is a 9.1GB HDD I added some time back.

Where to now?

I was hanging on to these machines with the thinking that someone who was interested in experimenting with RISC machines might want them — find them a new home rather than sending them to landfill. I guess that’s still an option for the O2, as it still boots: so long as its remaining HDD doesn’t die it’ll be fine.

For the others, there’s the possibility of combining bits to make a functional frankenmachine from lots of parts. The Indy will need a new PROM battery if someone does manage to breathe life into it.

The Octane had two SCSI interfaces, one of which was dead — a problem that was known-of before I even acquired it. The PROM would bitch and moan about the dead SCSI interface for a good two minutes before giving up and dumping you in the boot menu. Press 1, and it’d hand over to arcload, which would boot a Linux kernel from a disk on the working controller. Linux would see the dead SCSI controller, and proceed to ignore it, booting just fine as if nothing had happened.

The Indigo2 R10000 was always the red-hedded step child: an artefact of the machine’s design. The IP22 design (Indy and Indigo2) was never designed with the intent of being used with a R10000 CPU, and the speculative execution features played havoc with caching on this design. The Octane worked fine because it was designed from the outset to run this CPU. The O2 could be made to work because of a quirk of its hardware design, but the Indigo2 was not so flexible, so kernel-space code had to hack around the problem in software.

I guess I’d still like to see the machines go to a good home, but no idea who that’d be in this day and age. Somewhere, I have a partial disc set of Irix 6.5, and there’s also a 20″ SGI GDM5410 monitor (not the Sun monitor pictured above) that, at last check, did still work.

It’ll be a sad day when these go to the tip.

Oct 072021

Recently, I noticed my network monitoring was down… I hadn’t worried about it because I had other things to keep me busy, and thankfully, my network monitoring, whilst important, isn’t mission critical.

I took a look at it today. The symptom was an odd one, influxd was running, it was listening on the back-up/RPC port 8088, but not 8086 for queries.

It otherwise was generating logs as if it were online. What gives?

Tried some different settings, nothing… nada… zilch. Nothing would make it listen to port 8086.

Tried updating to 1.8 (was 1.1), still nothing.

Tried manually running it as root… sure enough, if I waited long enough, it started on its own, and did begin listening on port 8086. Hmmm, I wonder. I had a look at the init scripts:

#!/bin/bash -e

/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS &
echo $PID > /var/lib/influxdb/

BIND_ADDRESS=$(influxd config | grep -A5 "\[http\]" | grep '^  bind-address' | cut -d ' ' -f5 | tr -d '"')
HTTPS_ENABLED_FOUND=$(influxd config | grep "https-enabled = true" | cut -d ' ' -f5)
if [ $HTTPS_ENABLED = "true" ]; then
  HTTPS_CERT=$(influxd config | grep "https-certificate" | cut -d ' ' -f5 | tr -d '"')
  if [ ! -f "${HTTPS_CERT}" ]; then
    echo "${HTTPS_CERT} not found! Exiting..."
    exit 1
  echo "$HTTPS_CERT found"

set +e
result=$(curl -k -s -o /dev/null $url -w %{http_code})
while [ "$result" != "200" ]; do
  sleep 1
  result=$(curl -k -s -o /dev/null $url -w %{http_code})
  if [ $max_attempts -le 0 ]; then
    echo "Failed to reach influxdb $PROTOCOL endpoint at $url"
    exit 1
set -e

Ahh right, so start the server, check every second to see if it’s up, and if not, just abort and let systemd restart the whole shebang. Because turning the power on-off-on-off-on-off is going to make it go faster, right?

I changed max_attempts to 360 and the sleep to 10.

Having fixed this, I am now getting data back into my system.

Jun 092021

So, I finally had enough with the Epson WF7510 we have which is getting on in years, occasionally miss-picks pages, won’t duplex, and has a rather curious staircase problem when printing. We’ll keep it for A3 scanning and printing (the fax feature is now useless), but for a daily driver, I decided to make an end-of-financial-year purchase. I wanted something that met this criteria:

  • A4 paper size
  • Automatic duplex printing
  • Networked
  • Laser/LED (for water-resistant prints)
  • Colour is a “nice to have”

I looked at the mono options, but when I looked at the driver options for Linux, things were looking dire with binary blobs everywhere. Removed the restriction on it being mono, and suddenly this option appeared that was cheaper, and more open. I didn’t need a scanner (the WF7510’s scanner works fine with xsane, plus I bought a separate Canon LiDE 300 which is pretty much plug-and-play with xsane), a built-in fax is useless since we can achieve the same using hylafax+t38modem (a TO-DO item well down in my list of priorities).

The Kyocera P5021cdn allegedly isn’t the cheapest to run, but it promised a fairly pain-free experience on Linux and Unix. I figured I’d give it a shot. These are some notes I used to set the thing up. I want to move it to a different part of the network ultimately, but we’ll see what the cretinous Windows laptop my father users will let us do, for now it shares that Ethernet VLAN with the WF7510 and his laptop, and I’ll just hop over the network to access it.

Getting the printer’s IP and MAC address

The menu on the printer does not tell you this information. There is however, a Printer Status menu item in the top-panel menu. Tell it to print the status page, you’ll get a nice colour page with lots of information about the printer including its IPv4 and IPv6 addresses.

Web interface

If you want to configure the thing further, you need a web browser. Visit the printer’s IP address in your browser and you’re greeted with Command Centre RX. Out of the box, the username and password were Admin and Admin (capitalised A).

Setting up CUPS

The printer “driver” off the Kyocera website is a massive 400MB zip file, because they bundled up .deb and .rpm packages for every distribution they officially support together in one file. Someone needs to introduce them to reprepro and its dnf-equivalent. That said, you have a choice… if you pick a random .deb out of that blob, and manually unpack it somewhere (use ar x on it, you’ll see data.tar.xz or something, unpack that and you’ve got your package files), you’ll find a .ppd file you’ll need.

Or, you can do a search and realise that the Arch Linux guys have done the hard work for you. Many thanks guys (and girls… et all)!

Next puzzle is figuring out the printer URI. Turns out the printer calls itself lp1… so the IPP URI you should use is http://<IP>:631/ipp/lp1.

I haven’t put the thing fully through its paces, and I note the cartridges are down about 4% from those two prints (the status page and the CUPS test print), but often the initial cartridges are just “starter” cartridges and that the replacements often have a lot more toner in them. I guess time will tell on their longevity (and that of the imaging drum).

May 262020

Lately, I’ve been socially distancing a home and so there’s been a few projects that have been considered that otherwise wouldn’t ordinarily get a look in on a count of lack-of-time.

One of these has been setting up a Raspberry Pi with DRAWS board for use on the bicycle as a radio interface. The DRAWS interface is basically a sound card, RTC, GPS and UART interface for radio interfacing applications. It is built around the TI TMS320AIC3204.

Right now, I’m still waiting for the case to put it in, even though the PCB itself arrived months ago. Consequently it has not seen action on the bike yet. It has gotten some use though at home, primarily as an OpenThread border router for 3 WideSky hubs.

My original idea was to interface it to Mumble, a VoIP server for in-game chat. The idea being that, on events like the Yarraman to Wulkuraka bike ride, I’d fire up the phone, connect it to an AP run by the Raspberry Pi on the bike, and plug my headset into the phone:144/430MHz→2.4GHz cross-band.

That’s still on the cards, but another use case came up: digital. It’d be real nice to interface this over WiFi to a stronger machine for digital modes. Sound card over network sharing. For this, Mumble would not do, I need a lossless audio transport.

Audio streaming options

For audio streaming, I know of 3 options:

  • PulseAudio network streaming
  • netjack
  • trx

PulseAudio I’ve found can be hit-and-miss on the Raspberry Pi, and IMO, is asking for trouble with digital modes. PulseAudio works fine for audio (speech, music, etc). It will make assumptions though about the nature of that audio. The problem is we’re not dealing with “audio” as such, we’re dealing with modem tones. Human ears cannot detect phase easily, data modems can and regularly do. So PA is likely to do things like re-sample the audio to synchronise the two stations, possibly use lossy codecs like OPUS or CELT, and make other changes which will mess with the signal in unpredictable ways.

netjack is another possibility, but like PulseAudio, is geared towards low-latency audio streaming. From what I’ve read, later versions use OPUS, which is a no-no for digital modes. Within a workstation, JACK sounds like a close fit, because although it is geared to audio, its use in professional audio means it’s less likely to make decisions that would incur loss, but it is a finicky beast to get working at times, so it’s a question mark there.

trx was a third option. It uses RTP to stream audio over a network, and just aims to do just that one thing. Digging into the code, present versions use OPUS, older versions use CELT. The use of RTP seemed promising though, it actually uses oRTP from the Linphone project, and last weekend I had a fiddle to see if I could swap out OPUS for linear PCM. oRTP is not that well documented, and I came away frustrated, wondering why the receiver was ignoring the messages being sent by the sender.

It’s worth noting that trx probably isn’t a good example of a streaming application using oRTP. It advertises the stream as G711u, but then sends OPUS data. What it should be doing is sending it as a dynamic content type (e.g. 96), and if this were a SIP session, there’d be a RTPMAP sent via Session Description Protocol to say content type 96 was OPUS.

I looked around for other RTP libraries to see if there was something “simpler” or better documented. I drew a blank. I then had a look at the RTP/RTCP specs themselves published by the IETF. I came to the conclusion that RTP was trying to solve a much more complicated use case than mine. My audio stream won’t traverse anything more sophisticated than a WiFi AP or an Ethernet switch. There’s potential for packet loss due to interference or weak signal propagation between WiFi nodes, but latency is likely to remain pretty consistent and out-of-order handling should be almost a non-issue.

Another gripe I had with RTP is its almost non-consideration of linear PCM. PCMA and PCMU exist, 16-bit linear PCM at 44.1kHz sampling exists (woohoo, CD quality), but how about 48kHz? Nope. You have to use SDP for that.

Custom protocol ideas

With this in mind, my own custom protocol looks like the simplest path forward. Some simple systems that used by GQRX just encapsulate raw audio in UDP messages, fire them at some destination and hope for the best. Some people use TCP, with reasonable results.

My concern with TCP is that if packets get dropped, it’ll try re-sending them, increasing latency and never quite catching up. Using UDP side-steps this, if a packet is lost, it is forgotten about, so things will break up, then recover. Probably a better strategy for what I’m after.

I also want some flexibility in audio streams, it’d be nice to be able to switch sample rates, bit depths, channels, etc. RTP gets close with its L16/44100/2 format (the Philips Red-book standard audio format). In some cases, 16kHz would be fine, or even 8kHz 16-bit linear PCM. 44.1k works, but is wasteful. So a header is needed on packets to at least describe what format is being sent. Since we’re adding a header, we might as well set aside a few bytes for a timestamp like RTP so we can maintain synchronisation.

So with that, we wind up with these fields:

  • Timestamp
  • Sample rate
  • Number of channels
  • Sample format


The timestamp field in RTP is basically measured in ticks of some clock of known frequency, e.g. for PCMU it is a 8kHz clock. It starts with some value, then increments up monotonically. Simple enough concept. If we make this frequency the sample rate of the audio stream, I think that will be good enough.

At 48kHz 16-bit stereo; data will be streaming at 192kbps. We can tolerate wrap-around, and at this data rate, we’d see a 16-bit counter overflow every ~341ms, which whilst not unworkable, is getting tight. Better to use a 32-bit counter for this, which would extend that overflow to over 6 hours.

Sample rate encoding

We can either support an integer field, or we can encode the rate somehow. An integer field would need a range up to 768k to support every rate ALSA supports. That’s another 32-bit integer. Or, we can be a bit clever: nearly every sample rate in common use is a harmonic of 8kHz or 11.025kHz, so we devise a scheme consisting of a “base” rate and multiplier. 48kHz? That’s 8kHz×6. 44.1kHz? That’s 11.025kHz×4.

If we restrict ourselves to those two base rates, we can support standard rates from 8kHz through to 1.4MHz by allocating a single bit to select 8kHz/11.025kHz and 7 bits for the multiplier: the selected sample rate is the base rate multiplied by the multipler incremented by one. We’re unlikely to use every single 8kHz step though. Wikipedia lists some common rates and as we go up, the steps get bigger, so let’s borrow 3 multiplier bits for a left-shift amount.

7 6 5 4 3 2 1 0

B = Base rate: (0) 8000 Hz, (1) 11025 Hz
S = Shift amount
M = Multiplier - 1

Rate = (Base << S) * (M + 1)

  00000000b (0x00): 8kHz
  00010000b (0x10): 16kHz
  10100000b (0xa0): 44.1kHz
  00100000b (0x20): 48kHz
  01010010b (0x52): 768kHz (ALSA limit)
  11111111b (0xff): 22.5792MHz (yes, insane)

Other settings

I primarily want to consider linear PCM types. Technically that includes unsigned PCM, but since that’s losslessly transcodable to signed PCM, we could ignore it. So we could just encode the number of bytes needed for a single channel sample, minus one. Thus 0 would be 8-bits; 1 would be 16-bits; 2 would be 32-bits and 3 would be 64-bits. That needs just two bits. For future-proofing, I’d probably earmark two extra bits; reserved for now, but might be used to indicate “compressed” (and possibly lossy) formats.

The remaining 4 bits could specify a number of channels, again minus 1 (mono would be 0, stereo 1, etc up to 16).

Packet type

For the sake of alignment, I might include a 16-bit identifier field so the packet can be recognised as being this custom audio format, and to allow multiplexing of in-band control messages, but I think the concept is there.

May 122020

So, the other day I pondered about whether BlueTrace could be ported to an older device, or somehow re-implemented so it would be compatible with older phones.

The Australian Government has released their version of TraceTogether, COVIDSafe, which is available for newer devices on the Google and Apple application repositories. It suffers a number of technical issues, one glaring one being that even on devices it theoretically supports, it doesn’t work properly unless you have it running in the foreground and your phone unlocked!

Well, there’s a fail right there! Lots of people, actually need to be able to lock their phones. (e.g. a condition of their employment, preventing pocket dials, saving battery life, etc…)

My phone, will never run COVIDSafe, as provided. Even compiling it for Android 4.1 won’t be enough, it uses Bluetooth Low Energy, which is a Bluetooth 4.0 feature. However, the government did one thing right, they have published the source code. A quick fish-eye over the diff against TraceTogether, suggests the changes are largely superficial.

Interestingly, although the original code is GPLv3, our government has decided to supply their own license. I’m not sure how legal that is. Others have questioned this too.

So, maybe I can run it after all? All I need is a device that can do BLE. That then “phones home” somehow, to retrieve tokens or upload data. Newer phones (almost anything Android-based) usually can do WiFi hotspot, which would work fine with a ESP32.

Older phones don’t have WiFi at all, but many can still provide an Internet connection over a Bluetooth link, likely via the LAN Access Profile. I think this would mean my “token” would need to negotiate HTTPS itself. Not fun on a MCU, but I suspect someone has possibly done it already on ESP32.

Nordic platforms are another option if we go the pure Bluetooth route. I have two nRF52840-DK boards kicking around here, bought for OpenThread development, but not yet in use. A nicety is these do have a holder for a CR2032 cell, so can operate battery-powered.

Either way, I think it important that the chosen platform be:

  1. easily available through usual channels
  2. cheap
  3. hackable, so the devices can be re-purposed after this COVID-19 nonsense blows over

A first step might be to see if COVIDSafe can be cleaved in two… with the BLE part running on a ESP32 or nRF52840, and the HTTPS part running on my Android phone. Also useful, would be some sort of staging server so I can test my code without exposing things. Not sure if there is such a beast publicly available that we can all make use of.

Guess that’ll be the next bit to look at.

May 032020

This afternoon, I was pondering about how I might do text-to-speech, but still have the result sound somewhat natural. For what use case? Well, two that come to mind…

The first being for doing “strapper call” announcements at horse endurance rides. A horse endurance ride is where competitors and their horses traverse a long (sometimes as long as 320km) trail through a wilderness area. Usually these rides (particularly the long ones) are broken up into separate stages or “legs”.

Upon arrival back at base, the competitor has a limited amount of time to get the horse’s vital signs into acceptable ranges before they must present to the vet. If the horse has a too-high temperature, or their horse’s heart rate is too high, they are “vetted out”.

When the competitor reaches the final check-point, ideally you want to let that competitor’s support team know they’re on their way back to base so they can be there to meet the competitor and begin their work with the horse.

Historically, this was done over a PA system, however this isn’t always possible for the people at base to achieve. So having an automated mechanism to do this would be great. In recent times, Brisbane WICEN has been developing a public display that people can see real-time results on, and this also doubles as a strapper-call display.

Getting the information to that display is something of a work-in-progress, but it’s recognised that if you miss the message popping up on the display, there’s no repeat. A better solution would be to “read out” the message. Then you don’t have to be watching the screen, you can go about your business. This could be done over a PA system, or at one location there’s an extensive WiFi network there, so streaming via Icecast is possible.

But how do you get the text into speech?

Enter flite

flite is a minimalist speech synthesizer from the Festival project. Out of the box it includes 3 voices, mostly male American voices. (I think the rms one might be Richard M. Stallman, but I could be wrong on that!) There’s a couple of demos there that can be run direct from the command line.

So, for the sake of argument, let’s try something simple, I’ll use the slt voice (a US female voice) and just get the program to read out what might otherwise be read out during a horse ride event:

$ flite_cmu_us_slt -t 'strapper call for the 160 kilometer event competitor numbers 123 and 234' slt-strapper-nopunctuation-digits.wav

Not bad, but not that great either. Specifically, the speech is probably a little quick. The question is, how do you control this? Turns out there’s a bit of hidden functionality.

There is an option marked -ssml which tells flite to interpret the text as SSML. However, if you try it, you may find it does little to improve matters, I don’t think flite actually implements much of it.

Things are improved if we spell everything out. So if you instead replace the digits with words, you do get a better result:

$ flite_cmu_us_slt -t 'strapper call for the one hundred and sixty kilometer event competitor number one two three and two three four' slt-strapper-nopunctuation-words.wav

Definitely better. It could use some pauses. Now, we don’t have very fine-grained control over those pauses, but we can introduce some punctuation to have some control nonetheless.

$ flite_cmu_us_slt -t 'strapper call.  for the one hundred and sixty kilometer event.  competitor number one two three and two three four' slt-strapper-punctuation.wav

Much better. Of course it still sounds somewhat robotic though. I’m not sure how to adjust the cadence on the whole, but presumably we can just feed the text in piece-wise, render those to individual .wav files, then stitch them together with the pauses we want.

How about other changes though? If you look at flite --help, there is feature options which can control the synthesis. There’s no real documentation on what these do, what I’ve found so far was found by grep-ing through the flite source code. Tip: do a grep for feat_set_, and you’ll see a whole heap.

Controlling pitch

There’s two parameters for the pitch… int_f0_target_mean controls the “centre” frequency of the speech in Hertz, and int_f0_target_stddev controls the deviation. For the slt voice, …mean seems to sit around 160Hz and the deviation is about 20Hz.

So we can say, set the frequency to 90Hz and get a lower tone:

$ flite_cmu_us_slt --setf int_f0_target_mean=90 -t 'strapper call' slt-strapper-mean-90.wav

… or 200Hz for a higher one:

$ flite_cmu_us_slt --setf int_f0_target_mean=200 -t 'strapper call' slt-strapper-mean-200.wav

… or we can change the variance:

$ flite_cmu_us_slt --setf int_f0_target_stddev=0.0 -t 'strapper call' slt-strapper-stddev-0.wav
$ flite_cmu_us_slt --setf int_f0_target_stddev=70.0 -t 'strapper call' slt-strapper-stddev-70.wav

We can’t change these values during a block of speech, but presumably we can cut up the text we want to render, render each piece at the frequency/variance we want, then stitch those together.

Controlling rate

So I mentioned we can control the rate, somewhat coarsely using usual punctuation devices. We can also change the rate overall by setting duration_stretch. This basically is a control of how “long” we want to stretch out the pronunciation of words.

$ flite_cmu_us_slt --setf duration_stretch=0.5 -t 'strapper call' slt-strapper-stretch-05.wav
$ flite_cmu_us_slt --setf duration_stretch=0.7 -t 'strapper call' slt-strapper-stretch-07.wav
$ flite_cmu_us_slt --setf duration_stretch=1.0 -t 'strapper call' slt-strapper-stretch-10.wav
$ flite_cmu_us_slt --setf duration_stretch=1.3 -t 'strapper call' slt-strapper-stretch-13.wav
$ flite_cmu_us_slt --setf duration_stretch=2.0 -t 'strapper call' slt-strapper-stretch-20.wav

Putting it together

So it looks as if all the pieces are there, we just need to stitch them together.

RC=0 stuartl@rikishi /tmp $ flite_cmu_us_slt --setf duration_stretch=1.2 --setf int_f0_target_stddev=50.0 --setf int_f0_target_mean=180.0 -t 'strapper call' slt-strapper-call.wav
RC=0 stuartl@rikishi /tmp $ flite_cmu_us_slt --setf duration_stretch=1.1 --setf int_f0_target_stddev=30.0 --setf int_f0_target_mean=180.0 -t 'for the, one hundred, and sixty kilometer event' slt-160km-event.wav
RC=0 stuartl@rikishi /tmp $ flite_cmu_us_slt --setf duration_stretch=1.4 --setf int_f0_target_stddev=40.0 --setf int_f0_target_mean=180.0 -t 'competitors, one two three, and, two three four' slt-competitors.wav
Above files stitched together in Audacity

Here, I manually imported all three files into Audacity, arranged them, then exported the result, but there’s no reason why the same could not be achieved by a program, I’m just inserting pauses after all.

There are tools for manipulating RIFF waveform files in most languages, and generating silence is not rocket science. The voice itself could be fine-tuned, but that’s simply a matter of tweaking settings. Generating the text is basically a look-up table feeding into snprintf (or its equivalent in your programming language of choice).

It’d be nice to implement a wrapper around flite that took the full SSML or JSML text and rendered it out as speech, but this gets pretty close without writing much code at all. Definitely worth continuing with.

Apr 192020

COVID-SARS-2 is a nasty condition caused by COVID-19 that has seen many a person’s life cut short. The COVID-19 virus which originated from Wuhan, China has one particularly insidious trait: it can be spread by asymptomatic people. That is, you do not have to be suffering symptoms to be an infectious carrier of the condition.

As frustrating as isolation has been, it’s really our only viable solution to preventing this infectious condition from spreading like wildfire until we get a vaccine that will finally knock it on the head.

One solution that has been proposed has been to use contract tracing applications which rely on Bluetooth messaging to detect when an infected person comes into contact with others. Singapore developed the TraceTogether application. The Australian Government look like they might be adopting this application, our deputy CMO even suggesting it’d be made compulsory (before the PM poured water on that plan).

Now, the Android version of this, requires Android 5.1. My phone runs 4.1: I cannot run this application. Not everybody is in the habit of using Bluetooth, or even carries a phone. This got me thinking: can this be implemented in a stand-alone device?

The guts of this application is a protocol called BlueTrace which is described in this whitepaper. Reference implementations exist for Android and iOS.

I’ll have to look at the nitty-gritty of it, but essentially it looks like a stand-alone implementation on a ESP32 module maybe a doable proposition. The protocol basically works like this:

  • Clients register using some contact details (e.g. a telephone number) to a server, which then issues back a “user ID” (randomised).
  • The server then uses this to generate “temporary IDs” which are constructed by concatenating the “User ID” and token life-time start/finish timestamps together, encrypting that with the secret key, then appending the IV and an authentication token. This BLOB is then Base64-encoded.
  • The client pulls down batches of these temporary IDs (forward-dated) to use for when it has no Internet connection available.
  • Clients, then exchange these temporary IDs using BLE messaging.

This, looks doable in an ESP32 module. The ESP32 could be loaded up with tokens by a workstation. You then go about your daily business, carrying this device with you. When you get home, you plug the device into your workstation, and it uploads the “temporary IDs” it saw.

I’ll have to dig out my ESP32 module, but this looks like a doable proposition.

Apr 112020

So, for years… decades even, our telephone service has been via the Public Switched Telephone Network. Originally intended for just voice traffic, this later became our Internet connection, using dial-up modems, then using ADSL.

The number itself gained two digits in its life-time: originally 6 digits (late 70s/early 80s), it gained a 0 at some point, then in 1996 a 3 was prepended to all Brisbane numbers.

So yeah, we’ve had our phone a long time, and the underlying technology has remained largely the same for that time period. Even the handset is the same one from all those years ago. Telecom Australia used to pay for it as a “priority service” back then as my father was working for them at the time.

By September, this will change. The National Broadband Network is in our suburb, and a little while back I migrated the ADSL2+ connection over to HFC. After a brief hiccup getting OpenBSD talking to it, we were away. It’s been pretty stable so far. Stable enough now that I haven’t had the ADSL modem connected in weeks.

At the time of migration, I could have migrated the telephone too to NBN phone, however there’s a snag: how do you access NBN phone? The answer is the ISP sends you a pre-configured ATA+Router+WiFi AP (and sometimes there’s an ADSL modem in there too). As I had decided to use my own router, I didn’t have such a device.

I asked Internode about the SIP details, and the situation is this: when they provision a NBN connection, there’s an automated process that configures their VoIP service then stores the credentials and other settings in a ACS server. They have no visibility of these credentials at all. Ordinarily, if I had purchased hardware, they would have provisioned that with credentials that would allow it to authenticate over the TR069 protocol. So without spending $200 on a device I wasn’t going to use, I wasn’t getting these credentials.

There’s another option though: VoIP. The NBN phone is actually a VoIP service, so fundamentally nothing much changes. Keeping the services separate though does give me flexibility in the future. There’s a list of VoIP providers as long as my arm that I can go with.

VoIP is notoriously difficult to set up though, I didn’t want to mess up our only incoming telephone service, so I decided to migrate the NBN phone number over to Internode’s NodePhone VoIP service which would allow me to experiment.


So, first thing to consider is what my needs are. We have 4 devices that are plugged into the telephone line at present:

  • Telecom Australia Touchfone 200 wired telephone
  • Telstra 9200a cordless telephone base-station
  • Epson WF-7510 printer/scanner/fax
  • Maestro Jetstream 56kbps modem

Now, we don’t send many faxes (maybe two a year), and the last time that modem was used, it was to dial into a weighbridge system in Rockhampton after my workplace had moved to VoIP.

That weighbridge system was originally built in 1995 on top of computers running SCO OpenServer 5 and using SCO UUCP over dial-up lines running at 1200 baud (some of the nodes were at remote sites with dodgy phone lines). In 2013 the core servers were upgraded with new hardware and Ubuntu 12.04LTS, but the UUCP links remained as the SCO boxes at sites were slowly replaced. mgetty would answer the phone, and certain user accounts would use uucico as the “shell”. For admin purposes, we could log in, and that would give us a BASH prompt.

Any time we had a support issue at work, muggins would be the one to literally “dial in” to that site. While it’s been years since I’ve needed to touch it (and I think now there’s a VPN to that site), I wanted to retain dial-up capability if needed.

As for the faxes… the stuff we’re doing can be done over email. Getting the modem and the fax machine working is a stretch goal.

99% of the traffic will be voice traffic. I’m not sure if it’s possible to use double-adaptors with an ATA, and so for hardware I opted for the Grandstream HT814. These are available domestically (e.g. MyITHub) for under AU$130 and looked to be pretty good bang-per-buck. I just wasn’t sure how well it’d get along with the old T200.

As a contingency plan, I also ordered an IP phone, a Grandstream GXP1615 which is also available locally. That way if the T200 gave me problems, I’d just wire up Ethernet to the old point where the T200 was and put the GXP1615 there.

Since I’ve got two SIP devices, I need to be able to control incoming and outgoing call flows. So that means running a soft-PBX somewhere. Asterisk is the obvious choice here, being a free-software VoIP package with lots of flexibility. OpenBSD 6.6 ships with Asterisk 16.6.2.

Opening the account

Opening the account is simple enough. As I was porting the NBN phone number over, I just needed some details off my last Internode bill. Later when I go to do the house phone some time next financial year, I’ll be after a Telstra bill (assuming Telstra Wholesale have sorted out their CoVID-19 issues by then).

I looked at the hardware options that Internode offered… and again, it was basically the same as what they offer for their Internet service.

One thing I note is that while Internode has been a great ISP (I switched to them in 2012), their credentials as a VSP are still developing somewhat.

Initial connection

Provisioning was much less smooth than my initial ADSL connection (which was practically seamless) or the subsequent move to HFC NBN. The first bump in the road was when they emailed me:

Subject: Internode: Your Internode NodePhone VoIP service xxxxxxxxxx can’t make/receive calls yet [xxxxxxxxx]

Following activation of your NBN service, we ran some tests to make sure things were running smoothly.

We weren’t able to detect that your NodePhone VoIP service (phone number xxxxxxxxxx) has been set up successfully.

Initial contact regarding the telephone service…

Okay, fair enough, on the date I received that email I had only just received the ATA that I had purchased (through another supplier). Evidently they thought I had the hardware already. I found some time and quickly cobbled together a set-up:

First test set-up: siproxd on the border router to the HT814

I did some quick research, rather than doing NAT, I figured siproxd was closer to my eventual goals, so I installed that on the border router as there were fewer knobs and dials to deal with. I configured the HT814 with the settings as best I understood them from Internode’s guides with one difference: I set the Proxy field to my border router’s internal IP address.

This failed with Internode’s server giving me a 404: Not Found response when my HT814 sent its REGISTER request. I took some captures using tshark and reported this back to their helpdesk. I disconnected the ATA since there was no sense in banging on their front door every 20 seconds: it wasn’t working.

They got back to me a few days later and told me I had the right user name, and that my number still wasn’t registered. Plugging the ATA in, same response, 404: Not Found. In exasperation, I tried some changes:

  • Different settings on the HT814 (too many to list)
  • Trying with Twinkle on my laptop connected to the DMZ, both through via siproxd and also shutting down siproxd and setting up NAT.
  • Installing Asterisk on the border router (which was in the plans) and trying to set that up.

All three hit the same problem, 404: Not Found. I figured either I had entered the same wrong details 3 times, or it was definitely their end. So I reported that, unplugged the ATA and waited. This was on the 27th March.

On the 31st March, I get an email from Internode Provisioning to say they would be porting the number soon. After receiving this email, REGISTER worked, I was getting 200: OK in reply. Funnily enough, there was no challenge to credentials, it just saw my end and said “OK, you’re in”.

Calling outbound after this worked: the call trace showed the Internode end challenging the ATA for credentials, but once supplied, it connected the call, all worked. Inbound calls were a different matter though, a recorded (female) voice announced that the number was invalid. We were half-way there.

The next day, that message changed, it now was a recorded (male) voice announcing the number was unavailable, and offering to leave a voice-mail message. Progress. I had not changed my end, I still had T200 → HT814 → siproxd on the border router as my set-up. I was not seeing any incoming traffic being blocked by pf, although I was seeing people playing with sipvicious. I decided to firewall off my SIP ports, only exposing them to Internode’s SIP server.

Later on the help-desk got back to me. Despite their server telling me 200: OK when I sent a REGISTER, the number was still “not registered”. Confusing!

On a whim, I ripped out siproxd again and set up NAT on the border router. BINGO! We registered, and incoming calls worked. Internode use Broadsoft’s BroadWorks platform for their NodePhone VoIP service, and something about the operation of siproxd confused it. It then sent a misleading response leading my devices to think they were registered when they were not!

At this point, it was time to do two things: uninstall siproxd, and start reading up on Asterisk.

Configuring Asterisk

Asterisk is regarded as the “swiss army knife” of VoIP. It can be configured to do a lot. Officially, it is supported on Linux i386 and AMD64 platforms, but there are builds of it for platforms such as the Raspberry Pi.

OpenBSD also build and ship it in their ports. I wanted to avoid the pain of NAT, and I also wanted to avoid the telephone service going off-line if my cluster went haywire. So installation of this on the border router was a no-brainer. Yes, it’s adding a bit more attack surface to that box, but with appropriate firewalling rules, this could be managed.

As for configuration, there were two SIP channel drivers I could use, either the old chan_sip method, or the newer res_pjsip method. I had seen guides that discussed the older method (e.g. OCAU, WP), however in the back of my mind is the question: “how long do Digium plan to keep supporting two methods?” Thus from the outset I decided to use res_pjsip.

Audio CODECs

This is a pretty big aspect of configuration and should not be skimped on. You’ll find even if you buy all your equipment from one supplier, there are differences in what audio CODECs are supported. For instance the HT814 supports the OPUS CODEC (something I’d like to take advantage of eventually) but the GXP1615 does not. Some CODECs require patent licenses (e.g. SILK, SIREN7, G.722.1, G.722.2/AMR-WB), some did require patent licenses but are now “free” (G.729, G.723.1) and some are open-source (OPUS, Speex).

Some also go by multiple names. G.711a is also called PCMA and G.711u is PCMU. Pretty much everything supports these, and depending on the country you’re in, one will be preferred over the other. In my case, G.711a is the preferred option.

Your voice provider is a factor here too. Some only support particular CODECs, some will allow any CODEC. NodePhone VoIP allegedly will allow you to use whatever you like, but calls into and out of their network to non-VoIP targets may be restricted to narrow-band CODECs.

Internode-specific gotcha: G.729

One gotcha I stumbled on the hard way was that sometimes SIP implementations do not play by the rules.

Specifically, I found once I got Asterisk installed, incoming calls would be immediately hung-up on the moment I answered. Mobile or PSTN didn’t matter. I fired up tshark again and dug into the problem. The only clue I had was this error message:

[Apr  5 16:27:40] WARNING[-1][C-00000001] channel.c: Unable to find a codec translation path: (g729) -> (alaw)

Even if I specified only use G.711a (alaw), somehow I’d still get that message. I asked about it on the Asterisk forum. I was seeing this pattern in my SIP traffic:

|Time     | ${PROVIDER}                           |
|         |                   | ${ASTERISK_BOX}   |                   
|0.000000 |         INVITE SDP (g711A g7          |SIP INVITE From: <sip:${MYPSTNNUM}@${PROVIDER}29;user=phone> To:"${PROVIDER_NAME}"<sip:${MYSIPNUM}@${PROVIDER_DOMAIN}> Call-ID:BW061858648050420-1965318496@${PROVIDER}   CSeq:612303213
|         |(5060)   ------------------>  (5060)   |
|0.003443 |         100 Trying|                   |SIP Status 100 Trying
|         |(5060)   <------------------  (5060)   |
|0.025416 |         180 Ringing                   |SIP Status 180 Ringing
|         |(5060)   <------------------  (5060)   |
|0.954015 |         200 OK SDP (g711A g7          |SIP Status 200 OK
|         |(5060)   <------------------  (5060)   |
|0.986287 |         RTP (g711A)                   |RTP, 6 packets. Duration: 0.099s SSRC: 0x4F53507
|         |(27642)  <------------------  (18024)  |
|1.092729 |         RTP (g729)                    |RTP, 3 packets. Duration: 0.041s SSRC: 0xCD74F85
|         |(27642)  ------------------>  (18024)  |
|1.097523 |         ACK       |                   |SIP Request INVITE ACK 200 CSeq:612303213
|         |(5060)   ------------------>  (5060)   |
|1.099552 |         BYE       |                   |SIP Request BYE CSeq:29850
|         |(5060)   <------------------  (5060)   |
|1.149521 |         200 OK    |                   |SIP Status 200 OK
|         |(5060)   ------------------>  (5060)   |

I was reliably informed that this was definitely against the rules. So another help-desk email informing them of the problem. If you’re an Internode NodePhone VoIP customer, and you see the above behaviour, these are your options:

  1. Purchase and Install Digium’s G.729 CODEC: only an option if you are running Asterisk on a i386 or AMD64-based Linux machine.
  2. If, like me, you’re running Asterisk on something else, or you despise proprietary software, there is an open-source G.729 CODEC. On OpenBSD, install the asterisk-g729 package.
  3. Asterisk have added a work-around in later versions of their code. Patches exist for version 17, 16 and 13. It is also included in 13.32.0, 16.9.0 and 17.3.0. OpenBSD 6.7 will likely ship with a version of Asterisk that includes this work-around.
  4. You can also complain to their help-desk. We can work-around their problem, but really, it’s their end that’s doing the wrong thing, they should fix it.

Dial Plans

The other big thing to consider is your dial-plan layout. Every SIP endpoint is assigned a context which is used in extensions.conf to determine what is meant when a particular sequence of digits is entered or how a call should be routed.

The pattern syntax is documented on their Pattern Matching page. Notably, extensions are “literal” if they do not begin with an underscore (_) and a X matches any numeric digit. So an example:

exten => 123456,1,DoSomething()
exten => 123987,1,DoSomethingDifferent()
exten => _123XXX,1,DoSomethingElse()

The non-pattern extensions take precedence. So the number 123456 would exactly match the first extension listed above and would call DoSomething(). The number 123987 would exactly match the second entry and would call DoSomethingDifferent(). Any other 6-digit number beginning with 123 would trigger DoSomethingElse().

Avoiding expensive mistakes

Before doing anything, it’s worth reading Asterisk’s page on Dial plan Security. You do NOT want someone to call your PBX, then dial outbound to expensive international numbers and run up a big phone bill! In particular, you want to keep the default context as lean as possible! It’s fine to put all your internal extensions in default, but anything that dials outside, should be in a separate context for internal extensions.

Dialling more than one phone at a time

It is possible to have a dial-plan entry ring multiple phones in parallel by separating the endpoints with & symbols, but don’t put spaces around the & characters! So Dial(PJSIP/ep1&PJSIP/ep2&PJSIP/ep3), not Dial(PJSIP/ep1 & PJSIP…). The most obvious use case for this is when someone rings the home number and you want all phones to ring.

Incoming calls in the dial plan

When a call comes in from outside, the number dialled by the outside party (i.e. your number) appears as the extension dialled.

A simple option is to just ring everyone… so if your phone number was, say 0735359696, you’d create a context in your extensions.conf like this:

; Incoming calls
exten => 0735359696,1,Dial(PJSIP/ep1&PJSIP/ep2&…)

… then in pjsip.conf, assign that context to the endpoint:

context = incoming
; … etc

Outgoing calls

Usually you want to be able to ring outbound too. Some guides suggested the pattern _X! can be used to “match all” but I couldn’t get this to work.

Fax over IP

I mentioned that we had a fax machine we wanted to continue using. Whilst we don’t use it every day, it’s nice to know it is there if we want to use it.

Likewise with the dial-up modem. Even if I needed to do reduced-speed for it to work, it’d be better than nothing for situations where I need to use dial-up. It’ll never connect to the Internet again, but that doesn’t make it useless.

There are two ways to do Fax over IP. One is to use G.711u and hope for the best. The other is to use T.38, where the ATA basically “spoofs” the fax modem at the remote end, decodes the symbols being sent back to digital data, encodes that in UDP packets and transmits those over the Internet. The other end then reverses the process.

The good news is there are diagnostic tools out there for testing purposes. You don’t (yet) have to know of someone who has a working fax. Here in Australia, there’s FOLDS-B.

I’d recommend if you’ve got multiple fax-capable devices though, you plug two of them into separate ports on one or more ATAs and try faxing from one extension to the other. I tried calling FOLDS-B directly at first with no luck, then tried setting up Asterisk with an extension that calls SendFax to send a TIFF image, but had no luck — handshaking would be cut short after a second.

Using two internal endpoints saved a lot of phone calls externally and allowed me to “prove” the ATA could work for this.

Once you’ve got things working locally, you’re ready to hit the outside world. You’ll need a fairly complex image to send as it needs to be transmitting for ~70 seconds. I tried a few pages (e.g. this one) on the flatbed scanner of the WF-7510 without much luck… the fax would barely muster 20 seconds of data for them even at “fine” levels.

I had better luck using the 56kbps modem and a copy of efax-gtk. I took RCA’s test card image off Wikipedia as a SVG, loaded that up into The Gimp, rendering the SVG at 600dpi, extending the image size to A4-paper sized, rotated it to portrait mode, then converted it to 1-bit-per-pixel monochrome with patterned dithering and saved it as an uncompressed TIFF.

I then passed this through tiff2ps which gave me this file:

Sending that at 14400bps took 72 seconds. Perfect! Then the reply came back. Fax reception is very much a work-in-progress, so rather than receive on the VoIP line, I decided to use the PSTN to receive the reports. This was accomplished by specifying my PSTN phone number in the fax identity (FOLDS-B evidently doesn’t check this against caller ID).

I tried again, and this time, received the report. FOLDS-B got a garbled mess apparently. The suggestion was to set the speed to 4800bps. In efax this is accomplished by setting the modem capabilities:

       The capabilities of the local hardware and software can be set using a string of 8 digits separated by commas:



       vr  (vertical resolution) =
                0 for 98 lines per inch
                1 for 196 lpi

       br  (bit rate) =
                0 for 2400 bps
                1 for 4800
                2 for 7200
                3 for 9600
                4 for 12000 (V.17)
                5 for 14400 (V.17)

So in this case, I should set the capabilities to 1,1,0,2,0,0,0,0. I tried this, but then found the call just didn’t negotiate either. On a whim again I tried 7200bps (that’s 1,2,0,2,0,0,0,0). Eureka! I got this:

A pretty much “perfect” FOLDS-B report at 7200bps.

The transmission level and SNR are pretty much spot-on. Emboldened by this, I tried again faxing the same image (which I had printed out) from the WF-7510. I chose “Photo” quality and scanned it on the flat-bed. This didn’t work so well — evidently some detail was missed and it only transmitted for 60 seconds so I tried again, and faxed something else for the second sheet.

FOLDS-B wasn’t happy about the second page, but at least it had something to go on. The fax modem in the WF-7510 doesn’t appear to have a speed control other than turning V.34 mode (33.6kbps) on/off. It appears it tried sending at 14400bps. FOLDS-B tells a sorry tale:

An ugly 14400bps transmission

Now it is possible to get near perfect results at 14400bps through an ATA that supports T.38. Maybe a longer transmission on the first page might have helped, but fair to say the speed is not helping matters.

Transmit level is very quiet, and I can’t see a way to adjust this on the WF-7510. Nor can I force it to V.29 (7200bps) mode.

There’s allegedly a “reference” test sheet you can get by “receive polling” 019725112. When I ring this number (with a fax or telephone) I get a recorded message saying the number is not valid. If anyone knows what the correct number is, or how to get a copy of this reference sheet, it’d be greatly appreciated.

So yes, Fax over IP is doable… a lot of pissing about with ATA settings… and you better hope the fax machine itself has lots of knobs and dials. Hylafax + t38modem will probably crack this nut. Then again, so will email.

Configuration Settings

Asterisk Configuration

So whilst my set-up is still a work-in-progress, I figured I’d post what I have here.

Most of these are defaults shipped with OpenBSD 6.6. Specifically of interest would be pjsip.conf and extensions.conf.

pf firewall rules

You’re going to want to pierce your firewall appropriately to expose yourself just enough for your VoIP service to work.

# Define some interfaces

# SIP addresses
sip_provider=  #
sip_endpoints="{, }"

# Allow SIP from router to SIP devices/softphones and vice versa
pass in on $internal proto udp from $sip_endpoints to self port { 5060, 10000:30000 }
pass out on $internal proto udp from self port { 5060, 10000:30000 } to $sip_endpoints

# Allow SIP from Internode to self and vice versa.  This could be
# tightened up a lot further.  Maybe try some calls both ways and log
# the traffic to see which specific ports.  "All UDP ports" is in line
# with Internode recommendations:
pass in on egress inet proto udp from $sip_provider to self
pass out on egress inet proto udp from self to $sip_provider

Grandstream HT814 settings

Some notable settings first before I dump full screenshots…

Ring Cadence Settings

A special thank-you to @scottsip on the Grandstream forums for pointing me to this document from the ITU (Page 4 has the settings for Australia). Also worth mentioning is this Whirlpool forum thread and this review/teardown of the Grandstream HT802. The ones marked (?) I have no idea what they’re for.

  • System Ring Cadence: c=400/200-400/2000;
  • Dial Tone: f1=400@-12,f2=425@-12;
  • Busy Tone: f1=425,c=38/38;
  • Reorder Tone: f1=480,f2=620,c=25/25; (?)
  • Confirmation Tone: f1=350@-11,f2=440@-11,c=100/100-100/100-100/100; (?)
  • Call Waiting Tone: f1=425,c=20/20-20/440;
  • Prompt Tone: f1=350@-17,f2=440@-17,c=0/0; (?)
  • Conference Party Hangup Tone: f1=425@-15,c=600/600;

Notable fax-related settings

I changed lots of these trying to get something working, so if you set this and still don’t get any joy, check the screenshots below.

  • Fax Mode: T.38
  • Re-INVITE After Fax Tone Detected: Enabled
  • Jitter Buffer Type: Fixed
  • Jitter Buffer Length: Low (note, I can get away with this because it’s Ethernet LAN from ATA to Asterisk box)
  • Gain:
    • TX: 0dB
    • RX: 0dB
  • Disable Line Echo Canceller: Yes
  • Disable Network Echo Suppressor: Yes


Grandstream GXP1615 Settings

These are much the same as the HT814 above. The cadence settings use a slightly different syntax (they don’t have the @nn parts) and there are more tone settings. Again, the ones marked with (?) have an unknown purpose.

  • System Ringtone: c=400/200-400/2000;
  • Dial Tone: f1=400,f2=425;
  • Second Dial Tone: f1=450,f2=425; (?)
  • Message Waiting: f1=525;
  • Ring Back Tone: f1=1209,f2=852,c=20/20-20/200;
  • Call-Waiting Tone: f1=425,c=20/20-20/440;
    • Gain: Low
  • Busy Tone: f1=425,c=38/38;
  • Reorder Tone: f1=480,f2=620,c=25/25; (?)
GXP1615 ringtone settings

The steps forward

Things I need to figure out with this system:

  • Feature Codes: these are used to signal events like “transferring calls”, “park”, and other features you might want to initiate in the PBX. They are normally signalled using DTMF tone sequences, however this can give problems if your tones clash with some IVR you’re interacting with. I’m currently researching this area.
  • OPUS support: I mentioned the HT814 supports it, and as it’s an open CODEC I’d like to support it. There is this project which provides OPUS support to Asterisk. I’ll probably look at writing an OpenBSD port for it.
  • Voice mail and IVR… I’m thinking something along the lines of “Dial the year of birth of the person you wish to reach”… then that can ring the relevant phones or offer to leave a message.
  • I’d like to test wideband audio at some point. Work seems to have G.722 on their phones (or maybe its OPUS?), I should see if wideband pass-through in fact works.