Feb 15, 2019

One problem I face with the cluster as it stands now is that 2.5″ HDDs are quite restrictive in terms of capacity options.

Right now the whole shebang runs on 1TB 5400RPM Hitachi laptop drives, which so far has been fine, but now that I’ve put my old server on as a VM, that’s chewed up a big chunk of space. I can survive a single drive crash, but not two.

I can buy 2TB HDDs; WD make some, and Scorptec sell them. Seagate make some bigger-capacity drives; however, I have a policy of not buying Seagate.

At work we built a Ceph cluster on 3TB SV35 HDDs… 6 of them to be exact. Within 9 months, the drives started failing one-by-one. At first it was just the odd drive being intermittent, then the problem got worse. They all got RMAed, all 6 of them. Since we obviously needed drives to store data on until the RMAed drives returned, we bought identically sized consumer 5400RPM Hitachi drives. Those same drives are running happily in the same cluster today, some 3 years later.

We also had one SV35 in a 3.5″ external enclosure that formed my workplace’s “disaster recovery” back-up drive. The idea being that if the place was in great peril and it was safe enough to do so, someone could just yank this drive from the rack and run. (Failing that, we also had truly off-site back-up NAS boxes.) That drive also failed before its time. It got replaced with one of the RMAed disks, which served until 3TB no longer sufficed.

Anyway, enough of that diversion; long story short, I don’t trust Seagate disks for 24/7 operation. I don’t see the other manufacturers (e.g. WD, Samsung, Hitachi) making >2TB HDDs in the 2.5″ form factor. They all seem to be going SSD.

I have a Samsung 850EVO 2TB in the laptop I’m writing this on, bought a couple of years ago now, and so far, it has been reliable. The cluster also uses 120GB 850EVOs as OS drives. There’s now a 4TB version as well.

The performance would be wonderful and they’d reduce the power consumption of the cluster; however, three 4TB SSDs would cost $2700. That’s a big investment!

The other option is to bolt on a 3.5″ HDD somehow. A DIN-rail mounted case would be ideal for this. High-capacity 3.5″ drives are much more common, use technology that has proven reliable, and are comparatively inexpensive.

In addition, going to bigger external drives means I can potentially swap out those 2.5″ HDDs for SSDs at a later date. A WD Purple (5400RPM) 4TB sells for $166. I have one of these in my desktop at work, and so far its performance there has been fine. For $3 more I can get one of the WD Red (7200RPM) 4TB drives, which are intended for NAS use. $265 buys a 6TB Toshiba 7200RPM HDD. In short, I have options.

Now, mounting the drives in the rack is a problem. I could just make a shelf to sit the drive enclosures on, or I could buy a second rack and move the servers into that which would free up room for a second DIN rail for the HDDs to mount to. It’d be neat to DIN-rail mount the enclosures beside each Ceph node, but right now, there’s no room to do that.

I’d also either need to modify or scratch-make a HDD enclosure that can be DIN-rail mounted.

There’s then the thorny issue of interfacing. There are two options at my disposal: eSATA and USB3. (Thunderbolt and Firewire aren’t supported on these systems and adding a PCIe card would be tricky.)

The Supermicro motherboards I’m using have 6 SATA ports. If you’re prepared to live with reduced cable lengths, you can use a passive SATA to eSATA adaptor bracket — and this works just fine for my use case since the drives will be quite close. I will have to power down a node and cut a hole in the case to mount the bracket, but this is doable.

I haven’t tried this out yet, but I should be able to use the same type of adaptor inside the enclosure to connect the eSATA cable to the HDD. The trade-off will be further reduced cable lengths, but again, the cables don’t need to go more than 30cm, so it’ll most likely work fine.

The other interface option is USB 3.0. The motherboards have two back-panel USB 3.0 connectors and inside, two USB 3.0 ports I can potentially expose. This can be hot-plugged without changing my cluster as it stands now. The down-side is that USB incurs a greater CPU overhead than SATA.

During my migration to BlueStore, I used exactly this to provide a “temporary” OSD disk… a 1TB 7200RPM WD Black in a HDD dock. The performance of that was fine, and in that case, I was willing to put up with the overhead as it was temporary.

External eSATA cases seem to be going the way of the dodo; I haven’t seen many available for sale from my usual suppliers. USB 3.0 seems to have taken over, probably because for most uses, it is “good enough”. I did ask on the Ceph mailing list whether one is preferred over the other for Ceph OSD use, but heard nothing.

As it was, prior to undertaking the migration, I bought such a case (an el’cheapo Simplecom SE-325) along with a 4TB WD Blue for the actual drive. I was tossing up between that and a LaCie “Porsche” 4TB drive, but the winning factor was that I’d know what I was buying: the LaCie drive could have had anything inside, as manufacturers can and sometimes do substitute components between manufacturing runs. Buying the case and drive separately didn’t run that risk.

The case and drive did the job. I hooked the drive up to my laptop (I had forgotten xhci_hcd support in the storage nodes’ kernels, which I have since fixed) and pulled a snapshot of every VM disk (Rados block device) off the Ceph cluster onto this drive as a raw disk image so I would not lose data. The drive easily kept up with the GbE link I had to the downstairs switch, and a core in the Core i5-3320M in this laptop is probably on par with the ones in the Avoton C2750s running the show.

To DIN-rail mount this, I’d need to make a cradle to take the case, and I’d need to hack some forced-ventilation into the top cover, which isn’t a difficult job. (Drill some holes, then use a nibbler tool to cut slots, then mount a small fan.)

The original PSU for this case is a 12V 2A wall wart, easily substituted with a 12V 3A LDO such as the LM1085IT-12. I may even be able to squeeze it and a heatsink into the case. I presently use one of these with the border router with a small heatsink, and so far, no problems.

If I later want eSATA, I can unscrew the original PCB and should be able to hack that in.

Short term, I can place a temporary shelf atop the battery cases and sit the HDDs there until I figure out more permanent arrangements.

Right now I’ve been battling a few health problems (sharp-eyed readers may recognise the box of “gunk” in the background which is now empty and the accompanying documentation — I’ll know more next Friday morning), and so I’ll wait until I know the outcome of those tests as there’s no point in building something grand if I’m not going to be around to enjoy it.

Jan 28, 2019

My cloud computing cluster, like all cloud computing clusters, of course needs a storage back-end. There were a number of options I could have chosen, but the one I went with in the end was Ceph, and so far, it’s run pretty well.

Lately though, I was starting to get some odd crashes out of ceph-osd. I was running release 10.2.3, one of the earlier Jewel releases, which is quite dated now. Adding to the fun, I’m running btrfs as my filesystem on both the OS and the OSD, and I’m running it all on Gentoo. On top of this, my monitor nodes are my OSDs as well.

Not exactly a “supported” configuration, never mind the hacks done at hardware level.

There was also a nagging issue about too many placement groups in the Ceph cluster. When I first established the cluster, I christened it by dragging a few of my lxc containers off the old server and making them VMs in the cluster. This was done using libvirt and virt-manager. These got thrown into a storage pool called transitional-inst, with a VLAN set aside for the VMs to use. When I threw OpenNebula on, I created another Ceph pool for its images. The configuration of these led to the “too many placement groups” warning, which until now, I just ignored.

This weekend was a long weekend, for controversial reasons… and so I thought I’d take a snapshot of all my VMs, download those snapshots to a HDD as raw images, then see if I could fix these issues and migrate to Ceph Luminous (v12.2.10) at the same time.

Backing up

I was going to be doing some nasty things to the cluster, so I thought the first thing to do was to back up all images. This was done by using rbd snap create pool/image@date to create a snapshot of an image, then rbd export pool/image@date /path/to/storage/pool-image.img before blowing away the snapshot with rbd snap rm pool/image@date.

This was done for all images on the Ceph cluster, stashing them on a 4TB hard drive I had bought for the purpose.
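
In script form, the loop boiled down to something like this (a sketch; the pool and image names and the destination path are illustrative):

import subprocess
from datetime import date

def backup_pool(pool, images, dest='/mnt/backup'):
    """Snapshot, export, then remove the snapshot for each image."""
    snap = date.today().isoformat()
    for image in images:
        spec = '%s/%s@%s' % (pool, image, snap)
        subprocess.check_call(['rbd', 'snap', 'create', spec])
        subprocess.check_call(['rbd', 'export', spec,
                               '%s/%s-%s.img' % (dest, pool, image)])
        subprocess.check_call(['rbd', 'snap', 'rm', spec])

backup_pool('one', ['one-2', 'one-3'])  # hypothetical pool/image names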

Getting things ready

My cluster is actually set up as a distcc cluster, with Apache HTTP server instances sharing out distfiles and binary package repositories, so if I build packages on one node, the others can fetch the binary packages it built. I started with one node and got it to update all packages except Ceph, making sure everything was up-to-date.

Then, I ran emerge -B =ceph-10.2.10-r2. This was the first step in my migration, I’d move to the absolute latest Jewel release available in Gentoo. Once it built, I told all three storage nodes to install it (emerge -g =ceph-10.2.10-r2). This was followed up by a re-start of the mon daemons on each node (one at a time), then the mds daemons, finally the osd daemons.

Resolving the “too many placement groups” warning

To resolve this, I first researched the problem. An Internet search led me to this Stack Overflow post, in which it was suggested the problem could be alleviated by making a new pool with the correct settings, then copying the images over to it and blowing away the old one.

As it happens, I had an easier solution… move the “transitional” images to OpenNebula. I created empty data blocks in OpenNebula for the three images, then used qemu-img convert -p /path/to/image.img rbd:pool/image to upload the images.

It was then a case of creating a virtual machine template to boot them. I put them in a VLAN with the other servers, and when each one booted, edited the configuration with the new TCP/IP settings.

Once all those were moved across, I blew away the old VMs and the old pool. The warning disappeared, and I was left with a HEALTH_OK message out of Ceph.

The Luminous moment

At this point I was ready to try migrating. I had a good read of the instructions beforehand. They seemed simple enough. I prepared as I did before by updating everything on the system except Ceph, then, telling Portage to build a binary package of Ceph itself.

Then I deployed the binary to the three nodes.

First step was to re-start the monitors… this went smoothly, I just did a /etc/init.d/ceph-mon.${HOST} restart on each one individually, and after a brief moment, quorum was re-established. I then deployed a manager daemon to each one — basically I just “copied” my monitor symbolic link, changing mon to mgr, added it to OpenRC’s list, then started them. No problems.

The OSDs though were still running the Jewel release.

I proceeded as before, trying a re-start of the first OSD. After a while it hadn’t come back…

2019-01-27 14:42:59.745860 7f28fac06e00 -1 filestore(/var/lib/ceph/osd/ceph-0) _detect_fs(1197): deprecated btrfs support is not enabled

Ohh bugger, so no btrfs support. This is where the fun began. At this point I was a bit flustered and thought I’d have to either migrate these nodes to XFS, or to BlueStore. So immediately I started looking at the BlueStore migration documentation, as I did not want to risk re-starting the other two OSDs and losing access to my data!

A hasty BlueStore migration

So, I started this by running ceph osd out 0 to put my now-downed OSD 0 on the path of migration. The fact it was already down didn’t click with me. I then tried running ceph osd safe-to-destroy 0, only to be told Error EINVAL: (22) Invalid argument.

Uhh ohh, this isn’t good. I waited a bit, but also part of me said: there should be a copy of everything on this node, on at least one of the other two nodes. I had configured it to maintain at least two copies of everything, so even if this node went up in smoke, the data should be recoverable.

With great trepidation, I continued and tried destroying the OSD, then creating a BlueStore one in its place… only to have the ceph-volume command blow up. It couldn’t find the keyring; then, when I got that sorted out, it was failing to talk to systemd; then, when I found the --no-systemd argument, it still failed because of LVM. I therefore realised I needed two things:

  1. I needed the bootstrap-osd keyring that ceph-deploy normally creates.
  2. The lvmetad daemon must be running.

For (1), this is taken care of with the following commands:

# ceph auth add client.bootstrap-osd --cap mon 'profile bootstrap-osd'
# mkdir /var/lib/ceph/bootstrap-osd
# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring

As for (2), install sys-fs/lvm and add lvmetad to your start-up services. Also add lvm, as you’ll want that at boot. (I learned this later.)

After doing that, the following command worked:

ceph-volume lvm create --bluestore --data /dev/sdb \
--osd-id 0 --no-systemd

The --no-systemd is important on Gentoo with OpenRC as there is no systemctl binary. Once I did that, I found I could start my OSD again. Data recovery began at once. The data recovery was an overnight effort — with my hardware, it took until 3PM today to migrate all the placement groups over to the newly re-formatted OSD.

Migrating the other nodes

For now, they still run btrfs. In my “ohh crap” state, I didn’t see the little hint given:

2019-01-27 14:40:55.147888 7f8feb7a2e00 -1 *** experimental feature 'btrfs' is not enabled ***
This feature is marked as experimental, which means it
 - is untested
 - is unsupported
 - may corrupt your data
 - may break your cluster in an unrecoverable fashion
To enable this feature, add this to your ceph.conf:
  enable experimental unrecoverable data corrupting features = btrfs

2019-01-27 14:40:55.147901 7f8feb7a2e00 -1 filestore(/var/lib/ceph/osd/ceph-0) _detect_fs(1197): deprecated btrfs support is not enabled
2019-01-27 14:40:55.147906 7f8feb7a2e00 -1 filestore(/var/lib/ceph/osd/ceph-0) mount(1523): error in _detect_fs: (1) Operation not permitted
2019-01-27 14:40:55.147926 7f8feb7a2e00 -1 osd.0 0 OSD:init: unable to mount object store

Not feeling like a 24-hour wait, I did as it told me:

osd pool default size = 2  # Write an object n times.
osd pool default min size = 1 # Allow writing n copy in a degraded state.
osd pool default pg num = 128
osd pool default pgp num = 128
osd crush chooseleaf type = 1
osd max backfills = 10

# Allow btrfs to work:
enable experimental unrecoverable data corrupting features = btrfs

Now, my other OSDs re-started successfully, and I could finally finish off by restarting the metadata daemons and completing the migration. I’m now left with two OSDs with BTRFS and one with BlueStore.

For now, I’ll leave it that way; next weekend, I might migrate a second node to BlueStore.

The reboot test

I needed to ensure the nodes would come back without my intervention. So starting with the two BTRFS nodes, I rebooted each one individually. The OSD on that node first went offline, then the monitor, finally the cluster noticed the metadata and manager services had gone. Then, upon successful boot, the services returned.

So far so good. Now the BlueStore node.

First reboot, my OSD didn’t come back. On investigation, I saw the following logs:

2019-01-28 16:25:59.312369 7fd58d4f0e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-01-28 16:26:14.865883 7fe92f942e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-01-28 16:26:30.419863 7fd4fa026e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory

/var/lib/ceph/osd/ceph-0 was completely empty! Bugger, do I have to endure those 24 hours again? As it happened, no. I don’t know how the files in that directory disappeared. I did observe a tmpfs pseudo-volume mounted at that directory earlier when trying to create the OSD… maybe that didn’t get unmounted before OSD creation. Either way, the files were gone.

A bit of digging revealed a ceph-bluestore-tool utility, with options like repair. At first I tried to wing it using that, but no dice. Then looking at the man page I noticed the sub-command prime-osd-dir. BINGO.

At first I threw the raw device at it, but as it happens, ceph-volume had deployed LVM to the raw disk, then put BlueStore on top of that. Starting lvm got the volume group recognised, so I added that to my boot-up services (see why I mentioned it earlier). ceph-volume had created a sym-link to the LVM volume at /dev/ceph-${UUID1}/osd-block-${UUID2}.

No idea where the two UUIDs came from, but I tried this:

# ceph-bluestore-tool prime-osd-dir \
    --dev /dev/ceph-d62d0d95-2e13-4c59-834d-03a87b88c85e/osd-block-62b4be3e-3935-4d51-ab5c-dde077f99ea3 \
    --path /var/lib/ceph/osd/ceph-0

That populated the directory with files, so I tried again starting the OSD.

2019-01-28 16:59:23.680039 7fd93fcbee00 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
2019-01-28 16:59:23.680082 7fd93fcbee00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-01-28 16:59:39.229888 7f4a585b4e00 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
2019-01-28 16:59:39.229918 7f4a585b4e00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory

Ah ha, chown -R ceph:ceph /var/lib/ceph/osd/ceph-0, and all sprang to life. The OSD came up.

Testing the fixes, a second re-boot

Since the OSD now was starting, and working, I did a second re-boot test, only to have history partially repeat itself.

The files were still there this time, but it was failing with a permissions error opening the block device. Sure enough, it was now owned by root.

Changed the permissions, and the OSD came up.

Fixing this was a job for udev:

cat /etc/udev/rules.d/99ceph.rules
SUBSYSTEM=="block", KERNEL=="sda7", OWNER="ceph", GROUP="ceph", MODE="0600"
SUBSYSTEM=="block", ENV{DM_VG_NAME}=="ceph-*", OWNER="ceph", GROUP="ceph", MODE="0600"

The first line is left-over from when /dev/sda7 was my journal. Not sure what I’ll do with this partition now; I’ll think of something (maybe Docker). The second line tells udev to change the permissions on the volume group that Ceph created.

Having done this, I rebooted again. This time, all worked. The OSD came up without my intervention.

Recap

So, here are the pitfalls I ran across in my Jewel-to-Luminous migration on Gentoo.

btrfs OSDs

My OSDs lived on btrfs volumes, a configuration which is now frowned upon and considered experimental. It isn’t necessary to migrate to BlueStore or XFS straight away, but for the OSDs to start, you will need the following line in your /etc/ceph/ceph.conf before restarting them:

enable experimental unrecoverable data corrupting features = btrfs

ceph-volume expects the bootstrap-osd key

For some reason, ceph-volume expects to see the bootstrap-osd key in a hard-coded location; it won’t work with the default admin key.

This bootstrap key can be generated as follows:

# ceph auth add client.bootstrap-osd --cap mon 'profile bootstrap-osd'
# mkdir /var/lib/ceph/bootstrap-osd
# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring

Before creating a BlueStore OSD, make sure lvmetad and lvm are started (and set to start at boot)

You can get away with just lvmetad for the initial creation, but you’ll want lvm running at boot anyway, to ensure all the logical volume groups are activated before Ceph goes looking for them.

So before attempting OSD creation, ensure LVM is installed, and set to start at boot.

ceph-osd runs as the ceph user

So your udev rules need to reflect that. Luckily, ceph-volume seems to prefer creating LVM volume groups named ceph-${UUID}. I don’t know what decides the UUID value, but thankfully udev supports globbing. The following udev rule (put it in /etc/udev/rules.d/99ceph.rules or wherever seems appropriate) will keep permissions in check:

SUBSYSTEM=="block", ENV{DM_VG_NAME}=="ceph-*", OWNER="ceph", GROUP="ceph", MODE="0600"

(The above should be all on one line.)

Before rebooting a BlueStore node, back up your OSD data directories

It shouldn’t be strictly necessary, but now that I’ve been bitten, I’m going to take extra care of that data directory on my other two nodes when I migrate them. I don’t fancy playing around with ceph-bluestore-tool frantically trying to get an OSD back up again.

Jan 19, 2019

Recently, I’ve been looking at the problem of how to retrieve IPv6 traffic from the network stack of my workstation and manipulate it for transmission over AX.25.

My last experiments focussed on the TUN/TAP interface in Linux. Using this interface, I could create a virtual network interface that piped its traffic to a file descriptor in a program written in C.

One advantage of using the C language for this is that, as binding to the TAP interface requires root privileges, the binary could be installed setuid root. Thus, any time it started, it would be running as root. From there, it could do what it needed, then drop privileges back to a regular user.

The program would just run as a child process… when traffic was received from the kernel, it would spit that out to stdout. If my parent application had something to send, it would feed that into stdin.

6lhagent is an implementation of that idea. It’s pretty rough, but it seems to work. It uses a simple protocol to frame the Ethernet packets so that it can maintain synchronisation with the parent process. All frames are ACKed or NAKed, depending on whether they were understood or not. The protocol is analogous to KISS or SLIP in concept. The framing is very different to these protocols, but the concept is that of frames delimited by a byte sequence, with occurrences of the special byte sequences replaced with place-holders to prevent the parser getting confused.
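
To illustrate the escaping concept (this is essentially SLIP’s scheme; the actual 6lhagent delimiter and place-holder sequences differ, and the ACK/NAK exchange sits on top of this):

FRAME_END = b'\xc0'            # frame delimiter (SLIP's values, for illustration)
ESCAPE    = b'\xdb'
ESC_END   = ESCAPE + b'\xdc'   # place-holder for FRAME_END inside a payload
ESC_ESC   = ESCAPE + b'\xdd'   # place-holder for ESCAPE inside a payload

def frame(payload: bytes) -> bytes:
    # Escape the special bytes, then append the delimiter.
    return payload.replace(ESCAPE, ESC_ESC).replace(FRAME_END, ESC_END) + FRAME_END

def unframe(body: bytes) -> bytes:
    # Reverse the substitutions; un-escape the delimiter first.
    return body.replace(ESC_END, FRAME_END).replace(ESC_ESC, ESCAPE)

assert unframe(frame(b'\xc0\xdb payload')[:-1]) == b'\xc0\xdb payload'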

I then wrote this Python script which uses the asyncio IO loop to run 6lhagent and dump the packets it receives:

$ python3 demo/dumper.py 
Interface data: b'V\xc7\x05\\yA\x05\x00\x00\x00\x00\xca\x04tap0'
Interface: MAC=[86, 199, 5, 92, 121, 65] MTU=1280 IDX=202 NAME=tap0
Ethernet traffic: b'33330000001656c7055c794186dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00f5ec0000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\xf5\xec\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'33330000001656c7055c794186dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00f5ec0000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\xf5\xec\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'3333ff5c794156c7055c794186dd6000000000203aff00000000000000000000000000000000ff0200000000000000000001ff5c79418700bebb00000000fe8000000000000054c705fffe5c79410e01a02d5c9a6698'
From: 33:33:ff:5c:79:41
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::1:ff5c:7941
Length: 32, Next header: 58, Hop Limit: 255
ICMP Type 135, Code 0, Checksum bebb
Data: b'\x00\x00\x00\x00\xfe\x80\x00\x00'
Payload: b'\x00\x00\x00\x00T\xc7\x05\xff\xfe\\yA\x0e\x01\xa0-\\\x9af\x98'
Ethernet traffic: b'33330000001656c7055c794186dd6000000000240001fe8000000000000054c705fffe5c7941ff0200000000000000000000000000163a000502000001008f0025070000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: fe80::54c7:5ff:fe5c:7941
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00%\x07\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'33330000001656c7055c794186dd6000000000240001fe8000000000000054c705fffe5c7941ff0200000000000000000000000000163a000502000001008f009cab0000000104000000ff0200000000000000000000000000fb'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: fe80::54c7:5ff:fe5c:7941
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\x9c\xab\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfb'

The thinking is that the bulk of the proof-of-concept will be done in Python. My reasoning for this is that it’s usually easier to prototype in a higher-level language than in C, and in this application, speed is not important. At best our network interface will be running at 9600 baud — Python will keep up just fine. Most of it will be at 1200 baud.

The Python code will do some packet filtering (e.g. filtering out the multicast NS messages, which are a no-no in RFC-6775) and add options where required. It’ll also be responsible for rate-limiting the firehose-like output of the host’s tap interface so the AX.25 network doesn’t get flooded.
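
Something as simple as a token bucket would probably suffice for the rate-limiting; a rough sketch (the numbers are illustrative: 1200 baud is at most about 150 bytes/sec before framing overheads):

import time

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.capacity = rate, burst     # bytes/sec, bytes
        self.tokens, self.last = burst, time.monotonic()

    def try_send(self, datagram: bytes) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if len(datagram) <= self.tokens:
            self.tokens -= len(datagram)
            return True      # hand the datagram to the AX.25 stack
        return False         # channel saturated: queue or drop it

bucket = TokenBucket(rate=150, burst=512)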

The proof of concept is coming together. Next steps are to implement an IPv6 stack of sorts in Python to dissect the datagrams.

Jan 12, 2019

For 6LoWHAM, we could just use the link-local address space to communicate directly between stations and leave it at that.

If I want to send a message to VK4BWI-5 from my station VK4MSL-9, I could just fire off a packet to fe80::6894:49ff:feae:7318 directed to my 6LoWHAM interface and be done with it. This then requires one of two things:

  1. that VK4BWI-5 can directly communicate with me
  2. that the intermediate stations know to forward my message on to that station

(1) is easy enough. (2) raises the question of “what is local”?

Supposing this protocol took off, and the WIA decided to earmark special frequencies on a few bands for 6LoWHAM, with a fairly complete network stretching up the eastern seaboard of Australia: if my station sends a router solicitation from my home QTH in Brisbane, does someone in Melbourne really care to hear it? I’d wager this is a recipe for a very clogged packet network!

In Thread, the “link local” scope only gets you as far as the nodes that can directly hear you. It does mean that protocols like mDNS, which rely on the “link-local” multicast scope, aren’t going to reach all nodes, but it also means that far-flung nodes don’t need to listen to all the low-level chatter. For communications between nodes, an “on-mesh” prefix is used, and for mesh-wide multicast, a “realm-local” prefix of ff03::/16 is defined.

In truth, it’s highly unlikely that we’d have “one” single network. More likely it’ll be a mesh of interconnected networks with trunk links going via some other band (or perhaps VPNs over the Internet). For that to work, we can’t rely on just link-local networking, we actually need a routable network address for the mesh.

The Thread “mesh local” prefix is actually defined by the network’s extended IEEE-802.15.4 PAN ID, which is a 64-bit number that you define when setting up the network. Thread simply takes the most significant 40 bits of this, slaps fd in front and pads it out with zeros to 64-bits. The PAN ID 0x0123456789abcdef forms the subnet fd01:2345:6789::/64. This can be seen in the OpenThread sources.
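
The derivation is simple enough to express in a few lines (a sketch matching the example above):

import ipaddress

def thread_mesh_local(xpanid: int) -> ipaddress.IPv6Network:
    top40 = xpanid >> 24                    # most significant 40 bits of the PAN ID
    prefix = (0xfd << 56) | (top40 << 16)   # fd + 40 bits + 16 zero bits
    return ipaddress.IPv6Network((prefix << 64, 64))

print(thread_mesh_local(0x0123456789abcdef))   # fd01:2345:6789::/64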

This wastes the 16 bits of address space normally reserved for the ULA subnet ID and throws away 24 bits of the PAN ID. For our network, we don’t need 16 bits’ worth of subnets, we just need one. We also don’t have a PAN ID in AX.25.

The thinking is, we’ll use a “group” address. This will be a regular AX.25 SSID, which will translate to a MAC which has the group bit set. (Exactly how I’ll differentiate between a station SSID and a group SSID I’m not sure. Probably will look at the destination IP, if it’s multicast then the group bit gets set.)

Supposing we were to use this for the International Rally of Queensland (an event which is now defunct), we might create a 6LoWHAM network with a group address of “IROQ19”. The MAC address used for group-wide communications would be 03:01:cd:e5:a9:f8.

We can derive a prefix from this MAC address. A ULA normally consists of a 7-bit ULA prefix, a 1-bit “global/local” bit, a 40-bit global ID, and a 16-bit subnet ID.

The ULA prefix is fc::/7. The global/local bit is always set to 1 (local) because no one has come up with a way for ULAs to be globally administered. 40 bits is a bit tight; we could truncate our MAC to 40 bits and ignore the subnet ID like Thread does, which gives us a subnet of fd03:1cd:5ea9::/64.

The last 3 bits of the SSID though, are like a subnet ID. So if we move those 3 bits to set the last 3 bits of the prefix, we can make some use of that subnet ID, but still waste 13 bits with zeros.

Alternatively, we can consider the global ID and subnet ID to be one 56-bit field, effectively shrinking the subnet ID to 3 bits. That gives us a 53-bit global ID, which now fits the remaining 45 bits of our MAC and leaves us with 8 bits over.

We can discard the lowest two bits in the first byte of the MAC, as those (the group and local bits) will be the same for all groups; that gives us another two bits. 10 bits isn’t a lot, but it’s enough to encode “AR” (amateur radio) in ITA-2, thus giving us a recognisable prefix for all 6LoWHAM networks. We wind up with the following:

┌─ULA─┐L┌──"AR"──┐┌───────────── Network Address ──────────────┐
1111110100010010100000000000000111001101111001011010100111111000
└──┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┤
   f   d   1   2 : 8   0   0   1 : c   d   e   5 : a   9   f   8 /64
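
In code, the derivation might look like this (a sketch; the “AR” bit pattern 0b0001001010 is read straight off the diagram above):

import ipaddress

def mesh_prefix(group_mac: bytes) -> ipaddress.IPv6Network:
    payload = (group_mac[0] >> 2) << 40              # drop the group/local bits
    payload |= int.from_bytes(group_mac[1:], 'big')  # 46 bits in total
    prefix = (0xfd << 56) | (0b0001001010 << 46) | payload
    return ipaddress.IPv6Network((prefix << 64, 64))

print(mesh_prefix(bytes.fromhex('0301cde5a9f8')))  # fd12:8001:cde5:a9f8::/64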

This actually has me thinking about whether the call-sign part of the SSID should be right-padded out to make the network address consistent. Maybe my SSID-to-MAC algorithm could do with a tweak there, as it may make routing easier by putting all those zeros to the right.

In Thread, the mesh-local prefix isn’t route-able beyond the mesh, there’s a separate prefix handed out by border routers for that. In our case, I don’t think there’s any point in complicating matters by having more than one route-able prefix for a mesh. If a station participates in two networks that share a frequency, then sure, that node may have an address on each network, but each network should share a common identity.

Thus in the contrived example of having a large network along the coastline: it’d be an “inter-network” of smaller meshes, linked together via router nodes which know how to hop between them. Those routes may be via point-to-point microwave links, HF, Internet tunnels, etc.

The subnets used for these other networks may be assigned a “context identifier” which is 4-bits. I’ll have to figure out if there’s a sane way to do that on a given network. Most 802.15.4 networks have a “PAN co-ordinator” which could be looking after that. Thread networks elect a “leader” node.

Given the small number of identifiers, and the low probability of this being used, this should be manually administered. Even without a context ID being assigned, one can still route between the subnets, just that the full IPv6 address needs to be given for the foreign node, so you incur a 16-byte penalty doing so. Thus the context IDs will probably be handed out for “popular routes”, with the mesh prefix being “context 0”.

I haven’t yet given thought to how this “context” would be disseminated over the mesh or kept updated. That is a can of worms for another day.

Jan 12, 2019

One of the aims of 6LoWHAM was to provide a means to send IPv6 traffic between user applications and the AX.25 network.

In order to do this, the applications have to have some way of injecting their IP traffic. The canonical way this is done is through the operating system’s TCP/IP stack. This requires that we have an interface to the operating system kernel in order to receive that IP traffic destined for the airwaves.

Now, we could write a kernel driver for this, but it’s going the long way around to do it. Especially as we intend to interface to software that runs in userspace for the actual transmission. Our driver at best would be just taking the raw Ethernet frame, extracting the IP part, and forwarding that back to our program running in userspace.

There’s a driver that does that for us: TUN/TAP. This driver can either create a TUNnel device, which forwards IP datagrams, or a TAP device, which forwards Ethernet frames. We’ll focus on the TUN mode of this driver here.

The idea is this will create an IP tunnel, with one side exposing a network device to the kernel, and the other side being a file descriptor in a userspace application that just reads and writes raw IP frames. How it generates and processes those frames is entirely up to the software author. The most famous uses for this device are VPNs: take the IP datagram, encrypt it, then encapsulate it in another datagram (usually UDP) to be sent over the Internet to some other peer, which reverses the process and writes the original packet to its own tunnel file descriptor.
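
For the curious, the same mechanism can be exercised from Python in a handful of lines (a sketch; run it as root, then bring the interface up in another terminal; constants are from <linux/if_tun.h>):

import fcntl, os, struct

TUNSETIFF = 0x400454ca   # ioctl to attach this fd to a new interface
IFF_TUN   = 0x0001       # TUN mode: raw IP datagrams

tun = os.open('/dev/net/tun', os.O_RDWR)
ifr = struct.pack('16sH', b'tun%d', IFF_TUN)
name = fcntl.ioctl(tun, TUNSETIFF, ifr)[:16].rstrip(b'\x00').decode()
print('created', name)

while True:
    # Each read returns one packet: 4 bytes of flags/protocol, then the datagram.
    pkt = os.read(tun, 1504)
    flags, proto = struct.unpack('>HH', pkt[:4])
    print('Flags: 0x%04x  Protocol: 0x%04x  %d bytes' % (flags, proto, len(pkt) - 4))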

In our case, we’ll be dissecting it a bit to extract the key fields, then applying our own “compression” defined in the 6LoWHAM specs, then forwarding it on to our AX.25 stack (probably LinBPQ or Direwolf) to be sent as an AX.25 UI frame.

The first step in this journey was actually figuring out what the packets look like on a tunnel device. I created this little program to explore the idea.

It just needs the usual C toolchain and libraries on a Linux system. I tested with Gentoo and Linux kernel 4.15. Building it is a simple make command. If you then run the resulting binary as root, you’ll find a tun0 device (or maybe some other number) created.

Bring the interface up, and you should start to see some traffic as the host tries to talk to its new (and very much mute) peer:

RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ make 
cc    -c -o linuxtun.o linuxtun.c
cc    -c -o main.o main.c
cc -o packetdumper linuxtun.o main.o
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ./packetdumper 
Password: 
^Z
[1]+  Stopped(SIGTSTP)        sudo ./packetdumper
RC=148 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ip link set dev tun0 up
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ fg
sudo ./packetdumper
Flags: 0x0000  Protocol: 0x86dd
  48:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 60 00 00 00 00 08 3a ff fe 80 00 00 00 00 00 00
  16: 5e be 89 41 7b 19 d5 60 ff 02 00 00 00 00 00 00
  32: 00 00 00 00 00 00 00 02 85 00 44 bd 00 00 00 00
Flags: 0x0000  Protocol: 0x86dd
  48:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 60 00 00 00 00 08 3a ff fe 80 00 00 00 00 00 00
  16: 5e be 89 41 7b 19 d5 60 ff 02 00 00 00 00 00 00
  32: 00 00 00 00 00 00 00 02 85 00 44 bd 00 00 00 00

I didn’t bother to decode the IP datagram further, but if you look at the Wikipedia IPv6 Packet article, it isn’t difficult to see what’s going on. In this case, we can see it’s an IPv6 packet both from the Protocol field (0x86dd is the Ethertype for IPv6), and from the first 4 bits of the frame payload.

The traffic class and flow label are both 0s here. The IPv6 payload length is just 8 bytes, so most of this is in fact IPv6 header data. Next header is type 0x3a (IPv6 ICMP) and the hop limit is 255. This is followed by the source address (my laptop’s link-local address fe80::5ebe:8941:7b19:d560) and the destination address (all link-local routers multicast address ff02::2).

The ICMPv6 message is the last 8 bytes; in this case, its type is 0x85 (router solicitation), the code is 0x00, the two bytes after that are the checksum, and the message (4 bytes) is all zeros.
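
If you want to check the dissection, the Python standard library is enough (a quick sketch; the hex below is the 48-byte payload from the first dump above):

import ipaddress, struct

raw = bytes.fromhex(
    '6000000000083aff'                   # version/class/flow, length, next header, hops
    'fe800000000000005ebe89417b19d560'   # source address
    'ff020000000000000000000000000002'   # destination address
    '850044bd00000000')                  # ICMPv6 router solicitation

vcf, length, nxt, hops = struct.unpack('>IHBB', raw[:8])
print('version', vcf >> 28)                       # 6
print('payload length', length)                   # 8
print('next header', hex(nxt))                    # 0x3a (ICMPv6)
print('hop limit', hops)                          # 255
print('from', ipaddress.IPv6Address(raw[8:24]))   # fe80::5ebe:8941:7b19:d560
print('to', ipaddress.IPv6Address(raw[24:40]))    # ff02::2 (all routers)
print('ICMPv6 type', hex(raw[40]))                # 0x85 (router solicitation)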

Quite how that address was chosen is something I’ll have to get to grips with. Yes, it’s SLAAC, but where did it get the hardware address from? That I’ll have to figure out.

The alternative is to use a TAP interface, which means I choose the MAC address, and thus can control what the SLAAC-derived address becomes. Ohh, and it goes without saying that the privacy extensions will be a big no-no on the air: we’re relying on the fact that we can derive the IPv6 address from the SSID of the station, both for technical reasons and to legally meet the requirements for stations to “identify” who they are and whom they are talking to. SLAAC privacy would make a mess of that.

So controlling this link-local address is a must. I guess next stop: let’s look at a tap device. I’ve just made some changes to explore the differences from the application end. There isn’t a lot of difference here.

RC=130 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ./packetdumper -tap
Password: 
^Z
[1]+  Stopped(SIGTSTP)        sudo ./packetdumper -tap
RC=148 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ip link set tap0 up
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ fg
sudo ./packetdumper -tap
Flags: 0x0000  Protocol: 0x86dd
  90:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 33 33 00 00 00 16 ce 65 0c 34 48 34 86 dd 60 00
  16: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00
  32: 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 00 00
  48: 00 00 00 00 00 16 3a 00 05 02 00 00 01 00 8f 00
  64: 27 22 00 00 00 01 04 00 00 00 ff 02 00 00 00 00
  80: 00 00 00 00 00 01 ff 34 48 34
Flags: 0x0000  Protocol: 0x86dd
  86:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 33 33 ff 34 48 34 ce 65 0c 34 48 34 86 dd 60 00
  16: 00 00 00 20 3a ff 00 00 00 00 00 00 00 00 00 00
  32: 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 00 00
  48: 00 01 ff 34 48 34 87 00 af 03 00 00 00 00 fe 80
  64: 00 00 00 00 00 00 cc 65 0c ff fe 34 48 34 0e 01
  80: 61 78 48 c1 ac aa

The big difference is now we have an Ethernet header prepended. The proto field in the packet information now duplicates what we can see in the Ethernet frame header (bytes 12 and 13), and the IPv6 packet starts from byte 14.

I think this is the mode 6LoWHAM will use. It’s possible to set the MAC address on the created tap0 device to whatever 46 bits we like. The remaining two bits have fixed meanings: one defines whether the address is globally or locally administered (we’ll set ours to “local”), and the other defines whether this is a multicast or unicast address. The SLAAC address will closely match this MAC with two differences:

  1. The MAC will have the bytes 0xff 0xfe inserted into the middle.
  2. The “global/local” bit is inverted. So for the 2001:db8::/64 prefix:
    • aa:bb:cc:dd:ee:ff becomes 2001:db8::a8bb:ccff:fedd:eeff
    • a8:bb:cc:dd:ee:ff becomes 2001:db8::aabb:ccff:fedd:eeff

That latter point had me confused at first: I thought it might’ve been that a bit got cleared, but instead it’s just inverted, so it’s completely reversible.
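
Both examples above can be reproduced in a few lines (a sketch of the standard EUI-64 SLAAC derivation):

import ipaddress

def slaac_address(prefix: ipaddress.IPv6Network, mac: bytes) -> ipaddress.IPv6Address:
    b = bytearray(mac)
    b[0] ^= 0x02                                     # invert the global/local bit
    iid = bytes(b[:3]) + b'\xff\xfe' + bytes(b[3:])  # insert ff:fe in the middle
    return prefix[int.from_bytes(iid, 'big')]

net = ipaddress.IPv6Network('2001:db8::/64')
print(slaac_address(net, bytes.fromhex('aabbccddeeff')))  # 2001:db8::a8bb:ccff:fedd:eeff
print(slaac_address(net, bytes.fromhex('a8bbccddeeff')))  # 2001:db8::aabb:ccff:fedd:eeff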

Dec 14, 2018

So recently, I had a melt-down with some of the monitor wiring on the cluster… to counteract that, I have some parts on order (RS Components annoyingly seem to have changed their shipping policies, so I suspect I’ll get them Monday)… namely some thermocouple extension cable, some small 250mA fast-blow fuses and suitable in-line holders.

In the meantime, I’m doing without the power controller, just turning the voltage down on the mains charger so the solar controller does most of the charging.

This isn’t terribly reliable… and for a few days now my battery voltage has just sat at a flat 12.9V, which is the “boost” voltage set on the mains charger.

Last night we had a little rain, and today I see this:

Battery voltage today… the solar charger is doing some work.

Something got up and boogied this morning, and it was nothing I did to make that happen.  I’ll re-instate that charger, or maybe a control-only version of the high-power DC-DC power supply project, which I have the parts for but haven’t yet built.

Nov 30, 2018

It’s been a while since I posted about this project… I haven’t had time to do many changes, just maintaining the current system as it is keeps me busy.

One thing I noticed is that I started getting poor performance out of the solar system late last week.  This was about the time that Sydney was getting the dust storms from Broken Hill.

Last week’s battery voltages (40s moving average)

Now, being in Brisbane, I didn’t think that this was the cause, and since the days were largely clear, I was a bit miffed as to why I was getting such poor performance.  When I checked on the solar system itself on Sunday, I was getting mixed messages looking at the LEDs on the Redarc BCDC-1225.

I thought it was actually playing up, so I tried switching over to the other solar controller to see if that was better (even if I know it’s crap), but same thing.  Neither was charging, yet I had a full 20V available at the solar terminals.  It was a clear day; I couldn’t make sense of it.  On a whim, I checked the fuses on the panels.  All fuses were intact, but one fuse holder had melted!  The fuse holders are these ones from Jaycar.  10A fuses were installed, and they were connected to the terminal blocks using a ~20mm long length of stranded wire about 6mm thick!

This should not have gotten hot.  I looked around on Mouser/RS/Element14, and came up with an order for 3 of these DIN-rail mounted fuse holders, some terminal blocks, and some 10A “midget” fuses.  I figured I’d install these one evening (when the solar was not live).

These arrived yesterday afternoon.

New fuse holders, terminal blocks, and fuses.

However, it was yesterday morning whilst I was having breakfast, I could hear a smoke alarm going off.  At first I didn’t twig to it being our smoke alarm.  I wandered downstairs and caught a whiff of something.  Not silicon, thankfully, but something had burned, and the smoke alarm above the cluster was going berserk.

I took that alarm down off the wall and shoved it under a doonah to muffle it (seems they don’t test the functionality of the “hush” button on these things), switched the mains off and yanked the solar power.  Checking the cluster, all nodes were up, the switches were both on, there didn’t seem to be anything wrong there.  The cluster itself was fine, running happily.

My power controller was off; at first I thought this odd.  Maybe something burned out there, perhaps the 5V LDO?  A few wires had sprung out of the terminal blocks, a frequent annoyance, as the terminal blocks were not designed for CAT5e-sized wire.

By chance, I happened to run my hand along the sense cable (the unsheathed green pair of a dissected CAT5e cable) to the solar input, and noticed it got hot near the solar socket on the wall.  High current was flowing where high current was not planned for or expected, and the wire’s insulation had melted!  How that happened, I’m not quite sure.  I got some side-cutters, cut the wires at the wall-end of the patch cable and disconnected the power controller.  I’ll investigate it later.

Power controller with crispy wiring

With that rendered safe, I disconnected the mains charger from the battery and wound its float voltage back to about 12.2V, then plugged everything back in and turned everything on.  Things went fine, the solar even behaved itself (in-spite of the melty fuse holder on one panel).

Last night, I tore down the old fuse box, hacked off a length of DIN rail, and set about mounting the new holders.  I had to do away with the backing plate due to clearance issues with the holders and re-locate my isolation switch, but things went okay.

This is the installation of the fuses now:

Fuse holders installed

The re-located isolation switch has left some ugly holes, but we’ll plug those up with time (unless a friendly mud wasp does it for us).

Solar isolation switch re-located, and some holes wanting some putty.

For interest’s sake, this was the old installation, partially dismantled.

Old installation, terminal strips and fuse holders.

You can see how the holders were mounted to that plate.  The holder closest to the camera has melted rather badly.  The fuse case itself also melted (but the fuse is still intact).

Melted fuse holder detail

The new holders are rated at 690V AC, 30A, and the fuses are rated to 500V, so I don’t expect to have the same problems.

As for the controller, maybe it’s time to retire that design.  The high-power DC-DC converter project ultimately is the future replacement and a first step may be to build an ATTiny24A-based controller that can poll the current shunt sensors and switch the mains charger on and off that way.

Nov 21, 2018

Thinking about the routing problem a little more… if I wanted to do a purely “native” routing scheme not involving Net/ROM routing update broadcasts, one has to wonder what such a system would look like.

Net/ROM L3 is really just intended to “bootstrap” things… there’s the prospect of using Net/ROM L4 for tunnelling TCP traffic, but really it’s the L3 part that interests me as a way of hopping between fragments of the mesh that may be linkable via a non-6LoWHAM capable digipeater.

Net/ROM’s periodic broadcasts are inefficient; divulging a node’s entire routing table is not an ideal situation.  So what’s the alternative?  IPv6 nodes already send a “neighbour discovery” packet when they don’t know the MAC address of a neighbour; this is a trigger for a “neighbour advertisement” response.

I’m thinking 6LoWHAM will send NAs periodically anyway.  ACMA rules require identifying every 10 minutes.  Since the NA will include the call-sign of the station (in bit-shifted ASCII), doing that every 10 minutes takes care of the ACMA requirement.  An IPv6 NA message is not a big payload.

Given this will be sent to the ff02::1 multicast group, all nodes able to hear the beaconing station will receive it.  Unlike an IEEE 802.11 or 802.3 network though, not all nodes on the mesh will hear it.

The same is true of ND messages.  If the neighbour is in ear-shot and able to respond, it likely will, but that isn’t a guarantee.  Something in the link-local scope will likely be the answer, probably a daemon listening on a UDP port and sending to the ff02::1 group.

Unicast routing

When a station wishes to make contact with a station that’s not an immediate neighbour, I’m thinking of a broadcast similar to how APRS does things.  APRS uses special call-signs WIDEn-m, where the hop-limit is encoded in those messages.

A UDP message would be constructed asking “Who can reach X within N hops?” and sent to ff02::1 to some “well-known” port.

The first second is reserved for responses from nodes that know a route, either through Net/ROM, or maybe they’ve been in contact with that station before.  They respond something along the lines of “X via A,B,C, quality Q”, where A, B, C are digipeaters and Q is some link quality value.

Not sure how I’ll derive Q just yet.  Possibly based on packet loss… we’ll think of something.

If no responses are heard, the routers that heard the message re-broadcast it and listen for replies.  In the re-broadcast, each router appends its 48-bit 6LoWHAM address and a link quality to the message payload.  The hop limit would also get decremented.  That way, it can break cycles, and it gives a direct unicast path for the distant node to respond.

The same algorithm applies: wait a second for immediate responses, then any routers downstream append their addresses/link quality values, decrement the hop limit, and re-broadcast.

Again, any node that overhears the message (including the target node), may respond.  It does so via a direct unicast, sent using conventional AX.25 digipeating.  Any router en route that relays the message may also cache the result.  The “mesh” gets to learn of where everyone is as-required rather than by default with Net/ROM.

If the hop limit reaches zero, no further re-broadcasts are made, the message stops there.

When the source node hears the replies, each reply resets a 100msec timer.  100msec after the last reply, it chooses three “best” routes, and sends a ICMPv6 ND message via each one to the target station.  The station replies to all three back via those routes with an ICMPv6 NA.  If a message is lost via one of those routes, that route is demoted in quality.

Once replies have arrived back at the source, it picks the best route based on the updated quality information, and begins communications via that route.
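
To make the relaying rules concrete, here’s a rough sketch of the router-side logic.  The message structure and names here are entirely my own invention for illustration; none of this is settled:

from dataclasses import dataclass, field

@dataclass
class RouteRequest:
    target: str        # call-sign+SSID we are trying to reach
    hops_left: int     # remaining hop budget
    path: list = field(default_factory=list)  # (router address, link quality) pairs

def handle_request(req, my_addr, link_quality, known_routes, reply, rebroadcast):
    # If we know a route already, answer in the first one-second window.
    if req.target in known_routes:
        reply(known_routes[req.target])
        return
    # Otherwise relay: drop if the hop budget is spent or we already relayed it.
    if req.hops_left <= 0 or any(r == my_addr for r, _ in req.path):
        return
    req.path.append((my_addr, link_quality))   # gives the target a unicast return path
    req.hops_left -= 1
    rebroadcast(req)                           # after the one-second response window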

Multicast routing

This is more tricky.  I think link-local should mean what it means on Thread… that is, ff02::/16 just gets processed by immediate neighbours that are in direct RF range.

Realm-local (RFC-7346), ff03::/16 should be used for stuff that’s mesh-wide.  Those messages may be repeated by routers provided those routers have at least one subscriber for the given multicast group/port listening.

Multicast Listener Discovery looks to be the tool for that, although it could do with some 6LoWPAN-style optimisation.

I’m thinking the first time a router hears a datagram destined for a particular group, it should send a query out asking “who is listening” to the said group.

Following that first message, it should be up to the downstream node to inform the local routers that it intends to receive messages from a given group.  This should be periodic, maybe hourly, so that routers are not re-broadcasting messages for a node that has gone off-air.

Routers that have no listeners for a group, do not rebroadcast that group’s traffic.  Similarly, if the hop limit has been exhausted, the messages do not get rebroadcast.

Nov 18, 2018

So today I was meant to be helping re-build a deck, but that got postponed to next weekend.  Thus, I had an extra free day I wasn’t counting on.

I wound up looking at LinBPQ in detail, to see if I can get it to run.  I downloaded the sources, and sure enough, they do compile on my x86-64 laptop, but does it work?  Not a chance.  Starts parsing the configuration file, then boompa, SEGFAULT.

I run the binary through gdb, and see this:

GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/stuartl/projects/6lowham/linbpq/linbpq...done.
(gdb) r
Starting program: /home/stuartl/projects/6lowham/linbpq/linbpq 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
G8BPQ AX25 Packet Switch System Version 6.0.17.1 November 2018
Copyright � 2001-2018 John Wiseman G8BPQ
Current Directory is /var/lib/linbpq

Configuration file Preprocessor.
Using Configuration file /var/lib/linbpq/bpq32.cfg
Conversion (probably) successful


Program received signal SIGSEGV, Segmentation fault.
0x00005555555f8a7b in Start () at cMain.c:1190
1190                    *(ptr3++) = *(ptr2++);
(gdb) bt full
#0  0x00005555555f8a7b in Start () at cMain.c:1190
        cfg = 0x555555b91c40
        ptr1 = 0x555555ba60c0
        PORT = 0x5555558f6aa0 
        FULLPORT = 0x558f7928
        NEXTPORT = 0x5555558f6de0 <DATAAREA+832>
        EXTPORT = 0x7ffff6eb7953 <_IO_file_overflow+291>
        APPL = 0x5555558f49e0 
        ROUTE = 0x559085e8
        DEST = 0x870b07e2ddd5f300
        CMD = 0x5555558d79e0 
        PortSlot = 2
        ptr2 = 0x555555ba6849 "K4MSL Test station \r"
        ptr3 = 0x55912549 
        ptr4 = 0x5555558d7183 <COMMANDS+1667> "         \003"
        CWPTR = 0x5555558f6b18 <DATAAREA+120>
        i = 0
        n = 119
        int3 = 1435466024
#1  0x000055555563e35c in main (argc=1, argv=0x7fffffffe518) at LinBPQ.c:598
        i = 1
        user = 0x0
        conn = 0x7ffff7ffa298
        STAT = {st_dev = 140737354131120, st_ino = 140737488347784, st_nlink = 140737488347780, st_mode = 4160741648, 
          st_uid = 32767, st_gid = 4143745959, __pad0 = 32767, st_rdev = 140737488348192, st_size = 140737488347784, 
          st_blksize = 1700966438, st_blocks = 26577600, st_atim = {tv_sec = 140737354113688, tv_nsec = 140737488348000}, 
          st_mtim = {tv_sec = 140737354113448, tv_nsec = 140737488347780}, st_ctim = {tv_sec = 140737488347984, 
            tv_nsec = 140737354131160}, __glibc_reserved = {1, 4150715120, 0}}
        PORTVEC = 0x7ffff7ffe6b0

Ookay then… so invalid pointers, what fun!  More to the point, have a close look at the addresses of FULLPORT, ROUTE and ptr3 above: they’re 32-bit values sitting among 64-bit pointers… I’m beginning to understand why it was called BPQ32.

The culprit for this wound up being little gems like this:

			//	Round to word boundary (for ARM5 etc)

			int3 = (int)ptr3;
			int3 += 3;
			int3 &= 0xfffffffc;
			ptr3 = (UCHAR *)int3;

There were a few other instances of this, and variations on the theme too, but one way or the other, linbpq basically assumes that all pointers are 32-bits, and so are ints.

Four hours later, I finally had something that started, but there are probably lots of landmines for anyone running the binary to inadvertently stomp on.  The code is pointer-arithmetic city!  Much of the time, code is casting pointers to unsigned int, or back again.  If I submitted code like that at work, they’d have me hauled ’round the back of the building and shot!

I’m left wondering if it’s worth getting to understand, or whether I should shove it in a VM, write some code based on my understanding of the protocols, do some integration testing against it, then abandon LinBPQ for something I can have confidence in.

The use and re-use of certain variables makes me wonder if the code is actually a port from the DOS-based BPQCode which was likely written in 8086 assembler.  This would make a lot of sense as to why I’m seeing the sorts of software coding patterns I’m seeing in that code.  The logic seems to have been ported to C just enough to get it to compile and work like the assembly version.

Reasonable enough… but there’s a lot of technical debt there still waiting to be paid back.  On paper, there’s a lot of benefit in using LinBPQ as the back-end, and I am thankful that John Wiseman made the decision to release the code under the GPLv3 so that I can at least investigate the possibility of using that code here.

I’ve thrown what I’ve got up on Github for now, and there’s a Gentoo overlay for installing it.  Add the overlay and run emerge linbpq, and you should find yourself with an installation of LinBPQ that just needs some OpenRC scripts and some work with an editor on /var/lib/linbpq/bpq32.cfg to get going.

If I get further on the code front, I might look at some init scripts, both OpenRC and systemd ones, then I can produce a few Debian binaries so you can run apt-get install linbpq on your Raspberry Pi and have a packet station going quickly.

Nov 17, 2018

Today, I decided to get cuddly with the relevant RFCs and see if I could adapt them into something that would work for AX.25. The following roughly describes how one might stuff IPv6 datagrams into AX.25.

Much of this is heavily influenced by RFC-4944 and RFC-6282, the latter of which looks to be the heart-and-soul of Thread.


Stateless Automatic Addressing

We have a mechanism by which an AX.25 call+SSID can be losslessly mapped to a 48-bit MAC address. This is built on Radix-50 and can work as a stand-in for the EUI-48. The pseudo EUI-48 procedure mentioned in section 6 of the RFC-4944 standard is not required.

An EUI-64 is generated from an EUI-48 by chopping the EUI-48 in half and inserting the bytes ff:fe in the middle. So the EUI-48:

00:11:22:33:44:55

becomes the following EUI-64:

00:11:22:ff:fe:33:44:55

SLAAC therefore will work the same way it does for Ethernet.
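
The expansion itself is trivial to implement.  A minimal sketch (note RFC-4291 additionally flips the universal/local bit when forming the actual IPv6 interface ID; that step is omitted here to match the example above):

    #include <stdint.h>

    /* Expand a 48-bit MAC (EUI-48) to an EUI-64 by splitting it in
       half and inserting ff:fe, exactly as in the example above. */
    void eui48_to_eui64(const uint8_t mac[6], uint8_t eui64[8])
    {
        eui64[0] = mac[0];
        eui64[1] = mac[1];
        eui64[2] = mac[2];
        eui64[3] = 0xff;
        eui64[4] = 0xfe;
        eui64[5] = mac[3];
        eui64[6] = mac[4];
        eui64[7] = mac[5];
    }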

Frame format

1. AX.25 UI Frame header

Size: 17 + (D * 7) bytes, where D is the number of digipeaters being used

  • PID = 1100 0101 (tentative value for IPv6)
  • Control = 0000 0011
    • Frame type: UI, P/F = 0 (final)
  • Must contain source and destination AX.25 callsigns, may contain up to 8 digipeater AX.25 callsigns.

For a direct station-to-station contact:

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┴───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
│       AX.25 Flag (0x7e)       │ Destination AX.25 Call+SSID   │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├───────────────────────────────────────────────────────────────┤
│ Source AX.25 Call+SSID                                        │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
│                               │          AX.25 PID            │
├───────────────────────────────┴───────────────────────────────┤
╎             AX.25 UI frame payload starts here                ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘

or for contact via a few digipeaters:

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┴───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
│       AX.25 Flag (0x7e)       │ Destination AX.25 Call+SSID   │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├───────────────────────────────────────────────────────────────┤
│ Source AX.25 Call+SSID                                        │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
│                               │    Digipeater 1 Call+SSID     │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├───────────────────────────────────────────────────────────────┤
│ Digipeater 2 Call+SSID                                        │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
│                               │          AX.25 PID            │
├───────────────────────────────┴───────────────────────────────┤
╎             AX.25 UI frame payload starts here                ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
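
For reference, here’s a sketch of how a single AX.25 address field (call+SSID) in the above headers is encoded: each callsign character is shifted left one bit, and the seventh byte carries the SSID along with the extension bit that marks the last address in the chain.  (The function name is my own.)

    #include <stdint.h>
    #include <string.h>

    /* Encode one 7-byte AX.25 address field.  'last' sets the
       extension bit marking the end of the address chain; the two
       reserved bits in the SSID byte are set to 1 as is conventional. */
    void ax25_addr(uint8_t out[7], const char *call, unsigned ssid, int last)
    {
        size_t n = strlen(call);
        for (size_t i = 0; i < 6; i++)
            out[i] = (uint8_t)((i < n ? call[i] : ' ') << 1);
        out[6] = (uint8_t)(0x60 | ((ssid & 0x0f) << 1) | (last ? 1 : 0));
    }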

2. Mesh Addressing Header

To be used when two stations are not able to directly communicate, or when multicasting.

In this scenario, the AX.25 frame source and destination indicate the addresses of the directly-communicating nodes (e.g. source and digipeater, intermediate digipeaters, or digipeater and destination), and the fields given here will be the addresses of the source and destination AX.25 stations.

e.g. sending from VK4MSL-0 to VK4MDL-9 via VK4RZB-0 and VK4RZA-0:

  1. First transmission:
    • AX.25 Src: VK4MSL-0
    • AX.25 Dst: VK4RZB-0
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 7
  2. Intermediate hop:
    • AX.25 Src: VK4RZB-0
    • AX.25 Dst: VK4RZA-0
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 6
  3. Final delivery:
    • AX.25 Src: VK4RZA-0
    • AX.25 Dst: VK4MDL-9
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 5

Unlike 802.15.4, we do not have 16-bit short addresses, so the two bits RFC-4944 uses to flag short addresses would always be set to 0.  We will re-purpose them to widen the “hops left” field to 6 bits, using the value 63 (0x3f) to indicate that 63 or more hops remain.

We will use the raw 48-bit addresses here.  In keeping with amateur radio conventions, the source and destination are flipped compared to RFC-4944: the destination address comes first.

Header format (13 bytes):

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┼───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
│ 1   0 │       Hops Left       │      Destination Address      │
├───────┴───────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
│                               │         Source Address        │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                               │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
│                               │    Remaining AX.25 Payload    │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
╎                                                               ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
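
A minimal encoding sketch for this header, assuming the 48-bit station addresses are already in binary form from the Radix-50 mapping (the function name is my own):

    #include <stdint.h>
    #include <string.h>

    /* Serialise the 13-byte mesh addressing header described above.
       hops is clamped to the 6-bit "hops left" field, where 0x3f
       means 63 or more hops remain. */
    size_t mesh_hdr_encode(uint8_t out[13], const uint8_t dest[6],
                           const uint8_t src[6], unsigned hops)
    {
        if (hops > 0x3f)
            hops = 0x3f;
        out[0] = (uint8_t)(0x80 | hops); /* 10xxxxxx: marker + hops left */
        memcpy(&out[1], dest, 6);        /* destination first, as in AX.25 */
        memcpy(&out[7], src, 6);
        return 13;
    }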

4. Fragmentation header

To be used when an IPv6 datagram is greater than L bytes, where L may be defined to be between 64 and 2¹⁶ bytes.

This part is identical to that of RFC-4944 (section 5.3). I’ll come back to this bit.

5. IPv6 datagram

This can be encoded in a number of ways depending on requirements:

5.1. Raw IPv6 datagram

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┼───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
│ 0   1 │       6LP_IPV6        │                               │
├───────┴───────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
╎                  Raw IPv6 datagram with payload.              ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘

6LP_IPV6 is the value 0x01, as per RFC-4944; together with the 01 prefix, the full dispatch byte is 0x41.  The IPv6 datagram is encoded as per RFC-2460, and includes its payload.

The AX.25 frame is finished off with the frame-check sequence.

5.2. Compressed IPv6 datagram

In this format, the datagram fields are compressed, either through making static assumptions, or by deriving them from things such as the AX.25 header, or a previously agreed-to context.

The first field in such payloads is the 6LP_IPHC field:

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
│ 0   1   1 │               6LP_IPHC with CID=1                 │
├───────────┴───────────────────┬───────────────────────────────┤
│        Context ID Byte        │                               │
├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
╎             Compressed IPv6 datagram with payload.            ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘

or without the context ID

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
│ 0   1   1 │               6LP_IPHC with CID=0                 │
├───────────┴───────────────────────────────────────────────────┤
╎             Compressed IPv6 datagram with payload.            ╎
└╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘

The 6LP_IPHC field is a 13-bit field, optionally followed by a context ID extension byte. The bit allocations are as follows:

  0   1   2   3   4   5   6   7   8   9  10  11  12
├───┴───┼───┼───┴───┼───┼───┼───┴───┼───┼───┼───┴───┤
│  TF   │ NH│ HLIM  │CID│SAC│  SAM  │ M │DAC│  DAM  │
└───────┴───┴───────┴───┴───┴───────┴───┴───┴───────┘
  • (MSB) 0-1: TF Traffic Class, Flow Label. See 5.2.1 below.
  • 2: NH Next Header encoding
    • =0: Given explicitly
    • =1: Encoded using 6LP_NHC
  • 3-4: HLIM Hop Limit
    • =00: Given explicitly
    • =01: is set to 1
    • =10: is set to 64
    • =11: is set to 255
  • 5: CID Context Identifier Extension
    • =0: No CID byte follows
    • =1: A CID byte follows
  • 6-8: SAC Source Address Compression / SAM Mode
    • =000: No compression applied, whole address given
    • =001: Prefix is link-local prefix, remaining bits are given.
    • =x10: Not used in 6LoWHAM (we don’t support 16-bit addresses)
    • =011: Prefix is link-local, figure the rest out from the source address in the AX.25 header.
    • =100: Unspecified address ::
    • =101: See the context for the prefix, remaining bits are given.
    • =111: Figure out the address from the AX.25 header and context.
  • (LSB) 9-12: M Multicast, DAC Destination Address Compression, DAM Mode

    • =0000: No compression, not multicast, whole address given
    • =0001: Prefix is link-local prefix, remaining bits are given. Not multicast.
    • =xx10: Not used in 6LoWHAM (we don’t support 16-bit addresses)
    • =0011: Prefix is link-local, figure the rest out from the destination address in the AX.25 header. Not multicast.
    • =0100: Reserved
    • =0101: See the context for the prefix, remaining bits are given. Not multicast.
    • =0111: Figure out the address from the AX.25 header and context. Not multicast.
    • =1000: No compression, multicast address, whole address given
    • =1001: 48-bits of multicast address given, fill in the blanks: ff__::00__:____:____.
    • =1010: 32-bits of multicast address given, fill in the blanks: ff__::00__:____.
    • =1011: 8-bits of multicast address given, fill in the blanks: ff02::00__.
    • =1100: 48-bits RFC-3306/RFC-3956 address, ff__:__LL:PPPP:PPPP:PPPP:PPPP:____:____ where P and L come from the context.
    • =1101: Reserved
    • =1110: Reserved
    • =1111: Reserved

The context ID extension byte has the following format:

  0   1   2   3   4   5   6   7
├───┴───┴───┴───┼───┴───┴───┴───┤
│      SCI      │      DCI      │
└───────────────┴───────────────┘
  • (MSB) 0-3: Source Context Identifier
  • (LSB) 4-7: Destination Context Identifier

These two sub-fields indicate which specific context is being used to fill in the blanks.
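
Assuming the 13 bits are packed MSB-first into the two bytes that carry the 011 dispatch pattern (as in RFC-6282), a decoder might unpack the field like so (names are my own):

    #include <stdint.h>

    /* Unpack 6LP_IPHC: b0 = 011 TF TF NH HLIM HLIM,
       b1 = CID SAC SAM SAM M DAC DAM DAM. */
    struct iphc {
        uint8_t tf;         /* bits 0-1: Traffic Class / Flow Label mode */
        uint8_t nh;         /* bit 2:    Next Header uses 6LP_NHC? */
        uint8_t hlim;       /* bits 3-4: Hop Limit mode */
        uint8_t cid;        /* bit 5:    Context ID byte follows? */
        uint8_t sac_sam;    /* bits 6-8: SAC plus 2-bit SAM */
        uint8_t m_dac_dam;  /* bits 9-12: M, DAC, 2-bit DAM */
    };

    void iphc_unpack(uint8_t b0, uint8_t b1, struct iphc *h)
    {
        h->tf        = (b0 >> 3) & 0x03;
        h->nh        = (b0 >> 2) & 0x01;
        h->hlim      =  b0       & 0x03;
        h->cid       = (b1 >> 7) & 0x01;
        h->sac_sam   = (b1 >> 4) & 0x07;
        h->m_dac_dam =  b1       & 0x0f;
    }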

5.2.1: Traffic Class and Flow Label

These may be partially or completely omitted depending on the TF setting in the previous field.

  • TF=00:
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
    ├───┴───┼───┴───┴───┴───┴───┴───┼───┴───┴───┴───┼───┴───┴───┴───┤
    │  ECN  │         DSCP          │ 0   0   0   0 │               │
    ├───────┴───────────────────────┴───────────────┴ ─ ─ ─ ─ ─ ─ ─ ┤
    │                          Flow Label                           │
    └───────────────────────────────────────────────────────────────┘
    
  • TF=01:
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
    ├───┴───┼───┴───┼───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
    │  ECN  │ 0   0 │                                               │
    ├───────┴───────┴ ─ ─ ─ ─ ─ ─ ─ ┬───────────────────────────────┤
    │           Flow Label          │                               │
    ├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
    ╎                   Remainder of IPv6 datagram.                 ╎
    └╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
    
  • TF=10:
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
    ├───┴───┼───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
    │  ECN  │         DSCP          │                               │
    ├───────┴───────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
    ╎                   Remainder of IPv6 datagram.                 ╎
    └╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
    
  • TF=11: Flow label, ECN and DSCP are set to 0.
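
A decoder can therefore work out how many inline bytes to consume from the TF mode alone; a tiny sketch:

    #include <stdint.h>

    /* Bytes of Traffic Class / Flow Label data following 6LP_IPHC,
       per the four TF layouts above. */
    static unsigned tf_inline_len(uint8_t tf)
    {
        switch (tf & 0x03) {
        case 0:  return 4; /* ECN + DSCP, 4-bit pad, 20-bit flow label */
        case 1:  return 3; /* ECN, 2-bit pad, 20-bit flow label */
        case 2:  return 1; /* ECN + DSCP only */
        default: return 0; /* everything elided, assumed zero */
        }
    }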

5.2.2: Next Header

If the Next Header is not encoded using 6LP_NHC (NH=0), the next-header byte will appear next.

5.2.3: Hop Limit

Again, if the hop limit is not one of the preset values encoded by HLIM (i.e. HLIM=00), the hop-limit byte will appear next.

5.2.4: Source address

The format here is determined by the values of SAC/SAM:

  • 000: Entire IPv6 address, 16 bytes given here.
  • x01: Last 8 bytes of the address given here.
  • For all other values, the source address is omitted.

5.2.5: Destination address

The format here is determined by the values of M/DAC/DAM:

  • x000: Entire IPv6 address, 16 bytes given here.
  • 0x01: Last 8 bytes of the address given here.
  • 1001: 6 bytes of address given here, fill-in-the-blanks.
  • 1010: 4 bytes of address given here, fill-in-the-blanks.
  • 1011: Last byte of address given here, fill-in-the-blank.
  • 1100: 6 bytes of address given here, fill-in-the-blanks.
  • For all other values, the destination address is omitted.
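
Putting the two tables above into code, a sketch of how many inline address bytes a decoder would read (names are my own):

    #include <stdint.h>

    /* Inline source-address bytes, from the 3-bit SAC/SAM value. */
    static unsigned src_addr_len(uint8_t sac_sam)
    {
        switch (sac_sam & 0x07) {
        case 0x0:           return 16; /* whole address */
        case 0x1: case 0x5: return 8;  /* last 8 bytes */
        default:            return 0;  /* from context / AX.25 header */
        }
    }

    /* Inline destination-address bytes, from the 4-bit M/DAC/DAM value. */
    static unsigned dst_addr_len(uint8_t m_dac_dam)
    {
        switch (m_dac_dam & 0x0f) {
        case 0x0: case 0x8: return 16; /* whole address */
        case 0x1: case 0x5: return 8;  /* last 8 bytes */
        case 0x9: case 0xc: return 6;  /* multicast, fill in the blanks */
        case 0xa:           return 4;
        case 0xb:           return 1;
        default:            return 0;  /* elided or reserved */
        }
    }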

6. 6LoWPAN Next Header

This is used to encode selected IPv6 extensions or L4 protocol headers.

6.1. IPv6 extension headers

A select number of IPv6 extensions may be encoded by replacing the usual “Next Header” byte with the following:

  0   1   2   3   4   5   6   7
├───┴───┴───┴───┼───┴───┴───┼───┤
│ 1   1   1   0 │    EID    │ N │
└───────────────┴───────────┴───┘

where EID (bits 4-6) is one of:

  • =0 IPv6 Hop-By-Hop options
  • =1 IPv6 Routing
  • =2 IPv6 Fragment
  • =3 IPv6 Destination Options
  • =4 IPv6 Mobility
  • =7 IPv6 Header

and N (bit 7) indicates whether the header’s payload is followed by another 6LoWPAN Next Header, or a regular IPv6 Next Header (with its “Next Header” byte). For EID=7, N MUST be 0.

Length fields within the header payload should be counted in bytes instead of 8-byte blocks.

7. Datagram payload

7.1. Non-UDP payloads

For payloads other than UDP packets, these should be inserted into the AX.25 payload as-is, following any extension headers.

UDP packets with uncompressed headers should also be inserted in this manner.

7.2. UDP payloads with header compression

For these payloads, the following UDP header should be used:

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
├───┴───┴───┴───┴───┼───┼───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
│ 1   1   1   1   0 │ C │   P   │           Source Port         │
├───────────────────┴───┴───────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
╎                                                               ╎
├───────────────────────────────────────────────────────────────┤
╎                         Destination Port                      ╎
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
╎                                                               ╎
├───────────────────────────────────────────────────────────────┤
│                     Checksum (unless C=1)                     │
└───────────────────────────────────────────────────────────────┘
  • (MSB): bits 0-4: Compressed UDP header marker. Literal 11110₂
  • Bit 5: C Compressed UDP checksum
    • 0= UDP checksum is given (recommended value)
    • 1= UDP checksum is omitted
  • Bits 6-7: P Ports
    • 00=Both source and destination ports are given in full

        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      ├───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
      │                         Source Port                           │
      ├───────────────────────────────────────────────────────────────┤
      │                      Destination Port                         │
      └───────────────────────────────────────────────────────────────┘
      
    • 01=Source port is given in full; only the least significant 8 bits of the destination port are given, the destination port lying in the range 0xff00-0xffff (65280-65535)
        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      ├───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
      │                         Source Port                           │
      ├───────────────────────────────┬───────────────────────────────┤
      │       Destination Port        │                               │
      ├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
      ╎                    Remainder of UDP packet                    ╎
      └╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
      
    • 10=Destination port is given in full; only the least significant 8 bits of the source port are given, the source port lying in the range 0xff00-0xffff (65280-65535)
        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      ├───┴───┴───┴───┴───┴───┴───┴───┼───┴───┴───┴───┴───┴───┴───┴───┤
      │          Source Port          │        Destination Port       │
      ├───────────────────────────────┼───────────────────────────────┤
      │    Destination Port (cont.)   │                               │
      ├───────────────────────────────┴ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
      ╎                    Remainder of UDP packet                    ╎
      └╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┘
      
    • 11=Only the least significant 4 bits of the source and destination ports are given; both ports lie in the range 0xf0b0-0xf0bf (61616-61631)
        0   1   2   3   4   5   6   7
      ├───┴───┴───┴───┼───┴───┴───┴───┤
      │  Source Port  │   Dest. Port  │
      └───────────────┴───────────────┘
      

The C bit should only be set if the upper-level application asks for it.  Whilst 802.15.4 has its own frame check, as does AX.25, the checksum field is mandatory in UDP, and the recommendation is to only drop it if the application says that’s okay.
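
To make the port-compression rules concrete, here’s a sketch (my own illustration) of how a sender might pick the P mode for a given source/destination port pair:

    #include <stdint.h>

    /* Choose P (bits 6-7 of the compressed UDP header) from the
       source and destination port values, per the rules above. */
    static uint8_t udp_port_mode(uint16_t src, uint16_t dst)
    {
        if ((src & 0xfff0) == 0xf0b0 && (dst & 0xfff0) == 0xf0b0)
            return 3; /* 4 bits of each port carried inline */
        if ((dst & 0xff00) == 0xff00)
            return 1; /* full source, low byte of destination */
        if ((src & 0xff00) == 0xff00)
            return 2; /* low byte of source, full destination */
        return 0;     /* both ports carried in full */
    }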