Nov 302018
 

It’s been a while since I posted about this project… I haven’t had time to do many changes, just maintaining the current system as it is keeps me busy.

One thing I noticed is that I started getting poor performance out of the solar system late last week.  This was about the time that Sydney was getting the dust storms from Broken Hill.

Last week’s battery voltages (40s moving average)

Now, being in Brisbane, I didn’t think that this was the cause, and the days were largely clear, I was a bit miffed why I was getting such poor performance.  When I checked on the solar system itself on Sunday, I was getting mixed messages looking at the LEDs on the Redarc BCDC-1225.

I thought it was actually playing up, so I tried switching over to the other solar controller to see if that was better (even if I know it’s crap), but same thing.  Neither was charging, yet I had a full 20V available at the solar terminals.  It was a clear day, I couldn’t make sense of it.  On a whim, I checked the fuses on the panels.  All fuses were intact, but one fuse holder had melted!  The fuse holders are these ones from Jaycar.  10A fuses were installed, and they were connected to the terminal blocks using a ~20mm long length of stranded wire about 6mm thick!

This should not have gotten hot.  I looked around on Mouser/RS/Element14, and came up with an order for 3 of these DIN-rail mounted fuse holders, some terminal blocks, and some 10A “midget” fuses.  I figured I’d install these one evening (when the solar was not live).

These arrived yesterday afternoon.

New fuse holders, terminal blocks, and fuses.

However, it was yesterday morning whilst I was having breakfast, I could hear a smoke alarm going off.  At first I didn’t twig to it being our smoke alarm.  I wandered downstairs and caught a whiff of something.  Not silicon, thankfully, but something had burned, and the smoke alarm above the cluster was going berserk.

I took that alarm down off the wall and shoved it it under a doonah to muffle it (seems they don’t test the functionality of the “hush” button on these things), switched the mains off and yanked the solar power.  Checking the cluster, all nodes were up, the switches were both on, there didn’t seem to be anything wrong there.  The cluster itself was fine, running happily.

My power controller was off, at first I thought this odd.  Maybe something burned out there, perhaps the 5V LDO?  A few wires sprang out of the terminal blocks.  A frequent annoyance, as the terminal blocks were not designed for CAT5e-sized wire.

By chance, I happened to run my hand along the sense cable (the unsheathed green pair of a dissected CAT5e cable) to the solar input, and noticed it got hot near the solar socket on the wall.  High current was flowing where high current was not planned for or expected, and the wire’s insulation had melted!  How that happened, I’m not quite sure.  I got some side-cutters, cut the wires at the wall-end of the patch cable and disconnected the power controller.  I’ll investigate it later.

Power controller with crispy wiring

With that rendered safe, I disconnected the mains charger from the battery and wound its float voltage back to about 12.2V, then plugged everything back in and turned everything on.  Things went fine, the solar even behaved itself (in-spite of the melty fuse holder on one panel).

Last night, I tore down the old fuse box, hacked off a length of DIN rail, and set about mounting the new holders.  I had to do away with the backing plate due to clearance issues with the holders and re-locate my isolation switch, but things went okay.

This is the installation of the fuses now:

Fuse holders installed

The re-located isolation switch has left some ugly holes, but we’ll plug those up with time (unless a friendly mud wasp does it for us).

Solar isolation switch re-located, and some holes wanting some putty.

For interest’s sake, this was the old installation, partially dismantled.

Old installation, terminal strips and fuse holders.You can see how the holders were mounted to that plate.  The holder closest to the camera has melted rather badly.  The fuse case itself also melted (but the fuse is still intact).

Melted fuse holder detail

The new holders are rated at 690V AC, 30A, and the fuses are rated to 500V, so I don’t expect to have the same problems.

As for the controller, maybe it’s time to retire that design.  The high-power DC-DC converter project ultimately is the future replacement and a first step may be to build an ATTiny24A-based controller that can poll the current shunt sensors and switch the mains charger on and off that way.

Nov 282018
 

The binutils linker is able to generate a map file when it links your binaries.  This provides a lot of detail on how the functions and variables have been arranged into the program memory space, which is crucial information when dealing with embedded devices.

Unfortunately, looking around I didn’t see any decent tools for extracting this information.  I wound up cooking my own Python script up to do this.  It’s very crude, just takes a map file on standard input, and dumps a report to standard output.  It seems to work okay with ARM, and sorta works with AVR but might need some more work.

import re
from sys import stdin, stdout

WIDTH = 8
SPARSE_SKIP = 4*WIDTH
SYMBOL_ONLY_RE = re.compile(\
        r'^ \.([a-zA-Z0-9]+)\.([a-zA-Z0-9_\.]+)$')
ADDR_ONLY_RE = re.compile(\
        r'^ {16}(0x[0-9a-f]+) +(0x[0-9a-f]+) (.*)$')
ADDR_CXXSYM_RE = re.compile(\
        r'^ {16}(0x[0-9a-f]+) {16}([a-zA-Z_][a-zA-Z0-9_:()*\[\]\.]+)$')
SYMBOL_ADDR_RE = re.compile(\
        r'^ \.([a-zA-Z0-9]+)\.([a-zA-Z0-9_\.]+) +(0x[0-9a-f]+) +(0x[0-9a-f]+) (.*)$')
FILL_RE = re.compile('^ \*fill\* +(0x[0-9a-f]+) +(0x[0-9a-f]+) +(\d+)$')
REGION_RE = re.compile('^([a-zA-Z0-9_]+) +(0x[0-9a-f]+) (0x[0-9a-f]+) ([rwx]+)$')

regions = []
last = None
objects = []

def on_symbol_only(match):
    global last
    if match:
        (section, symbol) = match.groups()
        last = {
            'type': 'symbol',
            'section': section,
            'symbol': symbol
        }
        objects.append(last)
    return match


def on_addr_only(match):
    if match:
        (address, size, loc) = match.groups()
        if last is None:
            return match

        assert last['type'] == 'symbol'
        assert 'address' not in last
        assert 'size' not in last
        assert 'loc' not in last
        last['address'] = int(address, base=16)
        last['size'] = int(size, base=16)
        last['loc'] = loc
    return match


def on_symbol_addr(match):
    global last
    if match:
        (section, symbol, address, size, loc) = match.groups()
        last = {
            'type': 'symbol',
            'section': section,
            'symbol': symbol,
            'address': int(address, base=16),
            'size': int(size, base=16),
            'loc': loc
        }
        objects.append(last)
    return match


def on_addr_cxxsym(match):
    if match:
        (address, cxxsym) = match.groups()
        if last is None:
            return match

        assert last['type'] == 'symbol'
        if last['address'] != int(address, base=16):
            return match
        if 'cxxsyms' not in last:
            last['cxxsyms'] = set()
        last['cxxsyms'].add(cxxsym)
    return match


def on_fill(match):
    if match:
        (address, size, data) = match.groups()
        objects.append({
            'type': 'fill',
            'address': int(address, base=16),
            'size': int(size, base=16),
            'data': int(data, base=16)
        })
    return match

def on_region(match):
    if match:
        (region, origin, length, attrs) = match.groups()
        regions.append({
            'region': region,
            'address': int(origin, base=16),
            'size': int(length, base=16),
            'attrs': attrs
        })
    return match


for line in stdin:
    line = line.rstrip()

    try:
        if line == 'Memory Configuration':
            break
    except:
        print ('# Failed at line %r' % line)
        raise


for line in stdin:
    line = line.rstrip()

    try:
        if line == 'Linker script and memory map':
            break
        if on_region(REGION_RE.match(line)):
            continue
    except:
        print ('# Failed at line %r' % line)
        raise


for line in stdin:
    line = line.rstrip()
    try:
        if on_symbol_only(SYMBOL_ONLY_RE.match(line)):
            continue

        if on_addr_only(ADDR_ONLY_RE.match(line)):
            continue

        if on_addr_cxxsym(ADDR_CXXSYM_RE.match(line)):
            continue

        if on_fill(FILL_RE.match(line)):
            continue

        last = None
    except:
        print ('Failure context:')
        print ('# last = %r' % last)
        print ('# line = %r' % line)
        raise

for region in regions:
    region['end'] = region['address'] + region['size']
regions.sort(key=lambda r : r['address'])

for obj in objects:
    if 'cxxsyms' in obj:
        obj['cxxsyms'] = list(sorted(obj['cxxsyms']))

    if ('address' in obj) and ('size' in obj):
        obj['end'] = obj['address'] + obj['size']

        for region in regions:
            if (obj['end'] <= region['end']) and \ (obj['address'] >= region['address']):
                obj['region'] = region['region']
                break
objects.sort(key=lambda o : o.get('address', -1))

for region in regions:
    address = region['address']
    sym_idx = 0
    row_rem = 0
    row_syms = []
    seen = set()

    region_objects = list(filter(
        lambda obj : obj.get('region') == region['region'],
        objects))
    if not region_objects:
        continue

    for obj in region_objects:
        if obj['type'] == 'symbol':
            sym = '%02d' % (sym_idx % 100)
            sym_idx += 1
        elif obj['type'] == 'fill':
            sym = '--'
        else:
            sym = '??'

        while address < obj['address']: if not row_rem: if (obj['address'] - address) > SPARSE_SKIP:
                    end = obj['address'] - (obj['address'] % SPARSE_SKIP)
                    stdout.write('\n%16s 0x%08x -- 0x%08x (%d bytes)' % (
                        region['region'], address, end, end - address))
                    address = end
                    continue

                stdout.write('\n%16s 0x%08x: ' % (region['region'], address))
                row_rem = WIDTH

            stdout.write(' ..')
            row_rem -= 1
            address += 1

        while address < obj['end']:
            if not row_rem:
                if row_syms:
                    stdout.write(' | %s\n' % row_syms.pop(0))
                else:
                    stdout.write('\n')
                stdout.write('%16s 0x%08x: ' % (region['region'], address))
                row_rem = WIDTH

            stdout.write(' %s' % sym)
            row_rem -= 1
            address += 1
            if ('symbol' in obj) and (obj['symbol'] not in seen):
                row_syms.append('%s: %s' % (sym, obj['symbol']))
                seen.add(obj['symbol'])

            if not row_rem:
                if row_syms:
                    stdout.write(' | %s\n' % row_syms.pop(0))
                else:
                    stdout.write('\n')
                stdout.write('%16s 0x%08x: ' % (region['region'], address))
                row_rem = WIDTH

    while row_rem:
        stdout.write(' ..')
        row_rem -= 1
        address += 1

    stdout.write('\n%16s 0x%08x -- 0x%08x (%d bytes)\n' % (
        region['region'], address, region['end'], region['end'] - address))
    stdout.write('%16s %d bytes remaining\n' % (
        region['region'], region['end'] - address))

Continue reading »

Nov 212018
 

Thinking about the routing problem a little more… if I wanted to do a purely “native” routing scheme not involving Net/ROM routing update broadcasts, one has to wonder what such a system would look like.

Net/ROM L3 is really just intended to “bootstrap” things… there’s the prospect of using Net/ROM L4 for tunnelling TCP traffic, but really it’s the L3 part that interests me as a way of hopping between fragments of the mesh that may be linkable via a non-6LoWHAM capable digipeater.

Net/ROM’s periodic broadcasts are inefficient, divulging a node’s entire routing table is not an ideal situation.  So what’s the alternative?  IPv6 nodes already send a “neighbour discovery” packet when they don’t know the MAC address of a neighbour, this is a trigger for a “neighbour advertisement” response.

I’m thinking 6LoWHAM will send NAs periodically anyway.  ACMA rules require identifying every 10 minutes.  Since the NA will include the call-sign of the station (in bit-shifted ASCII), doing that every 10 minutes takes care of the ACMA requirement.  An IPv6 NA message is not a big payload.

Given this will be sent to the ff02::1 multicast group, all nodes able to hear the beaconing station will receive it.  Unlike a IEEE 802.11 or 802.3 network though, not all nodes on the mesh will hear it.

The same is true of ND messages.  If the neighbour is in ear-shot and able to respond, it likely will, but that isn’t a guarantee.  Something in the link-local scope will likely be the answer, probably a daemon listening on a UDP port and sending to the ff02::1 group.

Unicast routing

When a station wishes to make contact with a station that’s not an immediate neighbour, I’m thinking of a broadcast similar to how APRS does things.  APRS uses special call-signs WIDEn-m, where the hop-limit is encoded in those messages.

A UDP message would be constructed asking “Who can reach X within N hops?” and sent to ff02::1 to some “well-known” port.

The first second is reserved for responses from nodes that know a route, either through Net/ROM, or maybe they’ve been in contact with that station before.  They respond something along the lines of “X via A,B,C, quality Q”, where A, B, C are digipeaters and Q is some link quality value.

Not sure how I’ll derive Q just yet.  Possibly based on packet loss… we’ll think of something.

If no responses are heard, the routers that heard the message re-broadcast it and listen for replies.  In the re-broadcast, each router appends its 48-bit 6LoWHAM address and a link quality to the message payload.  The hop limit would also get decremented.  That way, it can break cycles, and it gives a direct unicast path for the distant node to respond.

The same algorithm applies: wait a second for immediate responses, then any routers downstream append their addresses/link quality values, decrement the hop limit, and re-broadcast.

Again, any node that overhears the message (including the target node), may respond.  It does so via a direct unicast, sent using conventional AX.25 digipeating.  Any router en route that relays the message may also cache the result.  The “mesh” gets to learn of where everyone is as-required rather than by default with Net/ROM.

If the hop limit reaches zero, no further re-broadcasts are made, the message stops there.

When the source node hears the replies, each reply resets a 100msec timer.  100msec after the last reply, it chooses three “best” routes, and sends a ICMPv6 ND message via each one to the target station.  The station replies to all three back via those routes with an ICMPv6 NA.  If a message is lost via one of those routes, that route is demoted in quality.

Once replies have arrived back at the source, it picks the best route based on the updated quality information, and begins communications via that route.

Multicast routing

This, is more tricky.  I think the link-local should mean what it means on Thread… that is ff02::/16 just gets processed by immediate neighbours that are in direct RF range.

Realm-local (RFC-7346), ff03::/16 should be used for stuff that’s mesh-wide.  Those messages may be repeated by routers provided those routers have at least one subscriber for the given multicast group/port listening.

Multicast Listener Discovery looks to be the tool for that, although it could do with some 6LoWPAN-style optimisation.

I’m thinking the first time a router hears a datagram destined for a particular group, it should send a query out asking “who is listening” to the said group.

Following that first message, it should be up to the downstream node to inform the local routers that it intends to receive messages from a given group.  This should be periodic, maybe hourly, so that routers are not re-broadcasting messages for a node that has gone off-air.

Routers that have no listeners for a group, do not rebroadcast that group’s traffic.  Similarly, if the hop limit has been exhausted, the messages do not get rebroadcast.

Nov 182018
 

So today I was meant to be helping re-build a deck, but that got postponed to next weekend.  Thus, I had an extra free day I wasn’t counting on.

I wound up looking at LinBPQ in detail, to see if I can get it to run.  I downloaded the sources, and sure enough, they do compile on my x86-64 laptop, but does it work?  Not a chance.  Starts parsing the configuration file, then boompa, SEGFAULT.

I run the binary through gdb, and see this:

GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/stuartl/projects/6lowham/linbpq/linbpq...done.
(gdb) r
Starting program: /home/stuartl/projects/6lowham/linbpq/linbpq 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
G8BPQ AX25 Packet Switch System Version 6.0.17.1 November 2018
Copyright � 2001-2018 John Wiseman G8BPQ
Current Directory is /var/lib/linbpq

Configuration file Preprocessor.
Using Configuration file /var/lib/linbpq/bpq32.cfg
Conversion (probably) successful


Program received signal SIGSEGV, Segmentation fault.
0x00005555555f8a7b in Start () at cMain.c:1190
1190                    *(ptr3++) = *(ptr2++);
(gdb) bt full
#0  0x00005555555f8a7b in Start () at cMain.c:1190
        cfg = 0x555555b91c40
        ptr1 = 0x555555ba60c0
        PORT = 0x5555558f6aa0 
        FULLPORT = 0x558f7928
        NEXTPORT = 0x5555558f6de0 <DATAAREA+832>
        EXTPORT = 0x7ffff6eb7953 <_IO_file_overflow+291>
        APPL = 0x5555558f49e0 
        ROUTE = 0x559085e8
        DEST = 0x870b07e2ddd5f300
        CMD = 0x5555558d79e0 
        PortSlot = 2
        ptr2 = 0x555555ba6849 "K4MSL Test station \r"
        ptr3 = 0x55912549 
        ptr4 = 0x5555558d7183 <COMMANDS+1667> "         \003"
        CWPTR = 0x5555558f6b18 <DATAAREA+120>
        i = 0
        n = 119
        int3 = 1435466024
#1  0x000055555563e35c in main (argc=1, argv=0x7fffffffe518) at LinBPQ.c:598
        i = 1
        user = 0x0
        conn = 0x7ffff7ffa298
        STAT = {st_dev = 140737354131120, st_ino = 140737488347784, st_nlink = 140737488347780, st_mode = 4160741648, 
          st_uid = 32767, st_gid = 4143745959, __pad0 = 32767, st_rdev = 140737488348192, st_size = 140737488347784, 
          st_blksize = 1700966438, st_blocks = 26577600, st_atim = {tv_sec = 140737354113688, tv_nsec = 140737488348000}, 
          st_mtim = {tv_sec = 140737354113448, tv_nsec = 140737488347780}, st_ctim = {tv_sec = 140737488347984, 
            tv_nsec = 140737354131160}, __glibc_reserved = {1, 4150715120, 0}}
        PORTVEC = 0x7ffff7ffe6b0

Ookay then… so invalid pointers, what fun!  More to the point, have a close look at the underlined addresses… I’m beginning to understand why it was called BPQ32.

The culprit for this wound up being little gems like this:

			//	Round to word boundary (for ARM5 etc)

			int3 = (int)ptr3;
			int3 += 3;
			int3 &= 0xfffffffc;
			ptr3 = (UCHAR *)int3;

There were a few other instances of this, and variations on the theme too, but one way or the other, linbpq basically assumes that all pointers are 32-bits, and so are ints.

Four hours later, I finally had something that started, but there are probably lots of landmines for anyone running the binary to inadvertently stomp on.  The code is pointer-arithmetic city!  Much of the time, code is casting pointers to unsigned int, or back again.  If I submitted code like that at work, they’d have me hauled ’round the back of the building and shot!

I’m left wondering if it’s worth getting to understand, or should I shove it in a VM, write some code based on my understanding of the protocols, do some integration testing with it, then abandon LinBPQ for something I can have confidence in.

The use and re-use of certain variables makes me wonder if the code is actually a port from the DOS-based BPQCode which was likely written in 8086 assembler.  This would make a lot of sense as to why I’m seeing the sorts of software coding patterns I’m seeing in that code.  The logic seems to have been ported to C just enough to get it to compile and work like the assembly version.

Reasonable enough… but there’s a lot of technical debt there still waiting to be paid back.  On paper, there’s a lot of benefit in using LinBPQ as the back-end, and I am thankful that John Wiseman made the decision to release the code under the GPLv3 so that I can at least investigate the possibility of using that code here.

I’ve thrown what I’ve got up on Github for now, and there’s a Gentoo overlay for installing it.  Add the overlay and run emerge linbpq, and you should find yourself with an installation of LinBPQ that just needs some OpenRC scripts and some work with an editor on /var/lib/linbpq/bpq32.cfg to get going.

If I get further on the code front, I might look at some init scripts, both OpenRC and systemd ones, then I can produce a few Debian binaries so you can run apt-get install linbpq on your Raspberry Pi and have a packet station going quickly.

Nov 172018
 

Today, I decided to get cuddly with the relevant RFCs and see if I could adapt them into something that would work for AX.25.  The following roughly describes how one might stuff IPv6 datagrams into AX.25.

Much of this is heavily influenced by RFC-4944 and RFC-6282, the latter of which looks to be the heart-and-soul of Thread.


Stateless Automatic Addressing

We have a mechanism by which an AX.25 call+SSID can be losslessly mapped to a 48-bit MAC address. This is built on Radix-50 and can work as a stand-in for the EUI-48. The pseudo EUI-48 procedure mentioned in section 6 of the RFC-4944 standard is not required.

An EUI-64 is generated from an EUI-48 by chopping the EUI-48 in half and inserting the bytes ff:fe in the middle. So the EUI-48:

00:11:22:33:44:55

becomes the following EUI-64:

00:11:22:ff:fe:33:44:55

SLAAC therefore will work the same way it does for Ethernet.

Frame format

1. AX.25 UI Frame header

Size: (17 + (D*7) bytes, where D is the number of digipeaters being used

  • PID = 1100 0101 (tentative) IPv6
  • Control = 0000 0011
    • Frame type: UI, P/F = 0 (final)
  • Must contain source and destination AX.25 callsigns, may contain up to 8 digipeater AX.25 callsigns.

For a direct station-to-station contact:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
AX.25 Flag (0x7e) Destination AX.25 Call+SSID (7 bytes)
Source AX.25 Call+SSID (7 bytes)
AX.25 Control
AX.25 PID AX.25 UI payload here

or for contact via a few digipeaters:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
AX.25 Flag (0x7e) Destination AX.25 Call+SSID (7 bytes)
Source AX.25 Call+SSID (7 bytes)
Digi 1 Call+SSID
Digi 2 Call+SSID
AX.25 Control AX.25 PID AX.25 UI payload here

2. Mesh Addressing Header

To be used when two stations are not able to directly communicate, or when multicasting.

In this scenario, the AX.25 frame source and destination indicate the addresses of the directly-communicating nodes (e.g. source and digipeater, intermediate digipeaters, or digipeater and destination), and the fields given here will be the addresses of the source and destination AX.25 stations.

e.g. sending from VK4MSL-0 to VK4MDL-9 via
VK4RZB-0 and VK4RZA-0:

  1. First transmission:
    • AX.25 Src: VK4MSL-0
    • AX.25 Dst: VK4RZB-0
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 7
  2. Intermediate hop:
    • AX.25 Src: VK4RZB-0
    • AX.25 Dst: VK4RZA-0
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 6
  3. Final delivery:
    • AX.25 Src: VK4RZA-0
    • AX.25 Dst: VK4MDL-9
    • Mesh Src: VK4MSL-0
    • Mesh Dst: VK4MDL-9
    • Hops: 5

Unlike 802.15.4, we do not have 16-bit short addresses. Since these bits would otherwise always be set to 0, we will use these to provide a 6-bit “hops left” field. We shall use the value 63 (0x3f) to indicate when there are 63 or more hops remaining.

We will use the raw 48-bit addresses here. In keeping with amateur radio conventions, the source and destinations are flipped compared to RFC-4944.

Header format (13 bytes):

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
1 0 Hops Left Dest Address
Source Address
Remaining AX.25 UI payload

3. Broadcast header

The broadcast header is used with multicast messaging. In this case, the source and destination addresses in the  mesh addressing header MUST have the “group” bit set.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1 6LP_BC0 Sequence Number

4. Fragmentation header

To be used when a IPv6 datagram is greater than L bytes, where L may be defined to be between 64 and 216 bytes.

This part is identical to that of RFC-4944 (section 5.3).  I’ll come back to this bit.

5. IPv6 datagram

This can be encoded in a number of ways depending on requirements:

5.1. Raw IPv6 datagram

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 1 6LP_IPV6 IPv6 Datagram (with payload)

6LP_IPV6 is the value 0x01, as per RFC-4944. The IPv6 datagram is encoded as per RFC-2460, and includes its payload.

The AX.25 frame is finished off with the frame-check sequence.

5.2. Compressed IPv6 datagram

In this format, the datagram fields are compressed, either through making static assumptions, or by deriving them from things such as the AX.25 header, or a previously agreed-to context.

The first field in such payloads is the 6LP_IPHC field:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 1 1 6LP_IPHC with CID=1 Context ID Data follows

or without the context ID

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 1 1 6LP_IPHC with CID=0 Data follows

The 6LP_IPHC field is a 13-bit field, optionally followed by a context ID extension byte. The bit allocations are as follows:

0 1 2 3 4 5 6 7 8 9 10 11 12
TF NH HLIM CID SAC SAM M DAC DAM
  • (MSB) 0-1: TF Traffic Class, Flow Label. See 5.2.1 below.
  • 2: NH Next Header encoding
    • =0: Given explicitly
    • =1: Encoded using 6LP_NHC
  • 3-4: HLIM Hop Limit
    • =00: Given explicitly
    • =01: is set to 1
    • =10: is set to 64
    • =11: is set to 255
  • 5: CID Context Identifier Extension
    • =0: No CID byte follows
    • =1: A CID byte follows
  • 6-8: SAC Source Address Compression / SAM Mode
    • =000: No compression applied, whole address given
    • =001: Prefix is link-local prefix, remaining bits are given.
    • =x10: Not used in 6LoWHAM (we don’t support 16-bit addresses)
    • =011: Prefix is link-local, figure the rest out from the source address in the AX.25 header.
    • =100: Unspecified address ::
    • =101: See the context for the prefix, remaining bits are given.
    • =111: Figure out the address from the AX.25 header and context.
  • (LSB) 9-12: M Multicast, DAC Destination Address Compression
    DAM Mode

    • =0000: No compression, not multicast, whole address given
    • =0001: Prefix is link-local prefix, remaining bits are given. Not multicast.
    • =xx10: Not used in 6LoWHAM (we don’t support 16-bit addresses)
    • =0011: Prefix is link-local, figure the rest out from the destination address in the AX.25 header. Not multicast.
    • =0100: Reserved
    • =0101: See the context for the prefix, remaining bits are given. Not multicast.
    • =0111: Figure out the address from the AX.25 header and context. Not multicast.
    • =1000: No compression, multicast address, whole address given
    • =1001: 48-bits of multicast address given, fill in the blanks: ff__::00__:____:____.
    • =1010: 32-bits of multicast address given, fill in the blanks: ff__::00__:____.
    • =1011: 8-bits of multicast address given, fill in the blanks: ff02::00__.
    • =1100: 48-bits RFC-3306/RFC-3956 address, ff__:__LL:PPPP:PPPP:PPPP:PPPP:____:____ where P and L come from the context.
    • =1101: Reserved
    • =1110: Reserved
    • =1111: Reserved

The context ID extension byte has the following format:

0 1 2 3 4 5 6 7
SCI DCI
  • (MSB) 0-3: Source Context Identifier
  • (LSB) 4-7: Destination Context Identifier

These two sub-fields indicate which specific context is being used to fill in the blanks.

5.2.1: Traffic Class and Flow Label

These may be partially or completely omitted depending on the TF setting in the previous field.

  • TF=00:
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
    ECN DCSP 0 0 0 0 Flow Label
  • TF=01:
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
    ECN 0 0 Flow Label
  • TF=10:
    0 1 2 3 4 5 6 7
    ECN DCSP
  • TF=11: Flow label, ECN and DCSP are set to 0.

5.2.2: Next Header

If 6LP_NHC is not explicitly enabled, the next header byte will appear next.

5.2.3: Hop Limit.

Again, if not explicitly defined in the 6LP_IPHC header, the hop-limit byte will appear next.

5.2.4: Source address

The format here is determined by the values of SAC/SAM:

  • 000: Entire IPv6 address, 16 bytes given here.
  • x01: Last 8-bytes of the address given here
  • For all other values, the source address is omitted.

5.2.5: Destination address

The format here is determined by the values of M/DAC/DAM:

  • x000: Entire IPv6 address, 16 bytes given here.
  • 0x01: Last 8-bytes of the address given here.
  • 1001: 6-bytes of address given here, fill-in-the-blanks.
  • 1010: 4-bytes of address given here, fill-in-the-blanks.
  • 1011: Last byte of address given here, fill-in-the-blank.
  • 1100: 6-bytes of address given here, fill-in-the-blanks.
  • For all other values, the destination address is omitted.

6. 6LoWPAN Next Header

This is used to encode selected IPv6 extensions or L4 protocol headers.

6.1. IPv6 extension headers

A select number of IPv6 extensions may be encoded by replacing the usual “Next Header” byte with the following:

0 1 2 3 4 5 6 7
1 1 1 0 EID N

where EID (bits 4-6) is one of:

  • =0 IPv6 Hop-By-Hop options
  • =1 IPv6 Routing
  • =2 IPv6 Fragment
  • =3 IPv6 Destination Options
  • =4 IPv6 Mobility
  • =7 IPv6 Header

and N (bit 7) indicates whether the header’s payload is followed by another 6LowPAN Next Header, or a regular IPv6 Next Header (with its “Next Header” byte). For EID=7, N MUST be 0.

Length fields within the header payload should be counted in bytes instead of 8-byte blocks.

7. Datagram payload

7.1. Non-UDP payloads

For payloads other than UDP packets, these should be inserted into the AX.25 payload as-is following the extensions.

UDP packets with uncompressed headers should also be inserted
in this manner.

7.2. UDP payloads with header compression

For these payloads, the following UDP header should be used:

0 1 2 3 4 5 6 7 L-15 L-14 L-13 L-12 L-11 L-10 L-9 L-8 L-7 L-6 L-5 L-4 L-3 L-2 L-1 L
1 1 1 1 0 C P Source Address Destination Address Checksum (unless C=1)
  • (MSB): bits 0-4: Compressed UDP header marker. Literal 11110₂
  • Bit 5: C Compressed UDP checksum
    • 0= UDP checksum is given (recommended value)
    • 1= UDP checksum is omitted
  • Bits 6-7: P Ports
    • 00=Both source and destination addresses are
      given in full

      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
      Source Port (16-bits) Destination Port (16-bits)
    • 01=Source port is given in full, Least significant 8-bits of destination given, destination port is 0xff00-0xffff (65280-65535)
      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
      Source Port (16-bits) Destination Port (8-bits)
    • 10=Destination port is given in full, Least significant 8-bits of source given, source port is 0xff00-0xffff (65280-65535)
      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
      Source Port (8-bits) Destination Port (16-bits)
    • 11=Only least significant 4-bits of source and destination ports are given. Port LSB range is 0xf0b0-0xf0bf (61616-61631)
      0 1 2 3 4 5 6 7
      Source Port (4-bits) Destination Port (4-bits)

The C bit should only be set if the upper-level application asks for it. Whilst 802.15.4 does its own CRC as does AX.25, the field is mandatory in UDP and the recommendation is to only drop it if the application says it’s okay.

Nov 162018
 

I’ve been doing more pondering on the routing side of things.  The initial thought was to use Net/ROM L3 to figure out the source route of who can hear whom.  Getting access to that via BPQ’s interfaces may not be easy unless we happen to eavesdrop on the broadcasts, and of course, there’s no service discovery in BPQ.

The thought came to me, what does ICMPv6 offer me in terms of routing?  If I can ask “who has a route to node X?” or announce “I know how to reach X”… I could just skip Net/ROM L3 altogether, since there’s a good chance both will co-exist quite happily on the same AX.25 network, we just take a ship-pass-in-the-night approach.

ICMPv6 Router Advertisements basically just say “I’m a router and this is my local prefix”, similarly, ICMPv6 Router Discovery messages just ask “Who’s a router?” … not greatly helpful.

RFC-4191 gets close, specifically the Route Information Option, but this is again, targeted at reaching a node on a differing subnet to the local one, and only applies to RAs.

This is a service that 802.15.4 actually provides to 6LoWPAN (RFC-4944), so from the point of view of IPv6, a 802.15.4 network translated through 6LoWPAN “looks” like one L2 network, it just needs to know a node’s extended address which is what made me think about Net/ROM L3 in the first place.

RFC-6775 looks to enhance things a little bit, by considering the fact that it’s not just one big happy family however, not everyone can talk to each-other, and links come and go.

One thing is clear, not everybody will be a router.  Specifically, a node should definitely not advertise itself as a router unless it can hear at least two other nodes, or knows routes to nodes learned either through static configuration or via Net/ROM node advertisements.

Two nodes can just exclusively use link-layer communications, so there’s no need for either to be “routers”.  Soon as a third joins in, potentially all three could be routers if they can all hear each-other, but if you have a linear topology where only the central node can hear the other two, it is logical that that node becomes a router, and not the others.

The question is then, if one of those peripheral nodes disappears, what should the router do?  I’m thinking it should remain a router for a limited period of time (configurable, but maybe measured in hours), just in case that node returns or other nodes appear.  After some time, it may “demote” itself to non-routing node status and relinquish control of the on-mesh prefix.

Where a node promotes itself to router, if an existing on-mesh prefix is in use, it should continue to use that, otherwise it should derive a suitable ULA prefix for use.

It may also follow that a ULA is configured for certain nodes, and they are configured to remain in the router role, regardless of the number of neighbours.  Repeater sites would be prime candidates for this.  They’re in a position where they should have good coverage, and thus should be prime candidates to be routers.

Since we can do multi-hop source routing with AX.25, there’s scope to perhaps exploit this in a higher level protocol which we might build on UDP messaging, as it doesn’t look like the existing standards provide for this sort of route path all that well.  TCP/IP is after all destination routed whilst AX.25 is source routed.

I think maybe tomorrow (which is predicted to be wet), it’ll be a good day to sit down and prototype something that maybe takes care of the IP messaging side of things and at least gets two AX.25 stations exchanging messages, then we can start to build something atop that.

Nov 152018
 

Having discussed the idea with a few people, both on the linux-hams mailing list and off-list, I’m starting to formalise a few plans for how this might work.

One option is to augment existing software stacks and inter-operate not just over-the-air, but at an API level.  Brisbane WICEN have a fleet of TNCs all running TheNet X1J, which was a popular Net/ROM software stack for TAPR TNC2-compatible TNCs in the early 90s.  Slowly, these are being replaced with Raspberry Pis equipped with Pi-TNCs and running LinBPQ.

These two inter-operate quite well, and the plan looks to be, to slowly upgrade all the sites to LinBPQ nodes.

Now, 6LoWHAM on TNCs that are nearly as old as I am just isn’t going to fly, but if I can link up to LinBPQ, this alternate protocol can be packaged up and installed along-side LinBPQ in an unobtrusive manner.

There are two things I need to be able to do:

  • Send and receive raw AX.25 frames
  • Read the routing table from LinBPQ

Sending and receiving raw frames

Looking at the interfaces that LinBPQ (and BPQ32) offers, the most promising option looks to be the AGWPE-compatible interface.  The protocol is essentially a TCP link over which the AX.25 frames are encapsulated and sent.

There’s a good description of the protocol here, and looking at the sources for LinBPQ (third link from the bottom of the page), it looks as if the necessary bits of the protocol are present to send and receive raw frames.

In particular, to send raw UI frames, I need to send these as ‘M’ (direct) or ‘V’ frames (via digipeater), and to receive them, I need to make use of the monitoring mode (‘m’ frame).

Reading the routing table

This, is where things will be “fun”.  The AGWPE interface does offer a “heard” frame, which can report on what stations have been heard.  This I think isn’t going to be the holy grail I’m after, although it’ll be a start, maybe.

Alternatively, a way around this might be to “eavesdrop” on the Net/ROM routing frames.  In monitor mode, I should theoretically hear all traffic, including these Net/ROM beacons.  It’s not as nice as being able to simply read LinBPQ’s routing table, but at least I don’t have to generate the Net/ROM messages.

The other way would be to connect to the terminal interface on LinBPQ, and use the NODES command, parsing that.  Ugly, but it’ll get me by.  On that same page is NRR… which looks to be similar in function to TCP/IP’s traceroute.  The feature is also supported by JNOS 2.0, which was released in 2006.  Not old by packet radio standards, but old enough.

Identifying if a remote station supports 6LoWHAM

Now, this is the tricky bit.  Identifying an immediate neighbour is easy enough, you can simply send an ICMPv6 neighbour solicitation message and see if they respond.  In fact, I’m thinking that could be the immediate first step.  There’s no support for service discovery as such, but nodes could advertise an “alias” (just one).

The best bet may be a suck-it-and-see approach.  We should be able to “digipeat” via intermediate nodes as if they were plain L2 AX.25 digipeaters, thus if we have a reason to contact a given node (i.e. there’s unicast traffic queued up to be sent there), we can just try routing an AX.25 frame with a ICMPv6 neighbour solicitation and see if we get a neighbour advertisement.

This carries a risk though: a station may not react well to unknown traffic and may try to parse the message as something it is not.  Thus for unicast, it is not a fail-safe method.

Multicast traffic however will be a challenge, and much of IPv6 relies on multicast.  The Net/ROM station will not know anything about this, as it simply wasn’t a concept back in the day.

For subnets like ff03::1, which on Thread networks usually means “all full-function Thread devices”, this could be sent via non-6LoWHAM digipeaters by broadcasting via that digipeater to the AX.25 station alias “6LHMC” (6LoWHAM Multicast).

This could be used to provide tunnelling of multicast traffic where a route to a station has been discovered via Net/ROM and we need to safely test whether the station can in fact understand 6LoWHAM traffic without the risk of crashing it.

I think the next step might be to look at how a normal IPv6 node would “register” interest in a multicast group so that routers between it and the sender of such a group know where to forward traffic.  IPv6 does have such a mechanism, and I think understanding how multicast traverses subnets is going to be key to making this work.

Nov 132018
 

Yeah, so our illustrious Home Affairs minister, Peter Dutton has come out pushing his agenda for a “back door” to encrypted messaging applications.  How someone so naïve got to be in such a position of power, I have no idea.   Perhaps “Yes, Minister” is more of a documentary than a comedy than I’d like to imagine.

It’s not the first time a politician has suggested the idea, and each time, I wonder how much training they’ve had in things like mathematics (particularly on prime numbers, exponentiation, remainders from division: these are the building blocks for algorithms like RSA, Diffie-Hellman, etc).

Now, they’ll tell us, “We’re not banning encryption, we just want access to ${MESSAGING_APPLICATION}”.  Sure, fine… but ${MESSAGING_APPLICATION} isn’t the only one, and these days, it isn’t impossible to imagine that someone with appropriate skills can write their own secure messaging application.  The necessary components are integral to every modern web browser.  Internet routers and IP cameras, many of which have poor security and are rarely patched, provide easy means to host the server-side component of such a system freely as well as an abundance of cheap VPS hosting, and as far as ways of “obscuring the meaning” of communications, we’re spoiled for choice!

So, shut one application down, they’ll just move to another.

Then there’s the slippery slope.  After compromising maybe a dozen applications by legal force, it’s likely there’ll be laws passed to ban all encryption.  Maybe our government should talk to the International Telegraph Union and ask how their 1880s ban on codewords worked out?

The thing is, for such surveillance to work, they have to catch each and every message, and scrutinise it for alternate meanings, and such meanings may not be obvious to third parties.  Hell, my choice of words and punctuation on this very website may be a “signal” to that tells someone to dress up as Big Bird and do the Chicken Dance in the Queen Street Mall.

This post (ignoring the delivery mechanism) isn’t encrypted, but could have hidden meanings with agreed parties.  That, and modern technology provides all kinds of ways to hide data in plain sight.

Is this a photo of a funny sign, or does it have a message buried within?

Digital cameras often rely on SD cards that are formatted with the FAT file system.  This is a file system which stores files as a linked list of clusters.  These clusters can wind up being stored out-of-order, a problem known as fragmentation.  Defragmentation tools were big business in the 90s.

FAT is used because it’s simple to implement and widely supported, and on SD cards, seek times aren’t a problem so fragmentation has less of an effect on performance.

It’s not hard to conceive of a steganography technique for sharing a one-time pad which exploits this property to use some innocuous photos on a SD card, arranged in such a way so that the 4kB clusters are randomised in their distribution.  The one-time pad would be shared almost right under the noses of postal workers unnoticed, since when they plug the SD card into their computer, it’ll just show photos that look “normal”.  The one time pad would reach its destination, then could be used for secret communications that could not be broken.

So, the upshot is banning encryption will be useless because such messages can be easily hidden from view even without encryption.

Then there’s the impact of these back doors.  The private keys to these back doors had better be very very VERY secure, because everyone’s privacy depends on them.  I mean EVERYONE.  Mr. Dutton included.

Bear in mind that the movie industry tried a similar approach for securing DVDs and Bluray discs.  It failed miserably.  CSS encryption keys used on some DVDs were discovered, then it was found that CSS was weak anyway and could be trivially brute-forced.  HDCP used in Bluray also has had its secret encryption key discovered.

See, suppose a ban was imposed.  Things like this blog, okay, you’ll be hitting it over clear-text, the way it had been for a number of years… and for me to log in, I’d have to do so over plain-text HTTP.  I’d probably just update it when at home, where I can use wired Ethernet to connect to the blog.  No real security issue there.  There’s a problem of code injection for my few visitors, it’d be nice to be able to digitally “sign” the page without encrypting it to avoid that problem.  I guess if this became the reality, we’d be looking into it.

Internet banking and other “sensitive” activities would be a problem though.  I do have Internet banking now, but it’s thankfully on a separate account to my main savings, so if that got compromised, you wouldn’t get a lot of cash, however identity theft is a very real risk.

Then there’s our workplaces.  My workplace happens to do work for Defence from time to time.  They look after the energy management systems on a few SE Queensland bases: Enoggera (Gallipoli Barracks), Amberley (yours truly interrogated the Ethernet switches to draw a map of that network, which I still have a few old copies of), Canungra, Oakey, … to name a few.

We rely on encryption to keep our remote access to those sites secure.  Take that away, and we either have to do all that work “in the clear”, or send people on site.  The latter is expensive, and in some cases, the people who have clearance to step on site don’t have all the domain knowledge, so they’ll be bringing others who are not cleared and “supervising” them.

Johnny Jihadist doesn’t have to break into a defence base, they just have to look on as a contractor “logs in”.  If the electrical and water meters on a site indicate minimal usage, then maybe the barracks are empty and they can strike.  You can actually infer a lot of information from the sorts of data collected by an EMS.  A scary amount.

So our national security actually depends on civilian encryption being as strong as government encryption.  Setting up 256-bit AES with 4096-bit RSA key agreement and authentication is a few clicks and is nearly impenetrable: back-door it, and it’s worthless.

Even if you break the encryption, there’s no guarantee that you’ll be able to find the message that you’re looking for.  Or you might just wind up harassing some poor teenager that uploaded a cute but grainy kitten photo because you thought the background noise in the JPEG was some sort of coded message.

I think if we’re going to get on top of national security issues, the answer is not to spy on each other, it’s to openly talk to each other.  Get to know those around you, and accept each other’s differences.  Colonel Klink didn’t have any luck with the iron fist approach, what makes today’s ministers think they are different?

Nov 102018
 

Right now, the cluster is running happily with a Redarc BCDC-1225 solar controller, a Meanwell HEP-600C-12 acting as back-up supply, a small custom-made ATTiny24A-based power controller which manages the Meanwell charger.

The earlier purchased controller, a Powertech MP-3735 now is relegated to the function of over-discharge protection relay.  The device is many times the physical size of a VSR, and isn’t a particularly attractive device for that purpose.  I had tried it recently as a solar controller, but it’s fair to say, it’s rubbish at it.  On a good day, it struggles to keep the battery above “rock bottom” and by about 2PM, I’ll have Grafana pestering me about the battery slipping below the 12V minimum voltage threshold.

Actually, I’d dearly love t rip that Powertech controller apart and see what makes it tick (or not in this case).  It’d be an interesting study in what they did wrong to give such terrible results.

So, if I pull that out, the question is, what will prevent an over-discharge event from taking place?  First, I wish to set some criteria, namely:

  1. it must be able to sustain a continuous load of 30A
  2. it should not induce back-EMF into either the upstream supply or the downstream load when activated or activated
  3. it must disconnect before the battery reaches 10.5V (ideally it should cut off somewhere around 11-11.5V)
  4. it must not draw excessive power whilst in operation at the full load

With that in mind, I started looking at options.  One of the first places I looked was of course, Redarc.  They do have a VSR product, the VS12 which has a small relay in it, rated for 10A, so fails on (1).  I asked on their forums though, and it was suggested that for this task, a contactor, the SBI12, be used to do the actual load shedding.

Now, deep inside the heart of the SBI12 is a big electromechanical contactor.  Many moons ago, working on an electric harvester platform out at Laidley for Mulgowie Farming Company, I recall we were using these to switch the 48V supply to the traction motors in the harvester platform.  The contactors there could switch 400A and the coils were driven from a 12V 7Ah battery, which in the initial phases, were connected using spade lugs.

One day I was a little slow getting the spade lug on, so I was making-breaking-making-breaking contact.  *WHACK*… the contactor told me in no uncertain terms it was not happy with my hesitation and hit me with a nice big back-EMF spike!  I had a tingling arm for about 10 minutes.  Who knows how high that spike was… but it probably is higher than the 20V absolute maximum rating of the MIC29712s used for power regulation.  In fact, there’s a real risk they’ll happily let such a rapidly rising spike straight through to the motherboards, frying about $12000 worth of computers in the process!

Hence why I’m keen to avoid a high back-EMF.  Supposedly the SBI12 “neutralises” this … not sure how, maybe there’s a flywheel diode or MOV in there (like this), or maybe instead of just removing power in a step function, they ramp the current down over a few seconds so that the back-EMF is reduced.  So this isn’t an issue for the SBI12, but may be for other electromechanical contactors.

The other concern is the power consumption needed to keep such a beast activated.  The other factor was how much power these things need to stay actuated.  There’s an initial spike as the magnetic field ramps up and starts drawing the armature of the contactor closed, then it can drop down once contact has been made.  The figures on the SBI12 are ~600mA initially, then ~160mA when holding… give or take a bit.

I don’t expect this to be turned on frequently… my nodes currently have up-times around 172 days.  So while 600mA (7~8W at 12V nominal) is high, that’ll only be for a second at most.  Much of the current will be holding current at, let’s call it 200mA to be safe, so about 2~3W.

That 2-3W is going to be the same, whether my nodes collectively draw 10mA, 10A or 100A.

It seemed like a lot, but then I thought, what about a SSR?  You can buy a 100A DC SSR like this for a lot less money than the big contactors.  Whack a nice big heat-sink on it, and you’re set.  Well, why the heat-sink?  These things have a voltage drop and on resistance.  In the case of the Jaycar one, it’s about 350mV and the on resistance is about 7mΩ.

Suppose we were running flat chat at our predicted 30A maximum…

  • MOSFET switch voltage drop: 30A × 350mV = 10.5W
  • Ron resistance voltage drop: (30A)² × 7mΩ = 6.3W
  • Total power dissipation: 10.5W + 6.3W = 16.8W OUCH!

16.8W is basically the power of an idle compute node.  The 3W of the SBI12 isn’t looking so bad now!  But can we do better?

The function of a solid-state relay, amongst other things, is to provide electrical isolation between the control and switching components.  The two are usually galvanically isolated.  This is a feature I really don’t need, so I could reduce costs by just using a bare MOSFET.

The earlier issues I had with the body diode won’t be a problem here as there’s a definite “source” and “load”, there’ll be no current to flow out of the load back to the source to confuse some sensing circuit on the source side.  This same body diode might be an issue for dual-battery systems, as the auxiliary battery can effectively supply current to a starter motor via this body diode, but in my case, it’s strictly switching a load.

I also don’t have inductive loads on my system, so a P-channel MOSFET is an option.  One candidate for this is the Infineon AUIRFS3004-7P.  The Ron on these is supposedly in the realm of 900µΩ-1.25mΩ, and of course, being that it’s a bare MOSFET and not a SSR, there’s no voltage drop.  Thus my power dissipation at 30A is predicted to be a little over 1W.

There are others too with even smaller Ron values, but they are in teeny tiny 5mm square surface-mount packages.  The AUIRFS3004-7P looks dead-buggable, just bend up the gate pin so I can solder direct to it, and treat the others as single “pins”, then strap the sucker to a big heatsink (maybe an old PIII heatsink will do the trick).

I can either drive this MOSFET with something of my own creation, or with the aforementioned Redarc VS12.  The VS12 still does contain a (much smaller) electromechanical relay, but at 30mA (~400mW), it’s bugger all.

The question though was what else could be done?  @WIRING_SOLUTIONS suggested some units made by Victron Energy.  These do have a nice feature in that they also have over-voltage protection, and conveniently, it’s 16V, which is the maximum recommended for the MIC29712s I’m using.  They’re not badly priced, and are solid-state.

However, what’s the Ron, what’s the voltage drop?  Victron don’t know.  They tell me it’s “minimal”, but is that 100nV, 100mV, 1V?  At 30A, 100mV drop equates to 3W, on par with the SBI12.  A 500mV drop would equate to a whopping 15W!

I had a look at the suppliers for Victron Energy products, and via those, found a few other contenders such as this one by Baintech and the Projecta LVD30.  I haven’t asked about these, but again, like the Victron BatteryProtect, neither of these list a voltage drop or Ron.

There’s also this one from Jaycar, but given this is the same place that sold me the Powertech MP-3735, and sold me the original Powertech MP-3089, provided a replacement for that first one, then also replaced the replacement under RMA.  The Jaycar VSR also has practically no specs… yeah, I think I’ll pass!

Whitworths marine sell this, it might be worth looking at but the cut-out voltage is a little high, and they don’t actually give the holding current (330mA “engage” current sounds like it’s electromechanical), so no idea how much power this would dissipate either.

The power controller isn’t doing a job dissimilar to a VSR… in fact it could be repurposed as one, although I note its voltage readings seem to drift quite a lot.  I suspect this is due to the choice of 5% tolerance resistors on the voltage sensing circuit and my use of the ~1.1V internal voltage reference.  The resistors will drift a little bit, and the voltage reference can be anywhere from 1.0 to 1.2V.

Would a LM311N with good quality 1% resistors and a quality voltage reference be “better”?  Who knows?  Maybe I should try an experiment, see if I can get minimal drift out of a LM311N.  It’s either the resistors, the voltage reference, or a combination of the two that’s responsible for the power controller’s drift.

Perhaps I need to investigate which is causing the problem and see what can be done in the design to reduce it.  If I can get acceptable results, then maybe the VS12 can be dispensed with.  I may be able to do it with another ATTiny24A, or even just a simple LM311N.