Sep 192021

I stumbled across this article regarding the use of TCP over sensor networks. Now, TCP has been done with AX.25 before, and generally suffers greatly from packet collisions. Apparently (I haven’t read more than the first few paragraphs of this article), implementations TCP can be tuned to improve performance in such networks, which may mean TCP can be made more practical on packet radio networks.

Prior to seeing this, I had thought 6LoWHAM would “tunnel” TCP over a conventional AX.25 connection using I-frames and S-frames to carry TCP segments with some header prepended so that multiple TCP connections between two peers can share the same AX.25 connection.

I’ve printed it out, and made a note of it here… when I get a moment I may give this a closer look. Ultimately I still think multicast communications is the way forward here: radio inherently favours one-to-many communications due to it being a shared medium, but there are definitely situations in which being able to do one-to-one communications applies; and for those, TCP isn’t a bad solution.

Comments having read the article

So, I had a read through it. The take-aways seem to be this:

  • TCP was historically seen as “too heavy” because the MCUs of the day (circa 2002) lacked the RAM needed for TCP data structures. More modern MCUs have orders of magnitude more RAM (32KiB vs 512B) today, and so this is less of an issue.
    • For 6LoWHAM, intended for single-board computers running Linux, this will not be an issue.
  • A lot of early experiments with TCP over sensor networks tried to set a conservative MSS based on the actual link MTU, leading to TCP headers dominating the lower-level frame. Leaning on 6LoWPAN’s ability to fragment IP datagrams lead to much improved performance.
    • 6LoWHAM uses AX.25 which can support 256-byte frames; vs 128-byte 802.15.4 frames on 6LoWPAN. Maybe gains can be made this way, but we’re already a bit ahead on this.
  • Much of the document considered battery-powered nodes, in which the radio transceiver was powered down completely for periods of time to save power, and the effects this had on TCP communications. Optimisations were able to be made that reduced the impact of such power-down events.
    • 6LoWHAM will likely be using conventional VHF/UHF transceivers. Hand-helds often implement a “battery saver” mode — often this is configured inside the device with no external control possible (thus it will not be possible for us to control, or even detect, when the receiver is powered down). Mobile sets often do not implement this, and you do not want to frequently power-cycle a modern mobile transceiver at the sorts of rates that 802.15.4 radios get power-cycled!
  • Performance in ideal conditions favoured TCP, with the article authors managing to achieve 30% of the raw link bandwidth (75kbps of a theoretical 250kbps maximum), with the underlying hardware being fingered as a possible cause for performance issues.
    • Assuming we could manage the same percentage; that would equate to ~360bps on 1200-baud networks, or 2.88kbps on 9600-baud networks.
  • With up to 15% packet loss, TCP and CoAP (its nearest contender) can perform about the same in terms of reliability.
  • A significant factor in effective data rate is CSMA/CA. aioax25 effectively does CSMA/CA too.

Its interesting to note they didn’t try to do anything special with the TCP headers (e.g. Van Jacobson compression). I’ll have to have a look at TCP and see just how much overhead there is in a typical segment, and whether the roughly double MTU of AX.25 will help or not: the article recommends using MSS of approximately 3× the link MTU for “fair” conditions (so ~384 bytes), and 5× in “good” conditions (~640 bytes).

It’s worth noting a 256-byte AX.25 frame takes ~2 seconds to transmit on a 1200-baud link. You really don’t want to make that a habit! So smaller transmissions using UDP-based protocols may still be worthwhile in our application.

Sep 162021

So, one evening I was having difficulty sleeping, so like some people count sheep, turned to a different problem…6LoWPAN relies on all nodes sharing a common “context”. This is used as a short-hand to “compress” the rather lengthy IPv6 addresses for allowing two nodes to communicate with one another by substituting particular IPv6 address subnets with a “context number” which can be represented in 4 bits.

Fundamentally, this identifier is a stand-in for the subnet address. This was a sticking-point with earlier thoughts on 6LoWHAM: how do we agree on what the context should be? My thought was, each network should be assigned a 3-bit network ID. Why 3-bit? Well, this means we can reserve some context IDs for other uses. We use SCI/DCI values 0-7 and leave 8-15 reserved; I’ll think of a use for the other half of the contexts.

The node “group” also share a SSID; the “group” SSID. This is a SSID that receives all multicast traffic for the nodes on the immediate network. This might be just a generic MCAST-n SSID, where n is the network ID; or it could be a call-sign for a local network coordinator, e.g. I might decide my network will use VK4MSL-0 for my group SSID (network 0). Probably nodes that are listening on a custom SSID should still listen for MCAST-n traffic, in case a node is attempting to join without knowing the group SSID.

AX.25 allows for 16 SSIDs per call-sign, so what about the other 8? Well, if we have a convention that we reserve SSIDs 0-7 for groups; that leaves 8-15 for stations. This can be adjusted for local requirements where needed, and would not be enforced by the protocol.

Joining a network

How does a new joining node “discover” this network? Firstly, the first node in an area is responsible for “forming” the network — a node which “forms” a network must be manually programmed with the local subnet, group SSID and other details. Ensuring all nodes with “formation” capability for a given network is beyond the scope of 6LoWHAM.

When a node joins; at first it only knows how to talk to immediate nodes. It can use MCAST-n to talk to immediate neighbours using the fe80::/64 subnet. Anyone in earshot can potentially reply. Nodes simply need to be listening for traffic on a reserved UDP port (maybe 61631; there’s an optimisation in 6LoWPAN for 61616-61631). The joining node can ask for the network context, maybe authenticate itself if needed (using asymmetric cryptography – digital signatures, no encryption).

The other nodes presumably already know the answer, but for all nodes to reply simultaneously, would lead to a pile-up. Nodes should wait a randomised delay, and if nothing is heard in that period, they then transmit what they know of the context for the given network ID.

The context information sent back should include:

  • Group SSID
  • Subnet prefix
  • (Optional) Authentication data:
    • Public key of the forming network (joining node will need to maintain its own “trust” database)
    • Hash of all earlier data items
    • Digital signature signed with included public key

Once a node knows the context for its chosen network, it is officially “joined”.

Routing to non-local endpoints

So, a node may wish to send a message to another node that’s not directly reachable. This is, after-all, the whole point of using a routing protocol atop AX.25. If we knew a route, we could encode it in the digipeater path, and use conventional AX.25 source routing. Nodes that know a reliable route are encouraged to do exactly that. But what if you don’t know your way around?

APRS uses WIDEN-n to solve this problem: it’s a dumb broadcast, but it achieves this aim beautifully. n just stands for the number of hops, and it gets decremented with each hop. Each digipeater inserts itself into the path as it sends the frame on. APRS specs normally call for everyone to broadcast all at once, pile-up be damned. FM capture effect might help here, but I’m not sure its a good policy. Simple, but in our case, we can do a little better.

We only need to broadcast far enough to reach a node that knows a route. We’ll use ROUTE-n to stand for a digipeater that is no more than n hops away from the station listed in the AX.25 destination field. n must be greater than 0 for a message to be relayed. AX.25 2.0 limits the number of digipeaters to 8 (and 2.2 to 2!), so naturally n cannot be greater than 8.

So we’ll have a two-tier approach.

Routing from a node that knows a viable route

If a node that receives a ROUTE-n destination message, knows it has a good route that is n or less hops away from the target; it picks a randomised delay (maybe 0-5 seconds range), and if no reply is heard from another node; it relays the message: the ROUTE-n is replaced by its own SSID, followed by the required digipeater path to reach the target node.

Routing from a node that does not know a viable route

In the case where a node receives this same ROUTE-n destination message, does not know a route, and hasn’t heard anyone else relay that same message; it should pick a randomised delay (5-10 second range), and if it hasn’t heard the message relayed via a specific path in that time, should do one of the following:

If n is greater than 1:

Substitute ROUTE-n in the digipeater path with its own SSID followed by ROUTE-(n-1) then transmit the message.

If n is 1 (or 0):

Substitute ROUTE-n with its own SSID (do not append ROUTE-0) then transmit the message.

Routing multicast traffic

Discovering multicast listeners

I’ll have to research MLD (RFC-3810 / RFC-4604), but that seems the sensible way forward from here.

Relaying multicast traffic

If a node knows of downstream nodes that ordinarily rely on it to contact the sender of a multicast message, and it knows the downstream nodes are subscribers to the destination multicast group, it should wait a randomised period, and forward the message on (appending its SSID in the digipeater path) to the downstream nodes.

Application thoughts

I think I have done some thoughts on what the applications for this system may be, but the other day I was looking around for “prior art” regarding one-to-many file transfer applications.

One such system that could be employed is UFTP. Yes, it mentions encryption, but that is an optional feature (and could be useful in emcomm situations). That would enable SSTV-style file sharing to all participants within the mesh network. Its ability to be proxied also lends itself to bridging to other networks like AMPRnet, D-Star packet, DMR and other systems.