December 2020

Cataloguing Chaos

So, for the last 2 years, I’ve been trying to keep multiple projects on the go, then others come along and pile their own projects on top. It kinda makes a mess of one’s free time, including for things like keeping on top of where things have been put.

COVID-19 has not helped here, as it’s meant I’ve lugged a lot of gear that belongs to my workplace, or belongs at my workplace, home, to use there. This all needs tracking to ensure nothing is lost.

Years ago, I threw together a crude parts catalogue system. It was built on Django, django-mptt and PostgreSQL, and basically abused the admin part of Django to manage electronic parts storage.

I later re-purposed some of its code for an estate database for my late grandmother: I just wrote a front-end so that members of the family could be given login accounts, and “claim” certain items of the estate. In that sense, the concept was extremely powerful.

The overarching principle of how both these systems worked is that you had “items” stored within “locations”. Locations were in a tree-structure (hence django-mptt) where a location could contain further “locations”… e.g. a root-level location might be a bedroom, within that might be a couple of wardrobes and drawers, and there might be containers within those.

You could nest locations as deeply as you liked. In my parts database, I didn’t consider rooms, but I’d have labelled boxes like “IC Parts 1”, “IC Parts 2”, these were Plano StowAway 932 boxes… which work okay, although I’ve since discovered you don’t leave the inner boxes exposed to UV light: the plastic becomes brittle and falls apart.

The inner boxes themselves were labelled by their position within the outer box (row, column), and each “bin” inside the inner box was labelled by row and column.

IC tubes themselves were also labelled, so if I had several sitting in a box, I could identify them and their location. Some were small enough to fit inside these boxes, others were stored in large storage tubs (I have two).

If I wanted to know where I had put some LM311 comparators, I might look up the database and it’d tell me that there were 3 of them in IC Box 1/Row 2/Row 3/Column 5. If luck was on my side, I’d go to that box, pull out the inner box, open it up and find what I was looking for plugged into some anti-static foam or stashed in a small IC tube.

The parts themselves were fairly basic, just a description, a link to a data sheet, and some other particulars. I’d then have a separate table that recorded how many of each part was present, and in which location.
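
The shape of that data can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the original Django/django-mptt models: the class names `Location` and `Stock` and the helper `where_is` are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Location:
    """A storage location; parent is None for a root-level location."""
    name: str
    parent: Optional["Location"] = None

    def path(self) -> str:
        """Return the full path from the root, e.g. 'IC Box 1/Row 2'."""
        parts = []
        node: Optional["Location"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


@dataclass
class Stock:
    """How many of a given part sit in a given location."""
    part: str
    location: Location
    quantity: int


def where_is(part: str, stock: List[Stock]) -> List[Tuple[str, int]]:
    """List (location path, quantity) for every stock record of the part."""
    return [(s.location.path(), s.quantity) for s in stock if s.part == part]


# Rebuild the example path quoted above, one nesting level at a time.
box = Location("IC Box 1")
inner = Location("Row 2", parent=box)
row = Location("Row 3", parent=inner)
bin_ = Location("Column 5", parent=row)

stock = [Stock("LM311", bin_, 3)]
print(where_is("LM311", stock))
```

Keeping quantities in a separate `Stock` record, rather than on the part itself, is what lets the same part appear in several locations at once.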

So from the locations perspective, it did everything I wanted, but parametric search was out of the question.

The place here looks like a tip now, so I really do need to get on top of what I have, so much so I’m telling people no more projects until I get on top of what I have now.

Other solutions exist. OpenERP had a warehouse inventory module, and I suspect Odoo continues this, but it’s a bit of a beast to try and figure out and it seems customisation has been significantly curtailed from the OpenERP days.

PartKeepr (if you can tolerate deliberate bad spelling) is another option. It seems to have very good parametric search of parts, but one downside is that it has a flat view of locations. There’s a proposal to enhance this, but it’s been languishing for 4 years now.

VRT used to have a semi-active track-and-trace business built on a tracking software package called P-Trak. P-Trak had some nice ideas (including a surprisingly modern message-passing back-end, even if it was a proprietary one), but is overkill for my needs, and it’s a pain to try and deploy, even if I was licensed to do so.

That doesn’t mean though I can’t borrow some ideas from it. It integrated barcode scanners as part of the user interface, something these open-source part inventory packages seem to overlook. I don’t have a dedicated barcode scanner, but I do have a phone with a camera, and a webcam on my netbook. Libraries exist to do this from a web browser, such as this one for QR codes.

My big problem right now is the need to do a stock-take to see what I’ve still got, and what I’ve added since then, along with where it has gone. I’ve got a lot of “random boxes” now which are unlabelled, and just have random items thrown in due to lack-of-time. It’s likely those items won’t remain there either. I need some frictionless way to record where things are getting put. It doesn’t matter exactly where something gets put, just so long as I record that information for use later. If something is going to move to a new location, I want to be able to record that with as little fuss as possible.

So the thinking is this:

  • Print labels for all my storage locations with UUIDs stored as barcodes
  • Enter those storage locations into a database using the UUIDs allocated
  • Expand (or re-write) my parts catalogue database to handle these UUIDs:
    • adding new locations (e.g. when a consignment comes in)
    • recording movement of containers between parent locations
    • sub-dividing locations (e.g. recording the content of a consignment)
    • (partial and complete) merging locations (e.g. picking parts from stock into a project-specific container)
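
The operations in that list can be sketched as a small in-memory registry keyed by UUID. Everything here is hypothetical (the `Inventory` class and its method names are invented for illustration, and a real version would sit on a database), but it shows the three moving parts: adding a location, re-parenting it, and shifting parts between locations.

```python
import uuid
from typing import Dict, Optional


class Inventory:
    """Hypothetical registry: location UUID -> parent UUID, plus contents."""

    def __init__(self) -> None:
        self.parent: Dict[uuid.UUID, Optional[uuid.UUID]] = {}
        self.items: Dict[uuid.UUID, Dict[str, int]] = {}

    def add_location(self, parent: Optional[uuid.UUID] = None) -> uuid.UUID:
        """Register a new location (e.g. a freshly labelled box)."""
        if parent is not None and parent not in self.parent:
            raise KeyError("unknown parent location")
        loc = uuid.uuid4()
        self.parent[loc] = parent
        self.items[loc] = {}
        return loc

    def move(self, loc: uuid.UUID, new_parent: Optional[uuid.UUID]) -> None:
        """Record that a container now lives inside a different parent."""
        node = new_parent
        while node is not None:
            # Walking up from the new parent must never reach loc itself.
            if node == loc:
                raise ValueError("cannot move a location inside itself")
            node = self.parent[node]
        self.parent[loc] = new_parent

    def transfer(self, src: uuid.UUID, dst: uuid.UUID,
                 part: str, qty: int) -> None:
        """Partial or complete merge: shift qty of a part from src to dst."""
        have = self.items[src].get(part, 0)
        if qty > have:
            raise ValueError("not enough stock in the source location")
        if have - qty:
            self.items[src][part] = have - qty
        else:
            self.items[src].pop(part, None)
        self.items[dst][part] = self.items[dst].get(part, 0) + qty


inv = Inventory()
shelf = inv.add_location()
box = inv.add_location(parent=shelf)
project = inv.add_location()

inv.items[box]["LM311"] = 3
inv.transfer(box, project, "LM311", 2)  # pick two parts for a project
inv.move(box, project)                  # then park the box with the project
```

The cycle check in `move` is the one piece that is easy to forget: without it, a careless scan could put a box inside one of its own compartments.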

The first step on this journey is to catalogue the storage containers I have now. Some are already entered into the old system, so I’ve grabbed a snapshot of that and can pick through it. Others are new boxes that have arrived since, and had additional things thrown in.

I looked at ways I could label the boxes. Previously that was a spirit pen hand-writing a label, but this does not scale. If I’m to do things efficiently, then a barcode seems the logical way to go since it uses what I already have.

Something new comes in? Put a barcode on the box, scan it, enter it into the system as a new location, then mark where that box is being stored by scanning the location barcode where I’ll put the box. Later, I’ll grab the box, open it up, and I might repeat the process with any IC tubes or packets of parts inside, marking them as being present inside that box.

Need something? Look up where it is, then “check it out” into my work area… now, ideally when I’m finished, it should go back there, but if I’m in a hurry, I just throw it in a box, somewhere, then record that I put it there. Next time I need it, I can look up where it is. Logical order isn’t needed up front, and can come later.

So, step 1 is to label all the locations. Since I’m doing this before the database is fully worked-out and I want to avoid ID clashes, I’m using UUIDs to label all the locations. Initially I thought of QR codes, but then realised some of the “locations” are DIP IC storage tubes, which do not permit large square labels. I did some experiments with Code-128, but found it was near impossible to reliably encode a UUID that way: my phone had difficulty recognising the entire barcode.
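
Some rough arithmetic suggests why Code-128 struggles here. The sketch below assumes the UUID is encoded as its usual 36-character text form with one symbol character per text character (no switching to the denser digit-pair code set), and that it has to fit across a 38.1mm-wide label like the ones used in this post. In Code 128 the start, data and check characters are each 11 modules wide and the stop pattern is 13 modules.

```python
def code128_modules(data_chars: int) -> int:
    """Modules in a Code 128 symbol, one symbol character per data character.

    Start + data + check characters are 11 modules each; the stop pattern
    is 13 modules. Quiet zones are not included.
    """
    return 11 * (data_chars + 2) + 13


QUIET_ZONE_MODULES = 10   # minimum quiet zone on each side of the symbol
UUID_TEXT_LENGTH = 36     # 32 hex digits plus 4 hyphens
LABEL_WIDTH_MM = 38.1     # assumed printable width for this estimate

modules = code128_modules(UUID_TEXT_LENGTH) + 2 * QUIET_ZONE_MODULES
print(modules)                             # 451
print(round(LABEL_WIDTH_MM / modules, 3))  # 0.084 (mm per module)
```

Under those assumptions the narrowest bar would be well under a tenth of a millimetre, which is a lot to ask of an inkjet label and a phone camera.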

I returned to the idea of QR-codes, and found that my phone will scan a 10mm×10mm QR code printed on a page. That’s about the right height for the side of an IC tube. We had some inkjet labels kicking around, small 38.1×21.2mm labels arranged in a 5×13 grid (Avery J8651/L7651 layout). Could I make a script that generated a page full of QR codes?

Turns out, pylabels will do this. It is built on reportlab which amongst other things, embeds a barcode generator that supports various symbologies including QR codes. @hugohadfield had contributed a pull request which demonstrated using this tool with QR codes. I just had to tweak this for my needs.

# This file is part of pylabels, a Python library to create PDFs for printing
# labels.
# Copyright (C) 2012, 2013, 2014 Blair Bonnett
#
# pylabels is free software: you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# pylabels is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# pylabels.  If not, see <http://www.gnu.org/licenses/>.

import uuid

import labels
from reportlab.graphics.barcode import qr
from reportlab.lib.units import mm

# Create A4 portrait (210mm x 297mm) sheets with 5 columns and 13 rows of
# labels. Each label is 38.1mm x 21.2mm with a 2mm rounded corner. The margins
# are given explicitly; pylabels calculates the gaps between labels.
specs = labels.Specification(210, 297, 5, 13, 38.1, 21.2, corner_radius=2,
        left_margin=6.7, right_margin=3, top_margin=10.7, bottom_margin=10.7)

def draw_label(label, width, height, obj):
    size = 12 * mm
    label.add(qr.QrCodeWidget(
            str(uuid.uuid4()),
            barHeight=height, barWidth=size, barBorder=2))

# Create the sheet.
sheet = labels.Sheet(specs, draw_label, border=True)

sheet.add_labels(range(1, 66))

# Save the file and we are done.
sheet.save('basic.pdf')
print("{0:d} label(s) output on {1:d} page(s).".format(sheet.label_count, sheet.page_count))

The alignment is slightly off, but not severely. I’ll fine tune it later. I’m already through about 30 of those labels. It’s enough to get me started.

For the larger J8165 2×4 sheets, the following specs work. (I can see this being a database table!)

# Specifications for Avery J8165 2×4 99.1×67.7mm
specs = labels.Specification(210, 297, 2, 4, 99.1, 67.7, corner_radius=3,
        left_margin=5.5, right_margin=4.5, top_margin=13.5, bottom_margin=12.5)
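
As a minimal sketch of that "database table" idea, the two layouts above can live in a dictionary keyed by Avery code. The key names are chosen to mirror what appear to be pylabels' `Specification` argument names, which is an assumption worth checking against the library; the helper `gaps` is invented here and simply works out how much space the given margins leave between labels, as a sanity check on the numbers.

```python
LABEL_SPECS = {
    # Avery J8651/L7651: 5 columns x 13 rows of 38.1 x 21.2 mm labels
    "L7651": dict(sheet_width=210, sheet_height=297, columns=5, rows=13,
                  label_width=38.1, label_height=21.2, corner_radius=2,
                  left_margin=6.7, right_margin=3,
                  top_margin=10.7, bottom_margin=10.7),
    # Avery J8165: 2 columns x 4 rows of 99.1 x 67.7 mm labels
    "J8165": dict(sheet_width=210, sheet_height=297, columns=2, rows=4,
                  label_width=99.1, label_height=67.7, corner_radius=3,
                  left_margin=5.5, right_margin=4.5,
                  top_margin=13.5, bottom_margin=12.5),
}


def gaps(spec):
    """Return the (column, row) gap in mm left between adjacent labels."""
    used_w = (spec["left_margin"] + spec["right_margin"]
              + spec["columns"] * spec["label_width"])
    used_h = (spec["top_margin"] + spec["bottom_margin"]
              + spec["rows"] * spec["label_height"])
    col_gap = (spec["sheet_width"] - used_w) / (spec["columns"] - 1)
    row_gap = (spec["sheet_height"] - used_h) / (spec["rows"] - 1)
    return round(col_gap, 2), round(row_gap, 2)


for name, spec in LABEL_SPECS.items():
    print(name, gaps(spec))
```

A negative gap from `gaps` would mean the margins and label sizes do not fit on the sheet, which is a cheap check to run before wasting a page of labels.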

Later when I get the database ready (standing up a new VM to host the database and writing the code) I can enter this information in and get back on top of my inventory once again.

Re-building a Logitech Headset – Part III

So, a while back I tore apart an old Logitech wireless headset with the intention of using its bits to make a wireless USB audio interface. I was undecided whether the headset circuitry would “live” in a new headset, or whether it’d be a separate unit to which I could attach any headset.

I ended up doing the latter. I found through Mouser a suitable enclosure for the original circuitry and have fitted it with cable glands and sockets for the charger input (which now sports a standard barrel jack) and a DIN-5 connector for the earpiece/microphone connections.

The first thing to do was to get rid of that proprietary power connector. The two outer contacts are the +6V and 0V pins, shown here in orange and white/orange coloured cable respectively. I used a blob of heat-melt glue to secure it so I didn’t rip pads off.

Replacing the power connector. +6V is orange, 0V is orange/white.

The socket is “illuminated” by a LED on the PCB. Maybe I’ll look at some sort of light-pipe arrangement to bring that outside, we’ll see.

The other end just got wired to a plain barrel jack. A future improvement might be to fit a 6V DC-DC converter, allowing me to plug in any old 12V source, but for now, this’ll do. I just have to remember to watch what lead I grab. Whilst I was there, I also put in a cable gland for the audio interface connection.

Power socket and audio connections mounted in case.

One challenge with the board design is that there is not one antenna, but two, plus some rather lumpy tantalum capacitors near the second antenna. I suspect the two antennas are for handling polarisation, which will shift as you move your head and as the signal propagates. Either way, they meant the PCB wouldn’t sit “flat”. No problem, I had some old cardboard boxes which provided the solution:

PCB spacer, with cut-out for high-clearance parts.

The cardboard is a good option since it’s readily available and won’t attenuate the 2.4GHz signal much. It was also easy to work with.

I haven’t exposed the three push-buttons on that side of the PCB at this stage. I suppose drilling a hole and making a small “poker” to hit the buttons isn’t out of the question. This isn’t much different to what Logitech’s original case did. I’ll tackle that later. I need a similar solution for the slide-switch used for power.

One issue I faced was wrangling the now over-length FFC that linked the two sides. Previously, this spanned the headband, but now it only needed to reach a few centimetres at most. Eyeballing the original cable, I found this short replacement. I’ll have to figure out how to mount that floating PCB somehow, but at least it’s a clean solution.

Replacement FFC.

At this point, it was a case of finishing the wiring. I haven’t tried any audio as yet, that will come in time. It still powers up and sees the transceiver, so there’s still “life” in this.

Powering up post-surgery.

I plugged it into its charger and let it run for a while just to top up the LiPo cell.

Charging for the first time since mounting.

One thing I’m not happy with is the angle the battery is sitting at, since it’s just a bit wider than the space between the mounting posts. I might try shaving some material off the posts to see if I can get the battery to sit “flat”. I only need about 1mm, which should still allow enough clearance for the screwdriver and screw to pass the cell safely.

The polarity of the speakers is a guess on my part. Neither end seemed to be grounded, so hopefully the drivers don’t mind being “common-ed”; otherwise I might need to cram some small isolation transformers in there.

Replacing the battery in a Hema HX-1

So, some years back, my father purchased a Hema HX-1 GPS navigator to replace an older Hema unit… and for the past few years, the unit itself has performed just fine.

Then, about a month ago, he goes out to the car, and sees that the GPS has attempted a partial deconstruction. The LiPo cell inside had expanded, partially popping the case open. The unit was unresponsive to external power input, or really, any input.

Since the job of opening the thing was half-done, the least I could do is finish the job and open it up completely to survey the damage.

I didn’t take any photos of the GPS at this point. Apparently, this failure mode is very common with this model of GPS. First thought for me was getting that battery out! I started by disconnecting it from the PCB.

When disconnecting the battery, cut ONE WIRE AT A TIME! A LiPo cell, even a damaged one like this one, can easily generate enough current to spot-weld a set of side-cutters shut. You can also trigger overheating of the already damaged cell leading to possible explosion of the cell.

I cut the wires close to the cell so I’d have some wire on the PCB side to work with should I need it. I didn’t like my chances of accidentally shorting something in the process, so I figured by leaving the leads out of the dud cell as short as possible, I’m less likely to have them short against each other. The battery is a fairly standard 3.7V 5Ah single-cell LiPo battery with built-in thermistor.

Next step was to remove the battery. This is tricky because the battery is stuck-down with double-sided tape, and the case makes accessing the underside difficult. You do NOT want to bend the cell, as that will cause fireworks too. Also, do NOT use metal implements to shift the battery, one puncture and there’ll be fireworks.

I found a semi-flexible plastic divider from a storage box worked: I just wedged it in under the cell, then moved it side-to-side to slowly “cut” the double-sided tape. You want something that is fairly flat, stiff, but with some “give” in it. Also, you’ll want some patience, this will be a slow process.

The old battery, removed from the GPS.

The code 2016ABAD didn’t escape my attention, maybe this cell’s credentials were dodgy from the start? Anyway. You can see how close I cut the output leads: I don’t want anyone ever using this cell again, and this is the easiest way to discourage it.

Next, I needed to see if the GPS was still alive. So, I stripped the ends of the old battery leads, spliced on a bit of extra cable, and hooked it to my bench supply set at 3.6V. A little current, and the GPS eventually woke up, moaning bitterly about a “low” battery, but the thing is, it worked, so it was worth proceeding.

For the replacement battery, I measured up the old one and found this 6Ah 3.7V cell to be a good match: it was almost the exact same size, featured a thermistor the same as the old one, and was about the same capacity. If your cell lacks a thermistor, you might have to bodge a 10k NTC in somehow. The pin-out of this particular cell though, pretty much exactly matched the pads on the PCB:

Existing battery connections.

Now, I could just take any suitable cell, cut the connector off (again, one wire at a time!) and solder it to the PCB, but I didn’t like my chances of accidentally bridging connections.

The cell I was buying had a JST-style connector, so I figured I’d ask about where I could get a mating socket. Turns out, Core Electronics have those too. The idea was I was going to mount this to the pads, then simply plug the cell in. I found I could simply bend the pins on this connector so that instead of exiting at a right-angle, they continued straight, making it very easy to simply “tack”-solder the connector to the old pads.

I just needed to ensure I got the connector around the right way!

Installing a battery connector.

Here’s a close-up without the annotations.

Close up of installed connector.

The install looked clean, so I tried firing it up. No dice, it didn’t respond to the power button. Okay, maybe the battery is a little low, let’s try applying power. The GPS went into a boot-loop, seemingly unable to complete a full boot sequence. I left it that way for a few hours, but it didn’t improve.

I measured the cell voltage at about 3.8V… I figured maybe the charging circuitry just needed a helping hand. I had bought a LiPo charger a few weeks before, a clearance item, but not used it until now. I made up a pig-tail to mate with the cell’s connector.

LiPo charger with pigtail

I then unplugged the battery from the GPS, and plugged it into the charger. The POWER LED illuminated the moment the battery was connected, and once I fished around for a 5V 1A PSU (had a 2A one in the junk box), plugged that in, the STATUS LED illuminated. Reading the data sheet, that meant it was charging.

I left it to run for a few hours then checked again: the STATUS LED had gone dark, indicating charging was complete. So I powered off the charger, unplugged the battery and put it back in the GPS. Held down the power button, and… SUCCESS! It booted straight up, and reported about 80% battery capacity (okay, whatever… at least it turned on).

Now, all was not completely rosy… in the opening and shutting of the GPS, I managed to tear off the coax from an antenna. I wasn’t sure whether it was for GPS reception or WiFi at this point: the antenna is a stick-on passive type, which is unusual for GPS, but a likely candidate for WiFi.

At this point, I had no GPS fix, so I did have concerns about that. I left the GPS for a bit, and over time, it eventually acquired a satellite fix, which might just be down to the GPS being powered down for over a month.

The old antenna and its broken coax feed.

Either way, I should try fixing that before I close it up. At first I was going to try soldering to the antenna, but that proved impossible, so I peeled off the old one, and set about replacing it outright. GPS reception didn’t seem too bad, but I did notice WiFi was down to about 1 bar of reception.

The coax is some of the tiniest cable I’ve seen, so I decided I’d just replace that too. The case design does not make accessing the solder pads easy, and I’m probably in need of a new tip on my soldering iron. Removing the old coax was easy enough… soldering to those pads was a pain.

My thought was to make a simple “dipole” antenna with some RG-195 coax, stripping off its jacket to reduce the outer diameter, putting a small length of heat-shrink to insulate part of the braid, then folding the braid back over the heat-shrink-insulated section to form a “sleeve” balun. I had thought I’d just tack the coax to the PCB, but this proved a nightmare, so I wound up using copper enamel wire to bridge from the pads to my antenna.

New antenna, connected to PCB using copper enamel wire
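
For anyone wanting a starting dimension for an antenna like this, each half of a dipole is roughly a quarter of a wavelength long. The figures below are a free-space estimate only and are not measurements from this build: they ignore the velocity factor of the coax and end effects, so a real element would come out somewhat shorter.

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second


def quarter_wave_mm(freq_hz: float) -> float:
    """Free-space quarter wavelength in millimetres (no velocity factor)."""
    return SPEED_OF_LIGHT / freq_hz / 4 * 1000


print(round(quarter_wave_mm(2.4e9), 1))   # 31.2
print(round(quarter_wave_mm(2.45e9), 1))  # 30.6
```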

This is not ideal, as the wire will be inductive, but thankfully it’s short, so I’ll get away with it. You’ll notice there’s a small plastic “poker” that presses the power button on the PCB: I had to cut away a little bit of plastic there to allow the coax to pass underneath. I then stuck the other end of the antenna down with electrical tape. VSWR at 2.4GHz is probably horrible, but it seems to work anyway:

WiFi, seemingly working okay with the homebrew antenna

At this point, the battery was in, the antenna was fixed, and it was time to button it all up. I used some electrical tape to help keep the battery in place, figuring that with the tight tolerances in there it likely wouldn’t move far anyway.

GPS took a little while to “wake up”, having been disconnected from power for nearly a month, but eventually it picked up the satellites and had a reasonably accurate fix. Given I was indoors, I’m happy with that.

GPS status screen, showing a ~5m accuracy on the fix from 7 satellites
OziExplorer (Android) showing my current position.

Cost of popularity: when search engines eat your Internet quota

Well, this month has been a funny one. When we moved to the NBN back in March, we went from having a 500GB a month quota, to a 100GB a month, with a link speed of 50Mbps.

That seemed, at the time, like a reasonable compromise, since much of the time, my typical usage has been around 60~70GB a month. There are no Netflix subscriptions here, but my father does hit YouTube rather hard, and I lately have been downloading music (legally) from time to time.

This year has also seen me working from home, and doing a lot of Slack and Zoom calls. Zoom in particular, is pricey quota-wise, since everyone insists on running webcams. Despite this, the extra Internet use has been manageable. Couple of times we got around 90GB, maybe sailing close to the 100GB, but never over. This is what it looked like last month:

November’s Internet quota usage

This month, that changed:

Internet usage this month

Now, the start of the month data got missed because of a glitch between collectd and the Internode quota monitoring script I have. Two of the spikes can be attributed to:

  • the arrival of a Windows 10-based laptop doing its out-of-box updates (~4GB)
  • my desktop doing its 3-monthly OS updates (~5GB)

That isn’t enough to account for why things have nearly doubled though. A few prospects were in my mind:

  • a web-based script going haywire in a browser (this has happened, and cost me dearly, before)
  • genuine local user Internet activity increases
  • website traffic increases
  • server or workstation compromise

Looking over the netflow data

Now, last time I had this happen, I did two things:

  • I set up collectd/influxdb/Grafana to be able to monitor my Internet usage and quota
  • I set up nfcapd on the border router to monitor my usage

This is pretty easy to set up in OpenBSD, and well worth doing.

I keep about 30 days’ worth of netflow data on the border router. So naturally, I haul that back to my workstation and run nfdump over it to see what jumps out.

Looking through the list of “flows”, one target identified was a development machine hosted at Vultr… checking the IP address, revealed it was one of the WideSky test instances my workplace uses, about 5GB of HTTP requests and about 4GB of VPN traffic — admittedly the couple of WideSky hubs I have here have the logging settings cranked high.

That though doesn’t explain it. The bulk of the traffic was scattered amongst a number of hosts. I didn’t see it until I tried aggregating it by /16 subnet:

RC=0 stuartl@rikishi /tmp $ nfdump -R /tmp/nfcapd -A srcip,dstip -o long6 -O bytes 'net 114.119.0.0/16'  
Date first seen          Duration Proto                             Src IP Addr:Port                                 Dst IP Addr:Port     Flags Tos  Packets    Bytes Flows
2020-11-27 23:11:30.000 1630599.000 0                             150.101.176.226:0     ->                         114.119.146.185:0     ........   0    4.7 M    6.8 G  2535
2020-11-22 13:02:41.000 2099541.000 0                             150.101.176.226:0     ->                         114.119.133.234:0     ........   0    4.3 M    6.1 G  2376
2020-11-18 14:38:42.000 2439079.000 0                             150.101.176.226:0     ->                         114.119.140.107:0     ........   0    3.8 M    5.4 G  2418
2020-11-20 10:43:58.000 2280070.000 0                             150.101.176.226:0     ->                          114.119.141.52:0     ........   0    3.7 M    5.3 G  2421
2020-11-21 22:34:35.000 2151244.000 0                             150.101.176.226:0     ->                         114.119.159.109:0     ........   0    3.4 M    4.9 G  2446
2020-11-24 00:11:52.000 1972657.000 0                             150.101.176.226:0     ->                          114.119.136.13:0     ........   0    3.4 M    4.8 G  2399
2020-11-25 04:24:32.000 1870854.000 0                             150.101.176.226:0     ->                         114.119.136.215:0     ........   0    3.3 M    4.8 G  2473
2020-11-24 15:49:55.000 1916848.000 0                             150.101.176.226:0     ->                           114.119.151.0:0     ........   0    3.0 M    4.4 G  2435
2020-11-27 20:15:43.000 1641316.000 0                             150.101.176.226:0     ->                         114.119.129.181:0     ........   0    2.6 M    3.7 G  2426
2020-11-27 21:38:37.000 1636635.000 0                             150.101.176.226:0     ->                          114.119.159.16:0     ........   0    2.5 M    3.6 G  2419
2020-11-27 23:11:30.000 1630599.000 0                             114.119.146.185:0     ->                         150.101.176.226:0     ........   0    4.1 M  175.9 M  2535
…
2020-11-19 22:02:04.000     0.000 0                             150.101.176.226:0     ->                         114.119.138.111:0     ........   0        3      132     1
2020-11-25 03:37:11.000     0.000 0                             150.101.176.226:0     ->                          114.119.152.27:0     ........   0        3      132     1
2020-12-06 19:59:49.000     0.000 0                             150.101.176.226:0     ->                         114.119.151.153:0     ........   0        3      132     1
2020-11-22 08:23:11.000     0.000 0                             150.101.176.226:0     ->                          114.119.130.23:0     ........   0        3      132     1
2020-11-25 15:43:47.000     0.000 0                             150.101.176.226:0     ->                         114.119.128.219:0     ........   0        3      132     1
2020-11-24 09:05:13.000     0.000 0                             150.101.176.226:0     ->                          114.119.140.85:0     ........   0        3      132     1
Summary: total flows: 56059, total bytes: 51.7 G, total packets: 65.0 M, avg bps: 150213, avg pps: 23, avg bpp: 794
Time window: 2020-11-13 11:01:52 - 2020-12-16 20:19:41
Total flows processed: 39077053, Blocks skipped: 0, Bytes read: 2698309352
Sys: 3.744s flows/second: 10436251.9 Wall: 15.108s flows/second: 2586482.6 

51.7GB in a month!!! Drilling further, I noted it was mostly targeted at TCP ports 80 and 443, and UDP port 53. Web traffic, in other words. Reverse look-up on a randomly selected IP showed the reverse pointer petalbot-xxx-xxx-xxx-xxx.aspiegel.com, and indeed, in server logs for various sites I host, I saw PetalBot in the user agent.
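
The grouping that made this traffic stand out can be sketched in Python with the standard `ipaddress` module. This assumes the flows have already been parsed into (destination address, byte count) pairs; parsing nfdump's output is not shown. The byte counts are approximations of the top three rows above, and the last entry is a made-up address added for contrast.

```python
import ipaddress
from collections import Counter


def bytes_by_subnet(flows, prefix_len=16):
    """Sum byte counts per destination subnet of the given prefix length."""
    totals = Counter()
    for dst_ip, n_bytes in flows:
        # strict=False masks off the host bits, e.g. 114.119.146.185/16
        # becomes the network 114.119.0.0/16.
        net = ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False)
        totals[str(net)] += n_bytes
    return totals


flows = [
    ("114.119.146.185", 6_800_000_000),
    ("114.119.133.234", 6_100_000_000),
    ("114.119.140.107", 5_400_000_000),
    ("203.0.113.10", 1_000_000),  # made-up entry, documentation address range
]

totals = bytes_by_subnet(flows)
for net, total in totals.most_common():
    print(net, total)
```

Individually, none of those hosts looks alarming next to ordinary traffic; it is only once they are summed under one prefix that the total jumps to the top of the list.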

Plucking some petals off PetalBot

So, I needed to put the brakes on this somehow. I’m fine with them indexing my site, just they should have some consideration and restraint about how quickly they do so.

Thus, I amended pf.conf:

# Rate-limited "friends"
ratelimit_dst4="{ 114.119.0.0/16 }"
#ratelimit_dst6="{ }"

# Traffic shaping queues
queue root on $external  bandwidth 25M max 25M
queue slow parent root   bandwidth 256K max 512K
queue bulk parent root   bandwidth 25M default

# …

# Rate-limit certain targets
pass out on egress proto { tcp, udp, icmp } from any to $ratelimit_dst4 modulate state (pflow) set queue slow
#pass out on egress proto { tcp, udp, icmp6 } from any to $ratelimit_dst6 modulate state (pflow) set queue slow

So, the first queue line defines the root queue on my external interface, and sets the upload bandwidth to 25Mbps (next month, I will be dropping my speed to 25Mbps in favour of an “unlimited” quota).

Then, I define a queue which is restricted to 256kbps (peak 512kbps), and define all traffic going to a specific list of networks, to use that queue. PetalBot should now see a mere 512kbps at most from this end, which should severely crimp how quickly it can guzzle my quota, whilst still permitting it to index my site.
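
As a rough worked figure, this is the most data such a queue could pass in a day if it were saturated around the clock. It assumes pf's `K` suffix means 1000 bits per second and ignores protocol overhead, so treat the numbers as ceilings rather than predictions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_bytes_per_day(bits_per_second: int) -> int:
    """Upper bound on bytes moved in a day at a constant bit rate."""
    return bits_per_second * SECONDS_PER_DAY // 8


for label, rate in (("bandwidth 256K", 256_000), ("max 512K", 512_000)):
    gigabytes = max_bytes_per_day(rate) / 1e9
    print(f"{label}: {gigabytes:.2f} GB/day")
```

That works out to about 2.76GB/day at 256K and 5.53GB/day at 512K. A queue pinned at the 512K ceiling for 30 days would still move roughly 166GB, so these are worst-case bounds on what the crawler can take.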

Yesterday, PetalBot chewed through 8GB… let’s see what it does tomorrow.

Thoughts on a forward-erasure-coded optical disc filesystem

So, in the last 12 months or so, I’ve grown my music collection in a big way. Basically over the Christmas – New Year break, I was stuck at home, coughing and spluttering due to the bushfire smoke in the area (and yes, I realise it was nowhere near as bad in Brisbane as it was in other parts of the country).

I spent a lot of time listening to the radio, and one of the local radio stations was doing a “25 years in 25 days” feature, covering many iconic tracks from the latter part of last decade. Now, I’ve always been a big music listener. Admittedly, I’m very much a music luddite, with the vast majority of my music spanning 1965~1995… with some spill over as far back as 1955 and going as forward as 2005 (maybe slightly further).

Trouble is, I’m not overly familiar with the names, and the moment I walk into a music shop, I’m like the hungry patron walking into a food court: I want to eat something, but what? My mind goes blank as it is bombarded with all kinds of possibilities.

So when this count-down appeared on the radio, naturally, I found myself looking up the play list, and I came away with a long “shopping list” of songs I’d look for. Since then, a decent amount has been obtained as CDs from the likes of Amazon and Sanity… however, for some songs, I found it was easiest to obtain them as a digital download in FLAC format.

Now, for me, my music is a long-term investment. An investment that transcends changes in media formats. I do agree with ensuring that the “creators” of these works are suitably compensated for their efforts, but I do not agree with paying for the same thing multiple times.

A few people have had to perform in a studio (or on stage), someone’s had to collect the recordings, mix them, work with the creators to assemble those into an album, work with other creative people to come up with cover art, marketing… all that costs money, and I’m happy to contribute to that. The rest is simply an act of duplication: and yes, that has a cost, but it’s minimal and highly automated compared to the process of creating the initial work in the first place.

To me, the physical media represents one “license”, to perform that work, in private, on one device. Even if I make a few million copies myself, so long as I only play one of those copies at a time, I am keeping in the spirit of that license.

Thus, I work on the principle of keeping an “archival” copy, from which I can derive working copies that get day-to-day playback. The day-to-day copy will be in some lossy format for convenience.

A decade ago that was MP3, but due to licensing issues, that became awkward, so I switched over to Ogg/Vorbis, which also reduced the storage requirements by 40% whilst not having much audible impact on the sound quality (if anything, it improved). Since I also had to ditch the illegally downloaded MP3s in the process, that also had a “cleaning” effect: I insisted from then on that I have a “license” for each song, whether that be wax cylinder, tape reel, 8-track, cassette tape, vinyl record, CD, whatever.

This year was the first time I returned to music downloads, this time with legally purchased FLAC files. This leads to an interesting problem: how do you store these files in a manner that will last?

Audio archiving and CDs

I’m far from the first person with this problem, and the problem isn’t specific to audio. The archiving business is big money, and sometimes it does go wrong, whether it be old media being re-purposed (e.g. old tapes of “The Goon Show” being re-recorded with other material by the BBC), destruction (e.g. Universal Studios fire), or just old-fashioned media degradation.

The procedure for film-based media (whether it be optical film, or magnetic media) usually involves temperature and humidity control, along with periodic inspection. Time-consuming, expensive, error prone.

CDs are reasonably resilient, particularly proper audio CDs made to the Red Book audio disc standard. In the CD-DA standard, uncompressed PCM audio is Reed-Solomon encoded to achieve forward error correction of the PCM data. Thus, if a minor surface defect develops on the media, there is hopefully enough data intact to recover the audio samples and play on as if nothing had happened.

The fact that one can take a disc purchased decades ago, and still play it, is testament to this design feature.

I’m not sure what features exist in DVDs along the same lines. While there is the “video object” container format, the purpose of this seems to be more about copyright protection than about resiliency of the content.

Much of the above applies to pressed media. Recordable media (CD-Rs) sadly isn’t as resilient. In particular, the quality of blanks varies, with some able to withstand years of abuse, and others degrading after 18 months. Notably, the dye fades, and so you start to experience data loss beginning with the edge of the disc.

Pressed discs work great for stuff I’ve purchased on CD. Vinyl records, if looked after, will also age well, although it’d be nice to have a CD back-up in case my record player packs it in. However, this presents a problem for my digital downloads.

At the moment, my strategy is to download the files to a directory, save a copy of the email receipt with them, place my GPG public key along-side, take SHA-256 hashes of all of the files, then digitally sign the hashes. I then place a copy on an old 1TB HDD, and burn a copy to CD-R or DVD-R. This will get me by for the next few years, but I’ve been “burned” by recordable media failing, and HDDs are not infallible either.
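As a rough illustration of the hashing step, here is a minimal Python sketch. It assumes the downloads sit in one directory tree; the function names and the `SHA256SUMS` file name are made up for this example, and it is not necessarily how I actually generate my hashes. Signing the resulting manifest with GPG remains a separate step.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, manifest_name: str = "SHA256SUMS") -> str:
    """Hash every regular file under root, skipping the manifest itself."""
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.name == manifest_name:  # hypothetical manifest file name
            continue
        relative = path.relative_to(root).as_posix()
        # "<digest>  <name>" is the text-mode layout used by sha256sum
        lines.append(f"{sha256_of(path)}  {relative}")
    return "\n".join(lines) + "\n"


def verify_manifest(root: Path, manifest_text: str) -> list:
    """Return the relative names that are missing or no longer match."""
    failed = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        expected, name = line.split("  ", 1)
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            failed.append(name)
    return failed
```

Running `verify_manifest` against each copy (HDD, CD-R, DVD-R) every so often would at least tell me which files have rotted, even though it cannot repair them.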

Getting discs pressed only makes sense when you need thousands of copies. I just need one or two. So I need some media that will last the distance, but can be produced in small quantities at home from readily available blanks.

Archiving formats

So, there are a few options out there for archival storage. Let’s consider a few:

Magnetic tape

Professional outfits seem to work on tape storage. Magnetic media, with all the overheads that implies. The newest drive in the house is a DDS-2 DAT drive, the media for which has not been produced in years, so that’s a lame duck. LTO is the new kid on the block, and LTO-6 drives are pricey!

Magneto-Optical

MO drives are another option from the past… we do have a 5¼” SCSI MO drive sitting in the cupboard, which takes 2GB cartridges, but where do you get the media from? Moreover, what do I do when this unit croaks (if it hasn’t already)?

Flash

Flash media sounds tempting, but then one must remember how flash works. It’s a capacitor on the gate of a MOSFET, storing a charge. The dielectric material around this capacitor has a finite resistance, which will cause “leakage” of the charge, meaning over time, your data “rots” much like it does on magnetic media. No one is quite sure what the retention truly is. NOR flash is better for endurance than NAND, but if it’s a recent device with more than about 32MB of storage, it’ll likely be NAND.

PROM

I did consider whether PROMs could be used for this, the idea being you’d work out what you wanted to store, burn a PROM with the data as ISO9660, then package it up with a small MCU that presents it as CD-ROM. The concept could work since it worked great for game consoles from the 80s. In practice they don’t make PROMs big enough. Best I can do is about 1 floppy’s worth: maybe 8 seconds of audio.
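The “8 seconds” figure can be sanity-checked with a quick calculation. This sketch assumes “1 floppy” means a 3.5″ “1.44MB” disk and that the audio is uncompressed CD-quality PCM (44.1kHz, 16-bit, stereo); it ignores any ISO9660 overhead.

```python
# Assumption: a 3.5" "1.44MB" floppy, i.e. 2880 sectors of 512 bytes
FLOPPY_BYTES = 2880 * 512

# Assumption: CD-quality PCM (44.1 kHz, 2 channels, 16-bit samples)
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2
BYTES_PER_SAMPLE = 2

bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
seconds = FLOPPY_BYTES / bytes_per_second

print(f"{bytes_per_second} bytes/s -> {seconds:.2f} s of audio per floppy-sized PROM")
```

That works out to a little over 8 seconds, so a single three-minute song would need a couple of dozen such devices.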

Hard drives

HDDs are an option, and for now that’s half my present interim solution. I have a 1TB drive formatted UDF which I store my downloads on. The drive is one of the old object storage drives from the server cluster after I upgraded to 2TB drives. So not a long-term solution. I am presently also recovering data from an old 500GB drive (PATA!), and observing what age does to these disks when they’re not exercised frequently. In short, I can’t rely on this alone.

CDs, DVDs and Bluray

So, we’re back to optical media. All three of these are available as blank recordable media, and even Bluray drives can read CDs. (Unlike LTO, where an LTO-$X drive might be backward compatible with LTO-$(X-2) but no further.)

There are blanks out there that are designed for archival use, notably the M-Disc DVD media, which are allegedly capable of lasting 1000 years.

I don’t plan to wait that long to see if their claims stack up.

All of these formats normally use the same file systems, either ISO-9660 or UDF. Neither of these file systems offers any kind of forward error correction of data, so if the dye fades, or the disc gets scratched, you can potentially lose data.

Right now, my other mechanism is to use CDs and DVDs, burned with the same material I put on the aforementioned 1TB HDD. The optical media is formatted ISO-9660 with Joliet and Rock-Ridge extensions. It works for now, but I know from hard experience that CD-Rs and DVD-Rs aren’t forever. Question is, can they be improved?

File system thoughts

Obviously, genuinely better quality media will help in this archiving endeavour, but the question is: can I improve the odds? Can I sacrifice some storage capacity to achieve data resilience?

Audio CDs, as I mentioned, use Reed-Solomon encoding. Specifically, Cross-Interleaved Reed-Solomon encoding. ISO-9660 is a file system that supports extensions on the base standard.

I mentioned two before, Rock-Ridge and Joliet. On top of Rock-Ridge, there’s also zisofs, which adds transparent decompression to a Rock-Ridge file system. What if I could make an RS-encoded copy of each file’s blocks, and place those blocks around the disc surface, so that if the original file was unreadable, we could turn to the forward-error corrected copy?

There is some precedent in such a proposal. In Usenet, the “parchive” format was popularised as a way of adding FEC to files distributed on Usenet. That at least has the concept of what I’m wishing to achieve.
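To make the “trade capacity for resilience” idea concrete, here is a toy Python sketch using plain XOR parity: one extra block per group lets you rebuild any single lost block in that group. This is far weaker than Reed-Solomon or what parchive does, and the 2048-byte block size (one CD-ROM data sector) and the function names are just assumptions for the example.

```python
from functools import reduce

BLOCK_SIZE = 2048  # assumption: one CD-ROM data sector per block


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut data into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for offset in range(0, len(data), block_size):
        blocks.append(data[offset:offset + block_size].ljust(block_size, b"\0"))
    return blocks


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_block(blocks: list, parity: bytes) -> list:
    """Rebuild a single missing block (marked None) from the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one lost block per group")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        # XOR of all surviving blocks plus the parity yields the lost block
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


# Usage: three data blocks cost one extra parity block (33% overhead)
data = bytes(range(256)) * 20
blocks = split_blocks(data)
parity = xor_blocks(blocks)
damaged = [blocks[0], None, blocks[2]]  # pretend the middle block is unreadable
assert recover_block(damaged, parity) == blocks
```

A real scheme would need to survive several lost blocks per group and know which blocks are bad, which is where proper erasure codes come in.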

The other area of research is how can I make the ISO-9660 filesystem metadata more resilient. No good the files surviving if the filesystem metadata that records where they are is dead.

Video DVDs are often dual UDF/ISO-9660 file systems, the so-called “UDF Bridge” format. Thus, it must be possible for a foreign file system to live amongst the blocks of an ISO-9660 file system. Conceptually, if we could take a copy of the ISO-9660 filesystem metadata, FEC-encode those blocks, and map them around the disc, we can make the file system resilient too.

FEC algorithms are another consideration. RS is a tempting prospect for two reasons:

zfec used in Tahoe-LAFS is another option, as is Golay, and many others. They’ll need to be assessed on their merits.

Anyway, there are some ideas… I’ll ponder further details in another post.

My position on China

The last few years have been a testing time for world politics. Recent events have seen much sabre-rattling, but really, none of this has suddenly “appeared”… it’s been slowly bubbling away for some time now.

Economic tunnel-vision

For a long time now, much of our world has revolved around the unit of currency. Call it the US dollar, the Australian dollar, the British Pound, Chinese Yuan, whatever… for the past 50 years or so, we have been “seduced” by two concepts which developed in the latter part of last century:

  • economies of scale
  • just-in-time production

The concepts are on the surface, fairly simple.

Just-in-time production forgoes having a large stock and inventory of components to feed your supply-lines in favour of ordering just enough of what you need to fulfil the orders you have active at the present moment. So long as nothing disrupts your supply lines, all is rosy. You might keep a small inventory just as a buffer, but in general, that might only last a day or so.

Economies of Scale was the other concept that really took hold last century, and was the reason why smaller workshops got shut down in favour of making lots of a widget in one central place, and shipping it out to everywhere from that one point.

Again, works great, until something happens in that place where you are doing the manufacturing, or something happens that hampers your ability to shift parts or product around.

The latter in particular took a dark turn when instead of making things close to where the demand was, “we” instead outsourced it, shifting the production to places where the labour was cheapest. As a consequence, many countries are forced to import as they no longer have the expertise or capabilities to manufacture products locally.

Both these concepts were ideas conceived by people wearing rose-coloured glasses: they emphasise cost-cutting over contingency plans on the grounds that disruptions to manufacturing and supplies are unlikely events.

The rise of “the world’s factory”

Over time, companies pushed this concept of centralised manufacturing to extremes, whereby they were largely making things in one place. Apple for instance, were leaning heavily on Foxconn in China for the manufacture of their hardware.

None of this is without precedent: when I was growing up, Nike used to cop a lot of flak for the exploitation of workers in various third-world localities.

That said, history has often had something to say about putting all of one’s eggs in a single basket. There’s mostly nothing wrong with having products made in China, the problem is having things made exclusively in China.

At first, products made in China were seen as dodgy knock-offs of things made elsewhere. The same was said of things made in Japan in the 1950s and 1960s… but then Japan improved their systems and processes, and with it, the products they made improved too. In the case of China, initially things were done “cheaply”, which gave rise to a perception that things made in China were all “dodgy”.

Over time, processes again improved, and now there are some great examples of products and services, which are designed and built by people based in China. Stuff that works, and is reliable. There are some very smart people over there who are great at their craft.

That said, manufacturing all revolves around the dollar, and so when it came to cutting costs, something had to give.

Trouble in Xinjiang

With this global demand for manufacturing, China had a problem trying to find people to do the mundane jobs. Quality had to be maintained, and so some organisations over there tried to solve the cost problem a different way: cheaper labour.

Now, it’s well known that China’s government is not a government that particularly values individualism. This is evident in the manner in which the Tiananmen Square protests were so violently silenced.

The Uighur Muslim community is one such group that has been in their sights for a long time. This is a group that has been clamped-down on for more than 6 years. Over time, a narrative was developed that tried to cast this group as being “trouble makers” in need of “re-education”.

Over time, members of this community found themselves co-opted into being the cogs in this “global” factory. At first, such actions were hidden from view, including from the direct customers of these factories.

COVID-19 makes its entrance

So, over time, global manufacturing has shifted to China, in some cases involving forced labour in the effort to drive the cost down and make the end product seem more competitive.

Many of these problems have been hidden from the outside world, but for now, whilst we’re starting to learn of these issues, we still do the majority of our manufacturing in one country.

Then, about this time last year, a bizarre respiratory condition started showing up in Wuhan. Nobody knew much about this condition, other than that it was highly contagious.

Even today, we’re still unsure exactly how it came about, but the smart money is that it jumped from some reservoir host such as a bat, via some intermediate host, to humans. Bats in particular are major carriers of all kinds of coronaviruses, and as such, are a highly probable suspect in this.

I do not believe it is synthetic in origin.

COVID-19 threw a major spanner in the works for everybody. Community event calendars looked like an utter train-wreck with cancellations and deferrals all over the place. For me, some of the casualties I was looking forward to include the 2020 Yarraman to Wulkuraka bike ride and numerous endurance horse-riding events (where I assist in operations).

It also threw a major spanner in the works for just-in-time manufacturing, since freight was running inefficiently due to a lack of flights, and there were rolling shut-downs across China as COVID-19 did its worst.

Some businesses have already closed for good.

Knee-jerk reactions

Numerous countries, notably ours, called for an investigation into the origins and initial handling of the COVID-19 pandemic.

I for one, think such an investigation should go ahead. We owe it to the people who have lost their lives, and those who have lost their livelihoods, to this condition, that we try and find out what went wrong. It’s not about blaming people.

We’re not interested in who made the mistakes, it’s more a question of what the mistakes were. This event will repeat itself again, and again, until such time as we get to understand what “we” (globally) did wrong.

China’s government does not seem to have seen it this way. It’s as if they see it as a witch-hunt. As a result, we as a nation seem to have been singled out, with heavy tariffs placed on goods that we export to China.

Notably absent in this trade-war is iron ore, partially because the other major producer of iron ore, Brazil, has been left a complete basket-case by this pandemic, and Australia was a major supplier of iron ore long before COVID-19 reared its ugly head.

A plan “B”

Right now, things are escalating in this diplomatic row. Whilst the politicians are trying to resolve this with as little fuss as possible, I think China’s position is becoming very clear. They’ve told the world “F You” in no uncertain terms.

We are most definitely dealing with a rebellious and violent teenager, more than capable of smashing holes in a few walls and inflicting grievous bodily harm.

I think it would be wonderful if things could be reset back to the way they were, but at the same time, I think that really, we may need to realise that “peak China” days may be behind us now.

I know there are organisations that have built their entire business model around exports to China, and that literally overnight, conditions have changed in ways that now greatly risk the viability of those businesses.

They are geared around the huge appetite that this country’s people have previously demonstrated for our goods and services. I think now, more than ever, we should be looking around. Where else can I outsource to? Where else can I sell to? How can we make do with less demand?

If China does come around, then sure, maybe a certain portion of your market can be serviced there. I think it folly, though, to be reliant on one single region for your supply or demand.

Two or three alternatives may not totally balance things, but having at least a partial income is better than none at all!

The Australian coat-of-arms features the emu and the kangaroo. These animals are quite different from one another, but they share a few common attributes. Yes, some might say they’re two of the less brainy members of the animal kingdom, but also, they are not known for going “backwards”.

Whilst we momentarily look over our shoulder at our past, I think it important that we keep moving “forwards”.

Learning from our mistakes

I think in all of this, it’s fair to say none of us are perfect. Yes, our SAS troops have been implicated in some truly horrendous war crimes. Not all of them, thankfully, but enough to cast a cloud over the military in general. Some of the Army’s chopper pilots are not exactly famous for fast reporting of fires either.

We’re investigating this, and yes, some of the top brass are ducking for cover, as it’s likely some know more than they’ve been letting on. An analysis of what went wrong will be done, and we, collectively, will learn from those mistakes.

In the case of COVID-19, for the first few months of 2020, we were told “No, we don’t need help, we’re fine, we’ve got this!”. Taiwan saw this, and immediately sprang to action, as did many other nations close to China. They’ve seen similar things happen before (SARS, MERS), and so maybe their scepticism shielded them somewhat.

I think one of the biggest lessons of all is to realise that asking for help is not a sign of weakness, it’s a sign of maturity. We’re on this planet, together. We are in this mess, together. We need to work this all out, together.

What am I doing?

So, based on the above… where do I sit? Not on the fence.

I myself have started seriously considering my suppliers.

In particular, I have practically destroyed my credentials for AliExpress, having bought the last few things I’m likely to want from there. I’ve ordered printed circuit boards from a supplier in Hong Kong.

Last year, I had ordered a few PCBs from their sister factory in mainland China, as I was concerned about the civil unrest in Hong Kong causing delays (and on that, I do think the people there have a valid point to raise), but I had originally intended to move things back once things settled down. However, with China being so adamant that Hong Kong is “theirs”, I’m forced to treat Hong Kong the same as mainland China.

As such, I’ll probably be looking to the US, Europe or India to evaluate options there. I might still use the old Hong Kong supplier, but they won’t be the sole supplier.

Where possible, I’ll probably be paying more attention to country-of-origin for products I buy from now on, and preferring local options where possible. This won’t always be the case, and some things will have to be imported from China, but I aim to diversify my sources.

I may start making things myself. Yes, it’s time-consuming and expensive, but ultimately, this means I become the master of my own destiny, so it’s likely a worthwhile journey to undertake.

Above all, I am not out to discriminate against the people of China. I may not always agree with some of their customs, but that does not give one the right to indulge in racism. My only real complaint with China at this time, is the conduct of its government.

Maybe with time, diplomatic relations might turn this around, and we may see a more co-operative Chinese government. Only time will tell on that.

In the meantime, I plan to not reward their government for what I consider bad behaviour.