Oct 22 2017

So I’ve had the solar panels up for a month now… and so far, we’ve had a run of very overcast or wet days.

Figures… and we thought this was the “sunshine state”?

I still haven’t done the automatic switching, so right now the mains power supply powers the relay that switches solar to mains.  Thus the only time my cluster runs from solar is when either I switch off the mains power supply manually, or if there’s a power interruption.

The latter has not yet happened… mains electricity supply here is pretty good in this part of Brisbane, the only time I recall losing it for an extended period of time was back in 2008, and that was pretty exceptional circumstances that caused it.

That said, the political football of energy costs is being kicked around, and you can bet they’ll screw something up, even if for now we are better off this side of the Tweed river.

A few weeks back, with predictions of a sunny day, I tried switching off the mains PSU in the early morning and letting the system run off the solar.  I don’t have any battery voltage logging or current logging as yet, but the system went fine during the day.  That evening, I turned the mains back on… but the charger, a Redarc BCDC1225, seemingly didn’t get that memo.  It merrily let both batteries drain out completely.

The IPMI BMCs complained bitterly about the sinking 12V rail at about 2AM when I was sound asleep.  Luckily, I was due to get up at 4AM that day.  When I tried checking a few things on the Internet, I first noticed I didn’t have a link to the Internet.  I looked up at the switch in my room and saw the link LED for the cluster was out.

At that point, some choice words were quietly muttered, and I wandered downstairs with multimeter in hand to investigate.  The batteries had been drained to 4.5V!!!

I immediately performed some load-shedding (ripped out all the nodes’ power leads) and power-cycled the mains PSU.  That woke the charger up from its slumber, and after about 30 seconds, there was enough power to bring the two Ethernet switches in the rack online.  I let the voltage rise a little more, then gradually started re-connecting power to the nodes, each one coming up as it was plugged in.

The virtual machine instances I had running outside OpenNebula came up just fine without any interaction from me, but  it seems OpenNebula didn’t see it fit to re-start the VMs it was responsible for.  Not sure if that is a misconfiguration, or if I need to look at an alternate solution.

Truth be told, I’m not a fan of libvirt either… overly complicated for starting QEMU VMs.  I might DIY a solution here as there’s lots of things that QEMU can do which libvirt ignores or makes more difficult than it should be.

Anyway… since that fateful night, I have on two occasions run the cluster from solar without incident.  On the off-chance though, I have an alternate charger which I might install at some point.  The downside is it doesn’t boost the 12V input like the other one, so I’d be back to using that Xantrex charger to charge from mains power.

Already, I’m thinking about the criteria for selecting a power source.  It would appear there are a few approaches I can take, I can either purely look at the voltages seen at the solar input and on the battery, or I can look at current flow.

Voltage wise, I tried measuring the solar panel output whilst running the cluster today.  In broad daylight, I get 19V off the panels, and at dusk it’s about 16V.

Judging from that, having the solar “turn on” at 18V and “turn off” at 15V seems logical.  Using the comparator approach, I’d need to set a reference of 16.5V and tweak the hysteresis to give me a ±1.5V swing (a 3V window in total).
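As a rough illustration of the comparator behaviour, here is a minimal Python sketch of a switch with hysteresis.  The 18V and 15V thresholds are the ones from this post; the function name, class name and sample readings are made up for the example, and the real thing would be an analogue comparator rather than code.

```python
def hysteresis_params(turn_on, turn_off):
    """Return (reference, half_width) for a comparator with hysteresis."""
    reference = (turn_on + turn_off) / 2.0
    half_width = (turn_on - turn_off) / 2.0
    return reference, half_width


class SolarSwitch:
    """Software model of a comparator with hysteresis (hypothetical name)."""

    def __init__(self, turn_on=18.0, turn_off=15.0):
        self.turn_on = turn_on
        self.turn_off = turn_off
        self.on_solar = False  # start out on mains

    def update(self, solar_volts):
        """Feed in one voltage reading; return True if running on solar."""
        if solar_volts >= self.turn_on:
            self.on_solar = True
        elif solar_volts <= self.turn_off:
            self.on_solar = False
        # Between the two thresholds the previous state is kept.
        return self.on_solar


if __name__ == "__main__":
    print(hysteresis_params(18.0, 15.0))
    switch = SolarSwitch()
    for volts in (14.0, 17.0, 19.0, 16.0, 14.5):  # made-up readings
        print(volts, switch.update(volts))
```

The point of the dead band is that a reading of 16V at dusk neither switches the solar on nor off; it just keeps whatever state the switch was already in.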

However, this ignores how much energy is actually being produced from solar in relation to how much is being consumed.  It is possible for a day to start off sunny, then for the weather to cloud over.  Solar voltage in that case might be sitting at the 16V mentioned.

If the current is too low though, the cluster will drain more power out than is going in, and this will result in the exact conditions I had a few weeks ago: a flat battery bank.  Thus I’m thinking of incorporating current shunts both on the “input” to the battery bank, and to the “output”.  If output is greater than input, we need mains power.

There’s plenty of literature about interfacing to current shunts.  I’ll have to do some research, but immediately I’m thinking an op-amp running from the battery configured as a non-inverting DC gain block with the inputs going to either side of the current shunt.
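To get a feel for the numbers involved, here is a small Python sketch of the arithmetic.  The shunt resistance, full-scale current and output voltage in the example are placeholders I picked for illustration, not measurements from my set-up; the only things assumed are Ohm’s law and the textbook gain of a non-inverting op-amp stage.

```python
def shunt_voltage(current_amps, shunt_ohms):
    """Voltage dropped across the shunt (Ohm's law: V = I * R)."""
    return current_amps * shunt_ohms


def required_gain(full_scale_amps, shunt_ohms, full_scale_out_volts):
    """Gain needed so full-scale current maps to the wanted output voltage."""
    return full_scale_out_volts / shunt_voltage(full_scale_amps, shunt_ohms)


def noninverting_gain(r_feedback, r_ground):
    """Gain of a non-inverting op-amp stage: 1 + Rf / Rg."""
    return 1.0 + r_feedback / r_ground


if __name__ == "__main__":
    # Hypothetical values: 1 milliohm shunt, 25 A full scale, 2.5 V output.
    print(shunt_voltage(25.0, 0.001))
    print(required_gain(25.0, 0.001, 2.5))
    # Hypothetical resistor pair: 99k feedback, 1k to ground.
    print(noninverting_gain(99_000.0, 1_000.0))
```

With shunt drops this small (tens of millivolts), op-amp input offset becomes significant, which is something the research will need to cover.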

Combining the approaches is attractive.  So turn on when solar exceeds 18V, turn off when battery output current exceeds battery input current.  A dual op-amp, a dual comparator, two current shunts, an R-S flip-flop and a P-MOSFET for switching the relay, and no hysteresis calculations needed.
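The combined logic can be modelled in a few lines of Python.  This is only a sketch of the idea: the class name and sample readings are invented, and letting “reset” win when both conditions are true at once is my assumption here, not something I’ve settled on in hardware.

```python
class SourceSelector:
    """Models the set/reset latch: True means run from solar."""

    def __init__(self, solar_on_volts=18.0):
        self.solar_on_volts = solar_on_volts
        self.use_solar = False  # start out on mains

    def update(self, solar_volts, amps_in, amps_out):
        """Apply one set of readings and return the latch state."""
        set_latch = solar_volts > self.solar_on_volts
        reset_latch = amps_out > amps_in
        if reset_latch:
            # Assumption: reset dominates, so a draining battery
            # always forces the system back onto mains.
            self.use_solar = False
        elif set_latch:
            self.use_solar = True
        # Neither condition true: the latch holds its previous state.
        return self.use_solar


if __name__ == "__main__":
    selector = SourceSelector()
    readings = [  # (solar volts, amps in, amps out), all made up
        (19.0, 10.0, 5.0),
        (16.0, 6.0, 5.0),
        (16.0, 3.0, 5.0),
        (17.0, 6.0, 5.0),
    ]
    for reading in readings:
        print(reading, selector.update(*reading))
```

The latch provides the memory that the hysteresis resistor would otherwise have to supply, which is why no hysteresis calculation is needed.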

Aug 20 2017

OpenNebula is running now… I ended up re-loading my VM with Ubuntu Linux and throwing OpenNebula on that.  That works… and I can debug the issue with Gentoo later.

I still have to figure out corosync/heartbeat for two VMs, the one running OpenNebula, and the core router.  For now, the VMs are only set up to run on one node, but I can configure them on the other too… it’s then a matter of configuring libvirt to not start the instances at boot, and setting up the Linux-HA tools to figure out which node gets to fire up which VM.

The VM hosts are still running Gentoo however, and so far I’ve managed to get them to behave with OpenNebula.  A big part was disabling the authentication in libvirt, otherwise polkit generally made a mess of things from OpenNebula’s point of view.

That, and firewalld had to be told to open up ports for VNC/spice… I allocated 5900-6900… I doubt I’ll have that many VMs.

Last weekend I replaced the border router… previously this was a function of my aging web server, but now I have an ex-RAAF-base Advantech UNO-1150G industrial PC which is performing the routing function.  I tried to set it up with Gentoo, and while it worked, I found it wasn’t particularly stable due to limited memory (it only has 256MB RAM).  In the end, I managed to get OpenBSD 6.1/i386 running sweetly, so for now, it’s staying that way.

While the AMD Geode LX800 is no speed demon, a nice feature of this machine is it’s happy with any voltage between 9 and 32V.

The border router was also given the responsibility of managing the domain: I did this by installing ISC BIND9 from ports and copying across the config from Linux.  This seemed to be working, and so I left it.  Big mistake, turns out bind9 didn’t think it was authoritative, and so refused to handle AXFRs with my slaves.

I was using two different slave DNS providers, puck.nether.net and Roller Network, both at the time of subscription being freebies.  Turns out, when your DNS goes offline, puck.nether.net responds by disabling your domain then emailing you about it.  I received that email Friday morning… and so I wound up in a mad rush trying to figure out why BIND9 didn’t consider itself authoritative.

Since I was in a rush, I decided to tell the border router to just port-forward to the old server, which got things going until I could look into it properly.  It took a bit of tinkering with pf.conf, but eventually got that going, and the crisis was averted.  Re-enabling the domains on puck.nether.net worked, and they stayed enabled.

It was at that time I discovered that Roller Network had decided to make their slave DNS a paid offering.  Fair enough, these things do cost money… At first I thought, well, I’ll just pay for an account with them, until I realised their personal plans were US$5/month.  My workplace uses Vultr for hosting instances of their WideSky platform for customers… and aside from the odd hiccup, they’ve been fine.  US$5/month VPS which can run almost anything trumps US$5/month that only does secondary DNS, so out came the debit card for a new instance in their Sydney data centre.

Later I might use it to act as a caching front-end and as a secondary mail exchanger… but for now, it’s a DIY secondary DNS.  I used their ISO library to install an OpenBSD 6.1 server, and managed to nut out nsd to act as a secondary name server.

Getting that going this morning, I was able to figure out my DNS woes on the border router and got that running, so after removing the port forward entries, I was able to trigger my secondary DNS at Vultr to re-transfer the domain and debug it until I got it working.

With most of the physical stuff worked out, it was time to turn my attention to getting virtual instances working.  Up until now, everything running on the VM was through hand-crafted VMs using libvirt directly.  This is painful and tedious… but for whatever reason, OpenNebula was not successfully deploying VMs.  It’d get part way, then barf trying to set up 802.1Q network interfaces.

In the end, I knew OpenNebula worked fine with bridges that were already defined… but I didn’t want to have to hand-configure each VLAN… so I turned to another automation tool in my toolkit… Ansible:

- hosts: compute
  tasks:
  - name: Configure networking
    template: src=compute-net.j2 dest=/etc/conf.d/net
# …
- hosts: compute
  tasks:
# …
  - name: Add symbolic links (instance VLAN interfaces)
    file: src=net.lo dest=/etc/init.d/net.bond0.{{item}} state=link
    with_sequence: start=128 end=193
  - name: Add symbolic links (instance VLAN bridges)
    file: src=net.lo dest=/etc/init.d/net.vlan{{item}} state=link
    with_sequence: start=128 end=193
# …
  - name: Make services start at boot (instance VLAN bridges)
    command: rc-update add net.vlan{{item}} default
    with_sequence: start=128 end=193

That’s a snippet of the playbook… and it basically creates symbolic links from Gentoo’s net.lo for all the VLAN ports and bridges, then sets them up to start at boot.

In the compute-net.j2 file referenced above, I put in the following to enumerate all the configuration bits.

# Instance VLANs
{% for vlan in range(128,193) %}
{% endfor %}
# …
vlans_bond0="5 8 10{% for vlan in range(128,193) %} {{vlan}} {% endfor %}248 249 250 251 252"
# …
# Instance VLANs
{% for vlan in range(128,193) %}
{% endfor %} 

The start and end ranges are a little off, but it saved a lot of work.
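For the curious, the mismatch comes from the two tools treating the end of the range differently: Ansible’s with_sequence includes the end value, while Jinja2’s range() stops one short, just like Python’s.  A quick Python sketch shows the difference (the two helper names are made up purely to mimic each tool).

```python
def with_sequence(start, end):
    """Mimic Ansible's with_sequence: both ends are included."""
    return list(range(start, end + 1))


def jinja_range(start, stop):
    """Mimic Jinja2's range(): the stop value is excluded."""
    return list(range(start, stop))


if __name__ == "__main__":
    playbook_vlans = with_sequence(128, 193)
    template_vlans = jinja_range(128, 193)
    print(len(playbook_vlans), playbook_vlans[-1])
    print(len(template_vlans), template_vlans[-1])
    # VLANs that get an init script symlink but no configuration:
    print(sorted(set(playbook_vlans) - set(template_vlans)))
```

Using range(128,194) in the template, or end=192 in the playbook, would bring the two back into line.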

This naturally took a while for OpenRC to bring up… but it worked. Going back to OpenNebula, I told it what bridges to use, and before long I had my first instance… an OpenBSD router to link my personal VLAN to the DMZ.

I spent a bit of time re-working my routing tables after that… in fact, my network is getting big enough now I have to write some details down.  I spent a few hours documenting the effort:

That’s page 1 of about 15… yes my hand is sore… but at least now should I get run over by a bus, others have a fighting chance doing anything with the network without my technical input.

Jul 29 2017

So, I had a go at getting OpenNebula actually running on my little VM.  Earlier I had installed it to the /opt directory by hand, and today, I tried launching it from that directory.

To get the initial user set up, you have to create the ${ONE_LOCATION}/.one/one_auth file (in my case, /opt/opennebula/.one/one_auth) with the username and password for your initial user separated by a colon.  The idea is that this file is used to create the user initially; you then change the password once you’ve successfully logged in.
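If you’d rather script that step, something like the following Python sketch would do it.  The function name, the example username and password, and the choice to create the file readable only by its owner are my own assumptions; all OpenNebula needs is the single username:password line.

```python
import os
import tempfile


def write_one_auth(path, username, password):
    """Write a one_auth file containing 'username:password'."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    # Assumption: mode 0600, since the file holds a plain-text password.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{username}:{password}\n")


if __name__ == "__main__":
    # Demonstrate in a scratch directory rather than the real ONE_LOCATION.
    one_location = tempfile.mkdtemp()
    auth_path = os.path.join(one_location, ".one", "one_auth")
    write_one_auth(auth_path, "oneadmin", "changeme")  # example credentials
    with open(auth_path) as handle:
        print(handle.read(), end="")
```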

That got me a little further, but then it still fails… turns out it doesn’t like some options specified by default in the configuration file.   I commented out some options, and that got me a little further again.  oned started, but then went into lala land, accepting connections but then refusing to answer queries from other tools, leaving them to time out.

I’ve since managed to package OpenNebula into a Gentoo Ebuild, which I have now published in a dedicated overlay.  I was able to automate a lot of the install process this way, but I was still no closer.

On a hunch, I tried installing the same ebuild on my laptop.  Bingo… in mere moments, I was staring at the OpenNebula Sunstone UI in my browser, it was working.  The difference?  My laptop is running Gentoo with the standard glibc C library, not musl.  OpenNebula compiled just fine on musl, but perhaps differences in how musl does threads or some such (musl takes a hard-line POSIX approach) are causing a deadlock.

So, I’m rebuilding the VM using glibc now.  We shall see where that gets us.  At least now I have the install process automated. 🙂

Jul 23 2017

So, the front-end for OpenNebula will be a VM, that migrates between the two compute nodes in a HA arrangement.  Likewise with the core router, and border router, although I am also tossing up trying again with the little Advantech UNO-1150G I have laying around.

For now, I’ve not yet set up the HA part, I’ll come to that.  There are guides for using libvirt with corosync/heartbeat; most also call up DRBD as the block device for the VM, but we will not be using this as our block device (Rados Block Device) is already redundant.

To host OpenNebula, I’ll use Gentoo with musl-libc since that’ll shrink the footprint down just a little bit.  We’ll run it on a MariaDB back-end.

Since we’re using musl, you’ll want to install layman and the musl overlay as not all packages build against musl out-of-the-box.  Also install gentoolkit, as you’ll need to set USE flags, and euse makes this easy:

# emerge layman
# layman -L
# layman -a musl
# emerge gentoolkit

Now that some basic packages are installed, we need to install OpenNebula’s prerequisites. They tell you in amongst these is xmlrpc-c. BUT, they don’t tell you that it needs support for abyss: and the scons build system they use will just give you a cryptic error saying it couldn’t find xmlrpc. The answer is not, as suggested, to specify the path to xmlrpc-c-config, which happens to be in ${PATH} anyway, as that will net the same result, and break things later when you fix the real issue.

# euse -p dev-libs/xmlrpc-c -E abyss

Now we can build the dependencies… this isn’t a full list, but includes everything that Gentoo ships in repositories, the remaining Ruby gems will have to be installed separately.

# emerge --ask dev-lang/ruby dev-db/sqlite dev-db/mariadb \
dev-ruby/sqlite3 dev-libs/xmlrpc-c dev-util/scons \
dev-ruby/json dev-ruby/sinatra dev-ruby/uuidtools \
dev-ruby/curb dev-ruby/nokogiri

With that done, create a user account for OpenNebula:

# useradd -d /opt/opennebula -m -r opennebula

Now you’re set to build OpenNebula itself:

# tar -xzvf opennebula-5.4.0.tar.gz
# cd opennebula-5.4.0
# scons mysql=yes

That’ll run for a bit, but should succeed. At the end:

# ./install -d /opt/opennebula -u opennebula -g opennebula

There’s about where I’m at now… the link in the README for further documentation is a broken link, here is where they keep their current documentation.

Jul 23 2017

So, having got some instances going… I thought I better sort out the networking issues proper.  While it was working, I wanted to do a few things:

  1. Bring a dedicated link down from my room into the rack directly for redundancy
  2. Define some more VLANs
  3. Sort out the intermittent faults being reported by Ceph

I decided to tackle (1) first.  I have two 8-port Cisco SG-200 switches linked via a length of Cat5E that snakes its way from our study, through the ceiling cavity then comes up through a small hole in the floor of my room, near where two brush-tail possums call home.

I drilled a new hole next to where the existing cable entered, then came the fun of trying to feed the new cable alongside the old one.  First attempt had the cable nearly coil itself just inside the cavity.  I tried to make a tool to grab the end of it, but it was well and truly out of reach.  I ended up getting the job done by taping the cable to a section of fibreglass tubing, feeding that in, taping another section of tubing to that, feeding that in, etc… but then I ran out of tubing.

Luckily, a rummage around, and I found some rigid plastic that I was able to tape to the tubing, and that got me within a half-metre of my target.  Brilliant, except I forgot to put a leader cable through for next time didn’t I?

So more rummaging around for a length of suitable nylon rope, tape the rope to the Cat5E, haul the Cat5E out, then grab another length of rope and tape that to the end and use the nylon rope to haul everything back in.

The rope should be handy for when I come to install the solar panels.

I had one 16-way patch panel, so wound up terminating the rack-end with that, and just putting a RJ-45 on the end in my room and plugging that directly into the switch.  So on the shopping list will be some RJ-45 wall jacks.

The cable tester tells me I possibly have brown and white-brown switched, but never mind, I’ll be re-terminating it properly when I get the parts, and that pair isn’t used anyway.

The upshot: I now have a nice 1Gbps ring loop between the two SG-200s and the LGS326 in the rack.  No animals were harmed in the formation of this ring, although two possums were mildly inconvenienced.  (I call that payback for the times they’ve held the Marsupial Olympics at 2AM when I’m trying to sleep!)

Having gotten the physical layer sorted out, I was able to introduce the upstairs SG-200 to the new switch, then remove the single-port LAG I had defined on the downstairs SG-200.  A bit more tinkering, and I had a nice redundant set-up: setting my laptop to ping one of the instances in the cluster over WiFi, I could unplug my upstairs trunk, wait a few seconds, plug it back in, wait some more, unplug the downstairs trunk, wait some more again, then plug it back in again, and not lose a single ICMP packet.

I moved my two switches and my AP over to the new management VLAN I had set up, along side the IPMI interfaces on the nodes.  The SG-200s were easy, aside from them insisting on one port being configured with a PVID equal to the management VLAN (I guess they want to ensure you don’t get locked out), it all went smoothly.

The AP though, a Cisco WAP4410N… not so easy.  In their wisdom, and unlike the SG-200s, the management VLAN settings page is separate from the IP interface page, so you can’t change both at the same time.  I wound up changing the VLAN, only to find I had locked myself out of it.  Much swearing at the cantankerous AP and wondering how could someone overlook such a fundamental requirement!  That, and the switch where the AP plugs in, helpfully didn’t add the management VLAN to the right port like I asked of it.

Once that was sorted out, I was able to configure an IP on the old subnet and move the AP across.

That just left dealing with the intermittent issues with Ceph.  My original intention with the cluster was to use 802.3AD so each node had two 2Gbps links.  Except: the LGS326-AU only supports 4 LAGs.  For me to do this, I need 10!

Thankfully, the bonding support in the Linux kernel has several other options available.  Switching from 802.3ad to balance-tlb resolved the issue.

slaves_bond0="enp0s20f0 enp0s20f1"
slaves_bond1="enp0s20f2 enp0s20f3"
rc_net_bond0_need="net.enp0s20f0 net.enp0s20f1"
rc_net_bond1_need="net.enp0s20f2 net.enp0s20f3"

I am now setting up a core router instance (with OpenBSD 6.1) and an OpenNebula instance (with Gentoo AMD64/musl libc).

Jul 06 2017

So, since my last log, I’ve managed to tidy up the wiring on the cluster, making use of the plywood panel at the back to mount all my DC power electronics, and generally tidying everything up.

I had planned to use a SB50 connector to connect the cluster up to the power supply, so made provisions for this in the wiring harness. Turns out, this was not necessary, it was easier in the end to just pull apart the existing wiring and hard-wire the cluster up to the charger input.

So, I’ve now got a spare load socket hanging out the front, which will be handy if we wind up with unreliable mains power in the near future since it’s a convenient point to hook up 12V appliances.

There’s a solar power input there ready, and space to the left of that to build a little control circuit that monitors the solar voltage and switches in the mains if needed. For now though, the switching is done with a relay that’s hard-wired on.

Today though, I managed to get the Ceph clients set up on the two compute nodes, although virt-manager is buggy when it comes to RBD pools. In particular, adding an RBD storage pool doesn’t work as there’s no way to define authentication keys, and even if you have the pool defined, you find that trying to use images from that pool causes virt-manager to complain it can’t find the image on your local machine. (Well duh! This is a known issue.)

I was able to find an XML cheat-sheet for defining a domain in libvirt, which I was then able to use with Ceph’s documentation.

A typical instance looks like this:

<domain type='kvm'>
  <!-- name of your instance (placeholder) -->
  <name>instance-name</name>
  <!-- a UUID for your instance, use `uuidgen` to generate one -->
  <uuid>00000000-0000-0000-0000-000000000000</uuid>
  <!-- memory and vCPU count are placeholders, size these to suit -->
  <memory unit='MiB'>1024</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch="x86_64">hvm</type>
  </os>
  <clock sync="utc"/>
  <devices>
    <disk type='network' device='disk'>
      <source protocol='rbd' name="poolname/image.vda">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='vda'/>
      <auth username='libvirt'>
        <!-- the UUID here is what libvirt allocated when you did
             `virsh secret-define foo.xml`, use `virsh secret-list`
             if you've forgotten what that is. -->
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <disk type='network' device='cdrom'>
      <source protocol='rbd' name="poolname/image.iso">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='hdd'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <interface type='network'>
      <source network='default'/>
      <mac address='11:22:33:44:55:66'/>
    </interface>
    <graphics type='vnc' port='-1' keymap='en-us'/>
  </devices>
</domain>
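Since the two disk stanzas differ only in the image name, device type and target, they lend themselves to being generated.  Here’s a hedged Python sketch using the standard library’s ElementTree; the function name and its parameters are my own invention, and the pool, monitor and secret values are just the placeholders from the example above.

```python
import xml.etree.ElementTree as ET


def rbd_disk(image, monitors, target_dev, secret_uuid,
             device="disk", username="libvirt"):
    """Build a libvirt <disk> element backed by a Ceph RBD image."""
    disk = ET.Element("disk", type="network", device=device)
    source = ET.SubElement(disk, "source", protocol="rbd", name=image)
    for monitor in monitors:
        # One <host> element per Ceph monitor node.
        ET.SubElement(source, "host", name=monitor)
    ET.SubElement(disk, "target", dev=target_dev)
    auth = ET.SubElement(disk, "auth", username=username)
    ET.SubElement(auth, "secret", type="ceph", uuid=secret_uuid)
    return disk


if __name__ == "__main__":
    monitors = ["s0.internal.network", "s1.internal.network",
                "s2.internal.network"]
    secret = "23daf9f8-1e80-4e6d-97b6-7916aeb7cc62"
    for image, dev, kind in (("poolname/image.vda", "vda", "disk"),
                             ("poolname/image.iso", "hdd", "cdrom")):
        element = rbd_disk(image, monitors, dev, secret, device=kind)
        print(ET.tostring(element, encoding="unicode"))
```

The output can be pasted inside the devices section of a domain definition before running virsh define.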

Having defined the domain, you can then edit it at will in virt-manager. I was able to switch the network interface over to using virtio, plop it on a bridge so it was wired up to the correct VLAN and start the instance up.

I’ve since managed to migrate 3 instances over, namely an estate database, Brisbane Area WICEN’s OwnCloud site, and my own blog.

These are sufficient to try the system out. I’m already finding these instances much more responsive, using raw Ceph even, than the original server.

My next move I think will be to see if I can get corosync/heartbeat to manage a HA VM instance. That is, if one of the compute nodes goes offline, the instance restarts on the other compute node.

Two services come to mind where HA is concerned: terminating the PPPoE link for our Internet, and a virtual management node for a higher-level system such as OpenNebula. OpenNebula really needs something semi-HA, since it really gets its knickers in a twist if the master node goes down. I also want my border router to be HA, since I won’t necessarily be around to migrate it to a different node.

Everything else, well I suspect OpenNebula can itself manage those, and long term the instances I just liberated today from my old box, will become instances within OpenNebula.

The other option is I dip my toe into OpenStack (again), since it is inherently HA by design, but it is also a royal pain to get working.

Jun 29 2017

So, there’s some work still to be done, for example making some extension leads for the run between the battery link harness, load power distribution and the charger… and to generally tidy things up, but it is now up and running.

On the floor, is the 240V-12V power supply and the charger, which right now is hard-wired in boost mode. In the bottom of the rack are the two 105Ah 12V AGM batteries, in boxes with fuses and isolation switches.

The nodes and switching are inside the rack, and resting on top is the load power distribution board, which I’ll have to rewire to make things a little neater. A prospect is to mount some of this on the back.

I had a few introductions to make, introducing the existing pair of SG-200 switches to the newcomer and its VLANs, but now at least, I’m able to SSH into the nodes, access the IPMI BMC and generally configure the whole box and dice.

With the exception of the later upgrade to solar, and the aforementioned wiring harness clean-ups, the hardware-side of this dual hardware/software project, is largely complete, and this project now transitions to being a software project.

The plan from here:

  • Update the OSes… as all will be a little dated. (I might even blow away and re-load.)
  • Get Ceph storage up and running. It actually should be configured already, just a matter of getting DNS hostnames sorted out so they can find each other.
  • Investigating the block caching landscape: when I first started the project at work, it was a 3-horse race between Facebook’s FlashCache, bcache and dmcache. Well, FlashCache is no more, replaced by EnhancedIO, and I’m not sure about the rest of the market. So this needs researching.
  • Management interfaces: at my workplace I tried Ganeti, OpenNebula and OpenStack. This again, needs re-visiting. OpenNebula has moved a long way from where it was and I haven’t looked at the others in a while. OpenStack had me running away screaming, but maybe things have improved.

May 07 2017

So, in amongst my pile of crusty old hardware is the old netbook I used to use in the latter part of my university days. It is a Lemote Yeeloong, and sports a ~700MHz Loongson 2F CPU (MIPS III little endian ISA) and 1GB RAM.

Back in the day it was a brilliant little machine. It came out of the box running a localised (for China) version of Debian, and had pretty much everything you’d need. I naturally repartitioned the machine, setting up Gentoo, and I had a separate partition for Debian so I could actually dual-boot between them.

Fast forward 10 years: the machine runs, but the battery is dead, and Debian is dropping support for MIPS-III machines. Debian Jessie supports them, but Stretch, likely due for release some time this year, will not. If you haven’t got a CPU that supports mips32r2 or mips64r2, you’re stuffed.

I don’t want to throw this machine away.  Being as esoteric as it is, it is an unlikely target for theft, as to the casual observer, it’ll just be “some crappy netbook”.  If someone were to try and steal it, there’s a very high probability I’ll recover it with my data because the day its PMON2000 boot firmware successfully boots a x86-64 OS like Ubuntu or Windows without the assistance of a VM of some kind would be the day Satan puts a requisition order in for anti-freeze and winter mittens.

My use case is for a machine I can take with me on the bicycle.  My needs aren’t huge: I won’t be playing video on this thing, it’ll be largely for web browsing and email.  The web browser needs to support JavaScript, so that rules out options like ELinks or Dillo, my preferred browser is Firefox but I’ll settle for something Webkit-based if that’s all that’s out there.

So what operating systems do I have for a machine that sports a MIPS-III CPU and 1GB RAM?  Fedora has a MIPS port, but that, like Debian, is for the newer MIPS systems.  Arch Linux too is for newer architectures.

I could bootstrap Alpine Linux… and maybe that’s worth looking into, they seem to be doing some nice work in producing a small and capable Linux distribution.  They don’t yet support MIPS though.

Linux From Scratch is an option, if a little labour intensive.  (Been there, done that.)

OpenBSD directly supports this machine, and so I gave OpenBSD 6.0 a try.  It’s a very capable OS, and while it isn’t Linux, there isn’t much that an experienced Linux user like myself needs to adapt to in order to effectively use the OS.  The ports tree is a great asset to OpenBSD, with a large selection of pre-built packages already available.  Using that, it is possible to get a workable environment up and running very quickly.  OpenBSD/loongson uses the n64 ABI.

Due to licensing worries, they use a particularly old version of binutils as their linker and assembler.  The plan seems to be they wish to wean themselves off the GNU toolchain in favour of LLVM.  At this time though, much of the system is built using the GNU toolchain with some custom patches.  I found that, on the Yeeloong, 1GB RAM was not sufficient for compiling LLVM, even after adding additional swap files, and some packages I needed weren’t available in ports, nor would they build with the version of GNU tools available.

Maybe as they iron out the kinks in their build environment with LLVM, this will be worth re-visiting.  They’ve done a nice job so far, but it’s not quite up to where I need it to be.

Gentoo actually gives me the choice of two possible ABIs: o32 and n32.

o32 is the old 32-bit ABI, and suffers a number of performance problems, but generally works.  It’s what Debian Jessie and earlier supplies, and what their mips32 port will produce from Stretch onwards.

n32 is the MIPS equivalent of what some of you may know as x32 on AMD64 platforms: it is a 32-bit environment (32-bit pointers and longs) running on the 64-bit instruction set… the idea being that very few applications actually benefit from 64-bit pointers and longs, and so the usual quantities like int and long remain the same as what they’d be on o32, saving memory.  The long long data type gets a boost because, although “32-bit”, the 64-bit operations are still available for use.
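To make the differences concrete, here is a small Python table of the C data type sizes (in bytes) under each of the three MIPS ABIs, along with the register width each one gets to use.  The dictionary and function names are made up for this sketch; the sizes themselves are the standard ones for these ABIs.

```python
# Size in bytes of the basic C types under each MIPS ABI.
ABI_SIZES = {
    "o32": {"int": 4, "long": 4, "long long": 8, "pointer": 4},
    "n32": {"int": 4, "long": 4, "long long": 8, "pointer": 4},
    "n64": {"int": 4, "long": 8, "long long": 8, "pointer": 8},
}

# Width in bits of the general-purpose registers the ABI may use.
REGISTER_BITS = {"o32": 32, "n32": 64, "n64": 64}


def long_long_fits_register(abi):
    """True if a long long can be held in a single register."""
    return ABI_SIZES[abi]["long long"] * 8 <= REGISTER_BITS[abi]


def same_memory_layout(abi_a, abi_b):
    """True if the two ABIs agree on the size of every basic type."""
    return ABI_SIZES[abi_a] == ABI_SIZES[abi_b]


if __name__ == "__main__":
    for abi in ("o32", "n32", "n64"):
        print(abi, ABI_SIZES[abi], long_long_fits_register(abi))
    print(same_memory_layout("o32", "n32"))
```

So n32 keeps o32’s compact data layout while gaining single-register 64-bit arithmetic, and n64 pays for its larger address space with bigger pointers and longs.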

The trouble is, some applications have problems with this mode.  Either the code sees “mips64” in the CHOST and assumes a full 64-bit system (aka n64), or it assumes the pointers are the same width as a long, or the build system makes silly assumptions as to where things get put.  (virtualenv comes to mind, which is what started me on this journey.  The same problem affects x32 on AMD64.)

So I thought, I’d give n64 a try.  I’d see if I can build a cross-compiler on my AMD64 host, and bootstrap Gentoo from that.

Step 1: Cross-compiler

For the cross-compiler, Gentoo has a killer feature that I have not seen in too many other distributions: crossdev.  This is a toolchain build tool that can generate cross-compiler toolchains for most processor architectures and environments.

This is installed by running emerge sys-devel/crossdev.

A gotcha with hardened

I run “hardened” AMD64 stages on my machines, and there’s a little gotcha to be aware of: the hardened USE flag gets set by crossdev, and that can cause fun and games if, like on MIPS, the hardening features haven’t been ported.  My first attempt at this produced a n64 userland where pretty much everything generated a segmentation fault, the one exception being Python 2.7.  If I booted with init=/bin/bash (or init=/bin/bb), my virtual environment died, if I booted with init=/usr/bin/python2.7, I’d be dropped straight into a Python shell, where I could import the subprocess module and try to run things.

Cleaning up, and forcing crossdev to leave off hardened support, got things working.

Building the toolchain

With the above gotcha in mind:

# crossdev --abis n64 \
           --env 'USE="-hardened"' \
           -s4 -t mips64el-unknown-linux-gnu

The --abis n64 tells crossdev you want a n64 ABI toolchain, and the --env will hopefully keep the hardened flag unset. Failing that, try this:

# cat > /etc/portage/package.use/mips64 <<EOF
cross-mips64el-unknown-linux-gnu/binutils -hardened
cross-mips64el-unknown-linux-gnu/gcc -hardened
cross-mips64el-unknown-linux-gnu/glibc -hardened
EOF

If you want a combination of specific toolchain components to try, I’m using:

  • Binutils: 2.28
  • GCC: 5.4.0-r3
  • glibc: 2.25
  • headers: 4.10

Step 2: Checking our toolchain

This is where I went wrong the first time: I tried building the entire OS, only to discover I had wasted hours of CPU time building non-functional binaries. Save yourself some frustration. Start with a small binary to test.

A good target for this is busybox. Run mips64el-unknown-linux-gnu-emerge busybox, and wait for a bit.

When it completes, you should hopefully have a busybox binary:

RC=0 stuartl@beast ~ $ file /usr/mips64el-unknown-linux-gnu/bin/busybox 
/usr/mips64el-unknown-linux-gnu/bin/busybox: ELF 64-bit LSB executable, MIPS, MIPS-III version 1 (SYSV), statically linked, for GNU/Linux 3.2.0, stripped
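If you want to script that check, here is a minimal sketch. The function name is_n64_le is my own invention; it only looks for the “ELF 64-bit LSB” and “MIPS” markers in the description printed by file, on the assumption that an o32 or n32 binary would be reported as ELF 32-bit instead.

```shell
# Hypothetical helper: succeeds if a description string, as printed by
# file(1), looks like a little-endian 64-bit MIPS ELF binary.
is_n64_le() {
    case "$1" in
        *"ELF 64-bit LSB"*"MIPS"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_n64_le "$(file -b /usr/mips64el-unknown-linux-gnu/bin/busybox)" \
#       && echo "looks like n64"
```

This says nothing about whether the binary actually runs, which is what the next test is for.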

Testing busybox

There is qemu-user-mips64el, but last time I tried it, I found it broken. So an easier option is to use real hardware or QEMU emulating a full system. In either case, you’ll want your system-of-choice already running a working 64-bit kernel; if your real hardware isn’t already running a 64-bit Linux kernel, use QEMU.

For QEMU, the path-of-least-resistance I found was to use Debian. Aurélien Jarno has graciously provided QEMU images and corresponding kernels for a good number of ports, including little-endian MIPS.

Grab the Wheezy disk image and the corresponding kernel, then run the following command:

# qemu-system-mips64el -M malta \
    -kernel vmlinux-3.2.0-4-5kc-malta \
    -hda debian_wheezy_mipsel_standard.qcow2 \
    -append "root=/dev/sda1 console=ttyS0,115200" \
    -serial stdio -nographic -net nic -net user

Let it boot up, then log in with username root, password root.

Install openssh-client and rsync (these do not ship with the image):

# apt-get update
# apt-get install openssh-client rsync

Now, you can create a directory, and pull the relevant files from your host, then try the binary out:

# mkdir gentoo
# rsync -aP yourhost:/usr/mips64el-unknown-linux-gnu/ gentoo/
(substitute the name of your build host for yourhost)
# chroot gentoo bin/busybox ash

With luck, you should be in the chroot now, using Busybox.

Step 3: Building the system

Having done a “hello world” test, we’re now ready to build everything else. Start by tweaking your /usr/mips64el-unknown-linux-gnu/etc/portage/make.conf to your liking then adjust /usr/mips64el-unknown-linux-gnu/etc/portage/make.profile to point to one of the MIPS profiles. For reference, on my system:

RC=0 stuartl@beast ~ $ ls -l /usr/mips64el-unknown-linux-gnu/etc/portage/make.profile
lrwxrwxrwx 1 root root 49 May  1 09:26 /usr/mips64el-unknown-linux-gnu/etc/portage/make.profile -> /usr/portage/profiles/default/linux/mips/13.0/n64
RC=0 stuartl@beast ~ $ cat /usr/mips64el-unknown-linux-gnu/etc/portage/make.conf 



ACCEPT_KEYWORDS="mips ~mips"

USE="${ARCH} -pam"

CFLAGS="-O2 -pipe -fomit-frame-pointer"

FEATURES="-collision-protect sandbox buildpkg noman noinfo nodoc"
# Be sure we dont overwrite pkgs from another repo..



Now, you should be ready to start building:

# mips64el-unknown-linux-gnu-emerge -e \
    --keep-going -j6 --load-average 12.0 @system

Now, go away, and do something else for several hours.  It’ll take that long, depending on the speed of your machine.  In my case, the machine is an AMD Phenom II x6 with 8GB RAM, which was brand new in 2010.  It took a good day or so.

Step 4: Testing our system

We should have enough that we can boot our QEMU VM with this image instead.  One way of trying it would be to copy across the userland tree the same way we did for pulling in busybox and chrooting back in again.

In my case, I took the opportunity to build a kernel specifically for the VM that I’m using, and made up a disk image using the new files.

Building a kernel

Your toolchain should be able to cross-build a kernel for the virtual machine.  To get you started, here’s a kernel config file.  Download it, decompress it, then drop it into your kernel source tree as .config.

Having done that, run make olddefconfig ARCH=mips to set the defaults, then make menuconfig ARCH=mips and customise to your heart’s content. When finished, run make -j6 vmlinux modules ARCH=mips CROSS_COMPILE=mips64el-unknown-linux-gnu- to build the kernel and modules.

Finally, run make modules_install firmware_install INSTALL_MOD_PATH=$PWD/modules ARCH=mips CROSS_COMPILE=mips64el-unknown-linux-gnu- to install the kernel modules and firmware into a convenient place.

Making a root disk

Create a blank, raw disk image using qemu-img, then partition it as you like and mount it as a loopback device:

# qemu-img create -f raw gentoo.raw 8G
# fdisk gentoo.raw
(do your partitioning here)
# losetup -P /dev/loop0 $PWD/gentoo.raw

Now you can format the partitions /dev/loop0pX as you see fit, then mount them in some convenient place. I’ll assume that’s /mnt/vm for now. You’re ready to start copying everything in:

# rsync -aP /usr/mips64el-unknown-linux-gnu/ /mnt/vm/
# rsync -aP /path/to/kernel/tree/modules/ /mnt/vm/

You can use this opportunity to make some tweaks to configuration files, like updating etc/fstab, tweaking etc/portage/make.conf (changing ROOT, removing CBUILD), and setting up a getty on ttyS0. I used to also symlink lib to lib64 in non-multilib environments such as this, using the commands that follow. Don’t symlink lib and lib64! See below for why.

# cd /mnt/vm
# mv lib/* lib64
# rmdir lib
# ln -s lib64 lib
# cd usr
# mv lib/* lib64
# rmdir lib
# ln -s lib64 lib

When you’re done, unmount.

First boot

Run QEMU with the following arguments:

# qemu-system-mips64el -M malta \
    -kernel /path/to/your/kernel/vmlinux \
    -hda /path/to/your/gentoo.raw \
    -append "root=/dev/sda1 console=ttyS0,115200 init=/bin/bash" \
    -serial stdio -nographic -net nic -net user

It should boot straight to a bash prompt. Mount the root read/write, and then you can make any edits you need to do before boot, such as changing the root password. When done, re-mount the root as read-only, then exec /sbin/init.

# mount / -o rw,remount
# passwd
… etc
# mount / -o ro,remount
# exec /sbin/init

With luck, it should boot to completion.

Step 5: Making the VM a system service

Now, it’d be real nice if libvirt actually supported MIPS VMs, but it doesn’t appear that it does, or at least I couldn’t get it to work.  virt-manager certainly doesn’t support it.

No matter, we can make do with a telnet console (on loopback), and supervisord to daemonise QEMU.  I use the following supervisord configuration file to start my VMs:

[unix_http_server]
file=/tmp/supervisor.sock   ; (the path to the socket file)

[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB        ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10           ; (num of main logfile rotation backups;default 10)
loglevel=info                ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false               ; (start in foreground if true;default false)
minfds=1024                  ; (min. avail startup file descriptors;default 1024)
minprocs=200                 ; (min. avail process descriptors;default 200)

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket

; the program name after the colon is arbitrary, pick your own
[program:gentoo-mips64el]
command=/usr/bin/qemu-system-mips64el -cpu MIPS64R2-generic -m 2G -spice disable-ticketing,port=5900 -M malta -kernel /home/stuartl/kernels/qemu-mips/vmlinux -hda /var/lib/libvirt/images/gentoo-mips64el.raw -append "mem=256m@0x0 mem=1792m@0x90000000 root=/dev/sda1 console=ttyS0,115200" -chardev socket,id=char0,port=65223,host=::1,server,telnet,nowait -chardev socket,id=char1,port=65224,host=::1,server,telnet,nowait -serial chardev:char0 -mon chardev=char1,mode=readline -net nic -net bridge,helper=/usr/libexec/qemu-bridge-helper,br=br0

The above creates two telnet sockets: port 65223 is the VM’s console, and 65224 is the QEMU control console. The VM has the maximum 2GB RAM possible and uses bridged networking to the network bridge br0. There is a graphical console available via SPICE.

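The telnet interfaces are bound to loopback, so one must use SSH tunnelling to reach those ports from another host. Note that the -spice option above carries no addr= setting, and as far as I know QEMU then listens on all addresses, so add addr=127.0.0.1 if you want SPICE confined to loopback as well. You can change the above command line to use VNC if that’s what you prefer.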
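For the SSH tunnelling, something along these lines should do. This is a sketch only: the function name vm_tunnel_cmd and the host argument are placeholders of mine, and it just prints the ssh command rather than running it. The port numbers are the ones from the QEMU command line above.

```shell
# Print an ssh command that forwards the VM console (65223), the QEMU
# monitor (65224) and SPICE (5900) from the VM host to this machine.
# The telnet ports listen on ::1 on the VM host, hence the [::1] targets.
vm_tunnel_cmd() {
    printf 'ssh -N -L 65223:[::1]:65223 -L 65224:[::1]:65224 -L 5900:localhost:5900 %s\n' "$1"
}

# Usage (user@vmhost is a placeholder):
#   $(vm_tunnel_cmd user@vmhost) &
#   telnet localhost 65223
```

With the tunnel up, telnet to port 65223 on localhost gives you the VM console, and a SPICE client pointed at localhost port 5900 gives you the graphical one.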

At this point, the VM should be able to boot on its own. I’d start with installing some basic packages, and move on from there. You’ll find the environment is very sparse (my build had no Perl binary for example) but the basics for building everything should be there.

You may also find that what is there, isn’t quite installed right… I found that sshd wasn’t functional due to missing users… a problem soon fixed by doing an emerge -K openssh (the earlier step will have produced binary packages).

In my case, that’s installing a decent text editor (vim) and GNU screen so I can start a build, then detach.  Lastly, I’ll need catalyst, which is Gentoo’s release engineering tool.

At the moment, this is where I’m at.  GNU screen has indirectly pulled in Perl as a dependency, and that is building as I type this.  It is building faster than the little netbook does, and I have the bonus that I can throw more RAM at the problem than I can on the real hardware. The plan from here:

  1. emerge -ek @system, to build everything that got missed before.
  2. ROOT=/tmp/seed emerge -eK @system, to bundle everything up into a staging area
  3. populating /tmp/seed/dev with device files
  4. tar-ing up /tmp/seed to make my initial “seed” stage for catalyst.
  5. building the first n64 stages for Gentoo using catalyst
  6. building the packages I want for the netbook in a chroot
  7. transferring the chroot to the netbook
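The first few steps of that plan could be sketched like this. The helper name make_seed_tarball is hypothetical, and gzip compression is just my choice for the sketch; I haven’t settled on what the seed stage will actually use.

```shell
# Steps 1 and 2, run inside the VM (as in the list above):
#   emerge -ek @system
#   ROOT=/tmp/seed emerge -eK @system

# Step 4: hypothetical helper that tars up a staging directory, keeping
# permissions and numeric ownership (GNU tar options).
make_seed_tarball() {
    seed_dir="$1"
    out="$2"
    tar --numeric-owner -czpf "$out" -C "$seed_dir" .
}

# Usage:
#   make_seed_tarball /tmp/seed /tmp/seed-stage.tar.gz
```

Keep the output file outside the directory being archived, otherwise tar will try to include the tarball in itself.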

Symlinking lib and lib64… don’t do it!

So, I was doing this years ago when n32 was experimental.  I recall it being necessary then as this was before Portage had proper multilib support.  The earlier mipsel n32 stages I built, which started out from kanaka’s even more experimental multilib stages, required this kludge to work around the lack of support in Portage.

Portage has changed, it now properly handles multilib, and so the symlink kludge is not only not necessary, it breaks things rather badly, as I discovered.  When packages merge files to /lib, rather than following the symlink, they’ll replace it with a directory.  At that point, all hell breaks loose, because stuff that “appeared” in /lib before is no longer there.

I was able to recover by rsync-ing /lib64 to /lib, which isn’t a pretty solution, but it’ll be enough to get an initial “seed” stage.  Running that seed stage through Catalyst will clean up the remnants of that bungle.
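If you have an older tree lying around and aren’t sure whether it carries this kludge, a quick check is easy enough. A minimal sketch, with a function name of my own choosing:

```shell
# Hypothetical helper: report whether lib or usr/lib under the given
# root is a symlink rather than a real directory.
find_lib_symlinks() {
    root="${1:-/}"
    for d in lib usr/lib; do
        if [ -L "$root/$d" ]; then
            echo "$root/$d is a symlink to $(readlink "$root/$d")"
        fi
    done
}

# Usage:
#   find_lib_symlinks /mnt/vm
```

No output means both are real directories, which is what you want on a tree like this one.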

Nov 12 2016

So, recently, the North West Digital Radio group generously donated a UDRC II radio control board in thanks for my initial work on an audio driver for the Texas Instruments TLV320AIC3204 (yes, a mouthful).

This board looks like it might support the older Pi model B I had, but I thought I’d play it safe and buy the later revision, so I bought version 3 of the Pi and the associated 7″ touch screen.  Thus, an order went to RS for a whole pile of parts, including one Raspberry Pi3 computer, a blank 8GB MicroSD card, a power supply, the touch screen kit and a case.

Fitting the UDRC

To fit the UDRC, the case will need some of the plastic cut away: a rectangular section out of the main body and a similarly sized portion out of the back cover.

Modifications to the case

When assembled, the cut-away section will allow the DB15-HD and Mini-DIN6 connectors to protrude out slightly.

Case assembled with modifications

The UDRC needs some minor modifications too for the touch screen.  Probe around, and you’ll find a source of 5V on one of the unpopulated headers.  You’ll want to solder a two-pin header to here and hook that to the LCD control board using the supplied jumper leads.  If you’ve got one, use a right-angled header, otherwise just bend a regular one like I did.

5V supply for the LCD on the UDRC

You’ll see I’ve written a note on the DB15-HD: a monitor does NOT plug in here.

From here, you should be ready to load up a SD card.  NWDR recommend the use of Compass Linux, which is a Raspbian fork configured for use with the UDRC.  I used the lite version, since it was smaller and I’m comfortable with command lines.

Configuring screen rotation

If you try to boot your freshly prepared SD card, the first thing you’ll notice is that the screen is upside-down.  Clearly a few people didn’t communicate with each other about which way was up on this thing.

Before you pull the SD card out, it is worth mounting the first partition on the SD card and editing config.txt in the root directory of that partition. If doing this on a Windows computer, ensure your text editor respects Unix line endings! (Blame Microsoft. If you’re doing this on a Mac, Linux, BSD or other Unix-ish computer, you have nothing to worry about.)

Add the following to the end of the file (or anywhere really):

# Rotate the screen the "right way up"
lcd_rotate=2

Now save the file, unmount the SD card, and put it in the Pi before assembling the case proper.
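If you’d rather script the edit from a Linux machine, here is a minimal sketch. It assumes the setting wanted is lcd_rotate=2, which to my knowledge is what the official 7″ touch screen uses for a 180° flip; the function name and the mount point are placeholders of mine.

```shell
# Append the rotation setting to a config.txt unless one is already there.
# Assumption: lcd_rotate=2 is the right value for your display.
add_lcd_rotate() {
    cfg="$1"
    if ! grep -q '^lcd_rotate=' "$cfg"; then
        printf '\n# Rotate the screen the "right way up"\nlcd_rotate=2\n' >> "$cfg"
    fi
}

# Usage, with the first partition mounted on /mnt/sd (an assumed path):
#   add_lcd_rotate /mnt/sd/config.txt
```

Running it twice is harmless, since it leaves the file alone once an lcd_rotate line exists.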

Setting up your environment

Now, if you chose the lite option like I did, there’ll be no GUI, and the touch aspect of the touchscreen is useless.  You’ll need a USB keyboard.

Log in as pi (password raspberry), run passwd to change your password, then run sudo -s to gain a root shell.

You might choose, like I did, to run passwd again here to set root’s password too.

After that, you’ll want to install some software.  Your choice of desktop environment is entirely up to you.  I prefer something lightweight, and have been using FVWM for years, but there are plenty of choices in Debian as well as the usual suspects (KDE, Gnome, XFCE…).

For the display manager, I’ll choose lightdm. We also need an on-screen keyboard. I tried a couple, including matchbox-keyboard and the rather ancient xvkbd. Despite its age, I found xvkbd to be the most usable.

Once you’ve decided what you want, run apt-get install with your list of packages, making sure to include xvkbd and lightdm in your list.  Other applications I included here were network-manager-gnome, qasmixer, pasystray, stalonetray and gkrellm.

Enabling the on-screen keyboard in lightdm

Having installed lightdm and xvkbd, you can now configure lightdm to enable the accessibility options.

Open up /etc/lightdm/lightdm-gtk-greeter.conf, look for the line show-indicators and tack ;~a11y on the end.

Now down further, look for the commented out keyboard setting and change that to keyboard=xvkbd. Save and close the file, then run /etc/init.d/lightdm restart.
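The same two edits can be made with sed if you prefer. This is a sketch under a couple of assumptions: that show-indicators is an uncommented line in your copy of the file, and that the keyboard setting appears as either #keyboard= or keyboard=. The function name is made up.

```shell
# Hypothetical helper: append ;~a11y to the show-indicators line (once)
# and set keyboard=xvkbd in a lightdm-gtk-greeter.conf style file.
enable_greeter_keyboard() {
    conf="$1"
    tmp="$conf.tmp"
    sed -e '/^show-indicators=/{/~a11y/!s/$/;~a11y/;}' \
        -e 's/^#\{0,1\}keyboard=.*/keyboard=xvkbd/' \
        "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Usage (as root):
#   enable_greeter_keyboard /etc/lightdm/lightdm-gtk-greeter.conf
#   /etc/init.d/lightdm restart
```

Take a copy of the file first; if your greeter configuration is laid out differently, editing by hand as described above is the safer route.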

You should find yourself staring at the log-in screen, and lo and behold, there should be a new icon up the top-right. Tapping it should bring up a 3 line menu, the bottom of which is the on-screen keyboard.

On-screen keyboard in lightdm

The button marked Focus is what you hit to tell the keyboard which application is to receive the keyboard events.  Tap that, then the application you want.  To log in, tap Focus then the password field.  You should be able to tap your password in followed by either the Return button on the virtual keyboard or the Log In button on the form.

Making FVWM touch-friendly

I have a pretty old configuration that has evolved over the last 10 years using FVWM that was built around keyboard-centric operation and screen real-estate preservation.  This configuration mainly needed two changes:

  • Menus and title bar text enlarged to make the corresponding UI elements finger-friendly
  • Adjusting the size of the FVWM BarButtons to suit the 800×480 display

Rather than showing how to do it from scratch, I’ll just link to the configuration tarball which you are welcome to play with.  It uses xcalendar which isn’t in the Debian repositories any more, but is available on Gentoo mirrors and can be built from source (you’ll want to install xutils-dev for xmkmf).  stalonetray and gkrellm are both in the standard Debian repositories.

FVWM on the Raspberry Pi

Enabling the right-click

This took a bit of hunting to figure out.  There is a method that works with Debian Wheezy which allows right-clicks by way of long presses, but this broke in Jessie, and the 2016-05-23 release of Compass Linux is built on the latter.  So another solution is needed.

Philipp Merkel, however, wrote a little daemon called twofing.  Once installed, doing a right click is simply a two-fingered tap on the screen; there’s support for other two-fingered gestures such as pinching and rotation as well.  It is available on Github, and I have forked this, adding some udev rules and scripts to integrate it into the Raspberry Pi.

The resulting Debian package is here.  Download the .deb, run dpkg -i on it, and then re-start the Raspberry Pi (or you can try running udevadm trigger and re-starting X).  The udev rules should create a /dev/twofingtouch symbolic link and the installed Xsession.d/Xreset.d scripts should take care of starting it with X and shutting it down afterwards.

Having done this, when you log in you should find that twofing is running, and that right clicks can be performed using a two-fingered prod.

Finishing up

Having done the configuration, you should now have a usable workhorse for numerous applications.  The UDRC shows up as a second sound card and is accessible via ALSA.  I haven’t tried it out yet, but it at least shows up in the mixer application, so the signs are there.  I’ll be looking to add LinBPQ and FreeDV into the mix yet, to round the software stack off to make this a general purpose voice/data radio station for emergency communications.

May 01 2016

So, after putting aside the charge controller for now, I’ve taken some time to see if I can get the software side of things into shape.

In the midst of my development, I found a small wiring fault that was responsible for blowing a couple of fuses. A small nick in the sheath of the positive wire in a power cable was letting the crimp part of a DC barrel connector contact +12V. A tweak of that crimp and things are back to normal. I’ve swapped all the 10A fuses for 5A ones, since the regulators are only rated at 7.5A.

The VLANs are assigned now, and I have bonding going between the two pairs of Ethernet devices. In spite of the switch only supporting 4 LAGs, it seems fine with me doing LACP on effectively 10 LAGs. I’ll see how it goes.

The switch has 5 ports spare after plugging in all 5 nodes and a 16-port switch for the IPMI subnet. One will be used for a management interface so I can plug a laptop in, and the others will be paired with LACP for linking to my two existing Cisco SG200-8s.

One of the goals of this project is to try and push the performance of Ceph. In the office, we tried bare Ceph, and found that, while it’s fine for sequential I/O, it suffers a bit with random reads/writes, and Windows-based Hyper-V images like to do a lot of random reads/writes.

Putting FlashCache in the mix really helped, but I note now, it’s no longer maintained. EnhanceIO had only just forked when I tried FlashCache; now it seems that’s the official successor.

There are two alternatives to FlashCache/EnhanceIO: bcache and dm-cache.

I’ll rule out bcache now as it requires the backing image be “formatted” for use. In other words, the backing image is not a raw image, but some proprietary (to bcache) format. This isn’t unworkable, but it raises concerns with me about portability: if I migrate a VM, do I need to migrate its cache too, or is it sufficient to cleanly shut down and detach the bcache device before re-assembling it on the new host?

By contrast, dm-cache and EnhanceIO/FlashCache work with raw backing images, making them much more attractive. Flush the cache before migration or use writethru mode, and all should be fine. dm-cache does however require a separate metadata device: messy, but not unworkable. We can provision the cache-related devices we need using LVM2, and use the kernel-mode Rados block device as our backing image.

So I think my caching subsystem is a two-horse race: dm-cache or EnhanceIO. I guess we’ll give them a try and see how they go.

For those following along at home, if you’re running kernel >4.3, you might want to use this fork of EnhanceIO due to changes in the kernel block I/O layer.

To manage the OpenNebula master node, I’ve installed corosync/pacemaker. Normally these are used with DRBD; however, I figure Ceph can fulfil that role. The concepts are similar: it’s a shared block device. I’m not sure if it’ll be LXC, Docker or a VM at this point that “contains” the server, but whatever it is, it should be possible for it to have its root FS and data on Ceph.

I’m leaning towards LXC for this. Time for some more experimentation.