Jul 06 2017

So, since my last log, I’ve managed to tidy up the wiring on the cluster, making use of the plywood panel at the back to mount all my DC power electronics, and generally neaten everything up.

I had planned to use an SB50 connector to connect the cluster to the power supply, and made provisions for this in the wiring harness. As it turned out, this wasn’t necessary: it was easier in the end to just pull apart the existing wiring and hard-wire the cluster to the charger input.

So, I’ve now got a spare load socket hanging out the front, which will be handy if we wind up with unreliable mains power in the near future, since it’s a convenient point to hook up 12V appliances.

There’s a solar power input ready there, and space to the left of it for a little control circuit that monitors the solar voltage and switches in the mains if needed. For now, though, the switching is done with a relay that’s hard-wired on.

Today though, I managed to get the Ceph clients set up on the two compute nodes, and found that virt-manager is buggy when it comes to RBD pools. In particular, adding an RBD storage pool doesn’t work as there’s no way to define authentication keys, and even if you have the pool defined, trying to use images from that pool causes virt-manager to complain it can’t find the image on your local machine. (Well, duh! This is a known issue.)
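The work-around is to define the Ceph credentials as a libvirt secret from the command line instead. Here’s a minimal sketch along the lines of Ceph’s libvirt documentation, assuming you’ve already created a Ceph user named client.libvirt (the file name ceph-secret.xml is my own):

<!-- ceph-secret.xml: define it with `virsh secret-define ceph-secret.xml`
     (virsh prints the secret's UUID), then load the actual key with:
         virsh secret-set-value --secret $UUID \
             --base64 "$(ceph auth get-key client.libvirt)" -->
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.libvirt secret</name>
  </usage>
</secret>

The UUID that virsh secret-define prints is what you’ll reference from the domain XML below.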

I was able to find an XML cheat-sheet for defining a domain in libvirt, which I was then able to use together with Ceph’s documentation.

A typical instance looks like this:

<domain type='kvm'>
  <!-- name of your instance -->
  <name>instancename</name>
  <!-- a UUID for your instance, use `uuidgen` to generate one -->
  <uuid>00ec9b97-c49a-45f8-befe-f74ad6bde2fe</uuid>
  <!-- memory is in KiB by default: 524288 KiB = 512 MiB -->
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch="x86_64">hvm</type>
  </os>
  <clock offset="utc"/>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='network' device='disk'>
      <source protocol='rbd' name="poolname/image.vda">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='vda'/>
      <auth username='libvirt'>
        <!-- the UUID here is what libvirt allocated when you did
             `virsh secret-define foo.xml`; use `virsh secret-list`
             if you've forgotten what that is. -->
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <disk type='network' device='cdrom'>
      <source protocol='rbd' name="poolname/image.iso">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='hdd'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <interface type='network'>
      <source network='default'/>
      <!-- pick your own unicast MAC; 52:54:00 is the usual KVM prefix -->
      <mac address='52:54:00:44:55:66'/>
    </interface>
    <graphics type='vnc' port='-1' keymap='en-us'/>
  </devices>
</domain>
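For the record: virsh define instancename.xml loads a file like the above into libvirt, and virsh start instancename boots the result.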

Having defined the domain, you can then edit it at will in virt-manager. I was able to switch the network interface over to virtio and plop it on a bridge so it was wired up to the correct VLAN (sketched below), then start the instance up.
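For reference, the edited interface stanza comes out looking something like this; the bridge name br0 is an assumption, so substitute whichever bridge on your host carries the right VLAN:

<interface type='bridge'>
  <!-- `br0` is a placeholder: use the bridge wired to your VLAN -->
  <source bridge='br0'/>
  <!-- virtio gives the guest a paravirtualised NIC -->
  <model type='virtio'/>
  <mac address='52:54:00:44:55:66'/>
</interface>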

I’ve since managed to migrate three instances over: an estate database, Brisbane Area WICEN’s OwnCloud site, and my own blog.

These are sufficient to try the system out. I’m already finding the instances much more responsive, even on raw Ceph RBD storage, than they were on the original server.

My next move, I think, will be to see if I can get corosync/heartbeat to manage an HA VM instance: that is, if one of the compute nodes goes offline, the instance restarts on the other compute node.

Two services come to mind where HA is concerned: terminating the PPPoE link for our Internet connection, and a virtual management node for a higher-level system such as OpenNebula. OpenNebula needs something at least semi-HA, since it gets its knickers in a twist if the master node goes down. I also want my border router to be HA, since I won’t necessarily be around to migrate it to a different node by hand.

Everything else, I suspect, OpenNebula can manage itself, and in the long term the instances I liberated from my old box today will become instances within OpenNebula.

The other option is that I dip my toe into OpenStack (again): it’s inherently HA by design, but it’s also a royal pain to get working.