Mar 17 2018
 

Last night, I got home, having made a detour on my way into work past Jaycar Woolloongabba to replace the faulty PSU.
It was a pretty open-and-shut case: we took it out of the box, plugged it in, and sure enough, no fan.  After the saleswoman asked the advice of a co-worker, it was confirmed that the fan should be running.
It took some digging, but they found a replacement, and so it was boxed up (in the box I supplied, they didn’t have one), and I walked out the door with PSU No. 3.
I had to go straight to work, so took the PSU with me, and that evening, I loaded it into the top box to transport home on the bicycle.
I get home, and it’s first thing on my mind.  I unlock the top box, get it out, and still decked out in my cycling gear, helmet and all (needed the headlight to see down the back of the rack anyway), I get to work.
I put the ring lugs on, plug it into the wall socket and flick the switch.
Nothing.
Toggle the switch on the front, still nothing.
Tried the other socket on the outlet, unplugging the load, still nothing.  Did the 10km trip from Milton to The Gap kill it?
Frustrated, I figure I’ll switch a light on.  Funny… no lights.
I wander into the study… sure enough, the router, modem and switch are dead as doornails.  Wander out to the MDB outside, saw the main breaker was still on, and tried hitting the test button.  Nothing.
I wander back inside, switching the bike helmet for my old hard hat, since it looks as if I’ll need the headlight a bit longer, then take a sticky beak down the road to see if anyone else is facing the same issue.
Sure enough, I look down the street, everyone’s out.
So there goes my second attempt at bootstrapping Gentoo, and my old server’s uptime.
The power did return about an hour or so later.  The PSU was fine; you just don’t think of the mains being out as the cause of your problems.
I’ll re-start my build, but I’m not going to lose another build to failing power.  Nope, had enough of that for a joke.
I could have rigged up a UPS to the TS-7670, but I already have one, and it’s in the very rack where the TS-7670 will be installed anyway.  Thus, no time like the present to install it.
I’ll have to configure the switch to present the right VLANs to the TS-7670, but once I do that, it’ll be able to take over the role of routing between the management VLAN and the main network.
I didn’t want to do this in a VM because that means exposing the hosts and the VMs to the management VLAN, meaning anyone who managed to compromise a host would have direct access to the BMCs on the other nodes.
This is not a network with high bandwidth demands, and so the TS-7670 with its 100Mbps Ethernet (built into the SoC; not via USB) is an ideal machine for this task.
Having done this, all that’s left to do is to create a 2GB dual-core VM which will receive the contents of the old server, then that server can be shut down, after 8 years of good service.  I’ll keep it around for storing the on-site backups, but now I can keep it asleep and just wake it up with Wake-on-LAN when I want to make a back-up.
This should make a dint in our electricity bill!
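For the wake-up side of that plan, a Wake-on-LAN "magic packet" is simple enough to build by hand. What follows is only a minimal sketch in Python: the function names are made up for illustration, the MAC address in the usage note is a placeholder, and UDP port 9 is just the common convention rather than anything specific to the old server.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Calling `wake("00:11:22:33:44:55")` (with the real MAC substituted) from a host on the same broadcast domain would send the packet; the target's NIC and firmware also need Wake-on-LAN enabled for anything to happen.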
Other changes…

  • Looks like we’ll be upgrading the solar with the addition of another 120W panel.
  • I will be hooking my other network switches, the ADSL router and ADSL modem up to the battery bank on the cluster, just got to get some suitable cable for doing so.
  • I have no faith in this third PSU, so already, I have a MeanWell HEP-600C coming.  We’ll wire up a suicide lead to it, and that can replace the Powertech MP-3089 + Redarc BCDC1225, as the MeanWell has a remote on/off feature I can use to control it.
Mar 15 2018
 

Perhaps literally… it has bitten the dust.  Although I wouldn’t call its installed location dusty.  Once again, the fan in the mains power supply has carked it.

Long-term followers of this project may remember that the last PSU failed the same way.

The reason has me miffed.  All I did with the replacement was take the PSU out of its box, loosen the two nuts for the terminals, slip the ring lugs for my power lead over the terminals, return the nuts, plug it in and turn it on.

While it is running 24×7, there is nothing in the documentation to say this PSU can’t run that way.  This is what the installation looks like.

If it were dusty, I’d expect to be seeing hardware failures in my nodes.

This PSU is barely 4 months old, and earlier this week, the fan started making noises and requiring percussive maintenance to get started. Tonight, it failed completely: no taps on the case will convince it to go.

Now, I need to keep things running until the weekend. I need it to run without burning the house down.

Many moons ago, my father bought a 12V fan for the caravan. Cheap and nasty. It has a slider switch to select between two speeds; “fast” and “slow”, which would be better named “scream like a banshee” and “scream slightly less like a banshee”. The speed reduction is achieved by passing current through a 10W resistor, and achieves maybe a 2% reduction in motor RPM. As you can gather, it proved to be a rather unwelcome room mate, and has seen its last day in the caravan.

This fan, given it runs off 12V, has proven quite handy with the cluster. I’ve got my SB-50 “load” socket hanging out the front of the cluster. A little adaptor to bring that out to a cigarette lighter socket, and I can run it off the cluster batteries. When a build job has gotten a node hot and bothered, sitting this down the bottom of the cluster and aiming it at a node has cooled things down well.

Tonight, it has another task … to try and suck the hot air out of the PSU.

That’s the offending power supply.  A Powertech MP-3089.  It powers the Redarc BCDC1225 right above it.  And you can see my kludge around the cooling problem.  Not great, but it should hold for the next 24 hours.

Tomorrow, I think we’ll call past Aspley and pick up another replacement.  I’m leery of another now, but I literally have no choice … I need it now.  Sadly, >250W 12V switchmode PSUs are somewhat rare beasts here in Brisbane.  Altronics don’t sell them that big.  The grinning glasses are no more, and I’m not risking it with the Xantrex charger again.

Long term, I’m already looking at the MeanWell SP-480-12.  This is a PSU module, and will need its own case and mains wiring… but I have no faith in the MP-3089 to not fail and cremate my home of 34 years.

The nice feature of the SP-480-12 is that it does have a remote +12V power-off feature.  Presumably I can drive this with a comparator/output MOSFET, so that when the battery voltage drops below some critical threshold, the PSU kicks in, and when it rises above a high set-point, the PSU drops out.  Simple control, with no MCU involved.  I don’t see a reason to get more fancy than that on the control side; anything more is a liability.
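The two-threshold behaviour described above can be modelled in a few lines. This is only a sketch of the logic in Python, not a circuit design: the 12.0V and 13.5V set-points are placeholder values chosen for illustration, the class name is hypothetical, and the real output would need mapping to whatever polarity the PSU's remote pin expects.

```python
class ChargerControl:
    """Two-threshold (hysteresis) control, mimicking a comparator with positive feedback."""

    def __init__(self, low_v: float = 12.0, high_v: float = 13.5):
        if low_v >= high_v:
            raise ValueError("low threshold must be below high threshold")
        self.low_v = low_v    # assumed set-point: switch the PSU on below this
        self.high_v = high_v  # assumed set-point: switch the PSU off above this
        self.psu_on = False

    def update(self, battery_v: float) -> bool:
        """Feed in a battery voltage reading; return whether the PSU should be on."""
        if battery_v < self.low_v:
            self.psu_on = True
        elif battery_v > self.high_v:
            self.psu_on = False
        # Between the two thresholds the previous state is held.
        return self.psu_on
```

Between the two set-points the state simply holds, which is what stops the supply from cycling on and off as the battery voltage recovers under charge.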

In other news, my gcc build on the TS-7670 failed … so much for the wait.  We’ll try another version and see how we go.

Dec 25 2017
 

So, I’m home now for the Christmas break… and the fan in my power supply decided it would take a Christmas break itself.

The power supply was purchased brand new in June… it still works as a power supply, but with the fan seized up, it represents an overheating risk.  Unfortunately, the only real options I have are the Xantrex charger, which cooked my last batteries, or a 12V 20A linear PSU I normally use for my radio station.  20A is just a touch light-on, given the DC-DC converter draws 25A.  It’ll be fine to provide a top-up, but I wouldn’t want to use it for charging up flat batteries.

Now, I can replace the faulty fan.  However, that PSU is under warranty still, so I figure, back it goes!

In the meantime, an experiment.  What happens if I just turn the mains off and rely on the batteries?  Well, so far, so good.  Saturday afternoon, the batteries were fully charged, I unplugged the mains supply.  Battery voltage around 13.8V.

Sunday morning, battery was down to 12.1V, with about 1A coming in off the panels around 7AM (so 6A being drained from batteries by the cluster).

By 10AM, the solar panels were in full swing, and a good 15A was being pumped in, with the cluster drawing no more than 8A.  The batteries finished the day around 13.1V.

This morning, batteries were slightly lower at 11.9V.   Just checking now, I’m seeing over 16A flowing in from the panels, and the battery is at 13.2V.

I’m in the process of building some power meters based on NXP LPC810s and TI INA219Bs.  I’m in two minds about what to use to poll them: whether I use a Raspberry Pi I have spare and buy a case, PSU and some sort of serial interface for it… or whether I purchase a small industrial PC for the job.

The Technologic Systems TS-7670 is one that I am considering, given it’ll work over a wide range of voltages and temperatures, has plenty of UARTs including RS-485 and RS-232, and while it ships with an old Linux kernel, yours truly has ported both U-Boot and the mainline Linux kernel.  Yes, it’s ARMv5, but it doesn’t need to be a speed demon to capture lots of data, and they work just fine for Barangaroo where they poll Modbus (via pymodbus) and M-bus (via python-mbus).

Nov 19 2017
 

So, this weekend I did plan to run from solar full time to see how it’d go.

Mother nature did not co-operate.  I think there was about 2 hours of sunlight!  This is what the 24 hour rain map looks like from the local weather radar (image credit: Bureau of Meteorology):

In the end, I opted to crimp SB50 connectors onto the old Redarc BCDC1225 and hook it up between the battery harness and the 40A power supply. It’s happily keeping the batteries sitting at about 13.2V, which is fine. The cluster ran for months off this very same power supply without issue: it’s when I introduced the solar panels that the problems started. With a separate controller doing the solar that has over-discharge protection to boot, we should be fine.

I also have mostly built up some monitoring boards based on the TI INA219Bs hooked up to NXP LPC810s. I have not powered these up yet; the plan is to try them out with a 1 ohm resistor as the stand-in for the shunt and a 3V rail… develop the firmware for reporting voltage/current… then try 9V and check nothing smokes.

If all is well, then I’ll package them up and move them to the cluster. Not sure of protocols just yet. Modbus/RTU is tempting: it’s a protocol I’m familiar with from work, and it would work well for this application, given I just need to represent voltage and current… both of which can be scaled to fit 16-bit registers easily (voltage in mV, current in mA would be fine).
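To make the scaling concrete, here is a small Python sketch of how a reading might be packed into two 16-bit registers. The register layout (voltage first, unsigned; current second, signed so that charge and discharge can be told apart) and the function names are assumptions for illustration only; the big-endian byte order is the one Modbus uses for register contents.

```python
import struct


def to_registers(voltage_v: float, current_a: float) -> bytes:
    """Pack voltage (mV, unsigned) and current (mA, signed) as two big-endian 16-bit registers."""
    millivolts = round(voltage_v * 1000)
    milliamps = round(current_a * 1000)
    if not 0 <= millivolts <= 0xFFFF:
        raise ValueError("voltage out of range for an unsigned 16-bit register")
    if not -0x8000 <= milliamps <= 0x7FFF:
        raise ValueError("current out of range for a signed 16-bit register")
    return struct.pack(">Hh", millivolts, milliamps)


def from_registers(raw: bytes) -> tuple[float, float]:
    """Reverse of to_registers: returns (volts, amps)."""
    millivolts, milliamps = struct.unpack(">Hh", raw)
    return millivolts / 1000, milliamps / 1000
```

With this layout the voltage register tops out at 65.535V and the current register at ±32.767A, so mV and mA resolution only fits while the measured current stays under about 32A in either direction.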

I just need some connectors to interface the boards to the outside world and testing will begin. I’ve ordered these and they’ll probably turn up some time this week.

Nov 13 2017
 

So, at present I’ve been using a two-charger solution to keep the batteries at full voltage.  On the solar side is the Powertech MP3735, which also does over-discharge protection.  On the mains side, I’m using a Xantrex TC2012.

One thing I’ve observed is that the TC2012, despite being configured for AGM batteries, despite the handbook saying it charges AGM batteries to a maximum 14.3V, has a happy knack of applying quite high charging voltages to the batteries.

I’ve verified this… every meter I’ve put across it has reported, at one time or another, more than 15V across the terminals of the charger.  I’m using SB50 connectors rated at 50A and short runs of 6G cable to the batteries.  So a nice low-resistance path.

The literature I’ve read says 14.8V is the maximum.  I think something has gone out of calibration!

This, and the fact that the previous set-up over-discharged the batteries twice, are the factors that led to the early failure of both batteries.

The two new batteries (Century C12-105DA) are now sitting in the battery cases replacing the two Giant Energy batteries, which will probably find themselves on a trip to the Upper Kedron recycling facility in the near future.

The Century batteries were chosen as I needed the replacements right now and couldn’t wait for shipping.  This just happened to be what SuperCheap Auto at Keperra sell.

The Giant Energy batteries took a number of weeks to arrive: likely because the seller (who’s about 2 hours’ drive from me) had run out of stock and needed to order them in (from China).  If things weren’t so critical, I might’ve given those batteries another shot, but I really didn’t have the time to order in new ones.

I have disconnected the Xantrex TC2012.  I really am leery about using it, having had one bad experience with it now.  The replacement batteries cost me $1000.  I don’t want to be repeating the exercise.

I have a couple of options:

  1. Ditch the idea of using mains power and just go solar.
  2. Dig out the Redarc BCDC1225 charger I was using before and hook that up to a regulated power supply.
  3. Source a new 20A mains charger to hook in parallel with the batteries.
  4. Hook a dumb fixed-voltage power supply in parallel with the batteries.
  5. Hook a dumb fixed-voltage power supply in parallel with the solar panel.

Option (1) sounds good, but what if there’s a run of cloudy days?  This really is only an option once I get some supervisory monitoring going.  I have the current shunts fitted and the TI INA219Bs for measuring those shunts arrived a week or so back, just haven’t had the time to put that into service.  This will need engineering time.

Option (2) could be done right now… and let’s face it, its problem was switching from solar to mains.  In this application, it’d be permanently wired up in boost mode.  Moreover, it’s theoretically impossible to over-discharge the batteries now as the MP3735 should be looking after that.

Option (3) would need some research as to what would do the job.  More money to spend, and no guarantee that the result will be any better than what I have now.

Option (4) I’m leery about, as there’s every possibility that the power supply could be overloaded by inrush current to the battery.  I could rig up a PWM circuit in concert with the monitoring I’m planning on putting in, but this requires engineering time to figure out.

Option (5) I’m also leery about, not sure how the panels will react to having a DC supply in parallel to them.  The MP3735 allegedly can take an input DC supply as low as 9V and boost that up, so might see a 13.8V switchmode PSU as a solar panel on a really cloudy day.  I’m not sure though.  I can experiment, plug it in and see how it reacts.  Research gives mixed advice, with this Stack Exchange post saying yes and this Reddit thread suggesting no.

I know now that the cluster averages about 7A.  In theory, I should have 30 hours capacity in the batteries I have now, if I get enough sun to keep them charged.
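The 30-hour figure is just capacity divided by load. As a quick Python sketch, assuming the two batteries are 105Ah each and wired in parallel (which is what 30 hours at 7A implies), and with a hypothetical `usable_fraction` parameter to show how the number shrinks if the batteries are not run all the way flat:

```python
def runtime_hours(capacity_ah: float, battery_count: int, load_a: float,
                  usable_fraction: float = 1.0) -> float:
    """Ideal runtime in hours for identical batteries in parallel under a constant load."""
    if load_a <= 0:
        raise ValueError("load current must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return capacity_ah * battery_count * usable_fraction / load_a


full_drain = runtime_hours(105, 2, 7)                       # theoretical figure
half_drain = runtime_hours(105, 2, 7, usable_fraction=0.5)  # assumed 50% limit
```

The 50% case is purely an illustrative assumption, not a figure from the battery datasheet; it simply shows that any limit on depth of discharge scales the 30 hours down in proportion.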

This, I think, will be a weekend experiment, and maybe something I’ll try this weekend.  Right now, the cluster itself is running from my 40A switchmode PSU, and for now, it can stay there.

I’ll let the solar charger top the batteries up from the sun this week.  With no load, the batteries should be nice and full, ready come Friday evening, when I can, gracefully, bring the cluster down and hook it up to the solar charger load output.  If, at the end of the weekend, it’s looking healthy, I might be inclined to leave it that way.

Jun 29 2017
 

So, there’s some work still to be done, for example making some extension leads for the run between the battery link harness, load power distribution and the charger… and to generally tidy things up, but it is now up and running.

On the floor are the 240V-12V power supply and the charger, which right now is hard-wired in boost mode. In the bottom of the rack are the two 105Ah 12V AGM batteries, in boxes with fuses and isolation switches.

The nodes and switching are inside the rack, and resting on top is the load power distribution board, which I’ll have to rewire to make things a little neater. A prospect is to mount some of this on the back.

I had a few introductions to make, introducing the existing pair of SG-200 switches to the newcomer and its VLANs, but now at least, I’m able to SSH into the nodes, access the IPMI BMC and generally configure the whole box and dice.

With the exception of the later upgrade to solar, and the aforementioned wiring harness clean-ups, the hardware side of this dual hardware/software project is largely complete, and this project now transitions to being a software project.

The plan from here:

  • Update the OSes… as all will be a little dated. (I might even blow away and re-load.)
  • Get Ceph storage up and running. It actually should be configured already, just a matter of getting DNS hostnames sorted out so they can find each other.
  • Investigate the block caching landscape: when I first started the project at work, it was a 3-horse race between Facebook’s FlashCache, bcache and dm-cache. Well, FlashCache is no more, replaced by EnhanceIO, and I’m not sure about the rest of the market. So this needs researching.
  • Management interfaces: at my workplace I tried Ganeti, OpenNebula and OpenStack. This again, needs re-visiting. OpenNebula has moved a long way from where it was and I haven’t looked at the others in a while. OpenStack had me running away screaming, but maybe things have improved.
Jun 25 2017
 

Well, it’s been a while since I last updated this project. A lot of that has been due to general lethargy, real life and other pressures.

This equipment is being built amongst other things to host my websites, mail server, and as a learning tool for managing clustered computing resources. As such, yes, I’ll be putting it down as a work expense… and it was pointed out to me that it needed to be in operation before I could start claiming it on tax. So, with 30th June looming up soon, it was time I pulled my finger out and got it going.

At least running on mains. As for the solar bit, well, we will be doing that too; my father recently sent me this email (line breaks for readability):

Subject: Why you're about to pay through the nose for power - ABC News
 (Australian Broadcasting Corporation)
To: Stuart Longland
From: David Longland
http://www.abc.net.au/news/2017-06-19/…
   …why-youre-about-to-pay-through-the-nose-for-power/8629090

Hi Stuart,

This is why I am keen to see your cluster up and running.  Our power 
bill is about $300 every 3 months, a lift in price by 20% represents 
$240pa hike.

Dad

Umm, yeah… good point. Our current little server represents a small portion of our base-load power… refrigeration being the other major component.

I ordered the rack and batteries a few months back, and both have been sitting here, still in the boxes they were shipped in, waiting for me to get to and put them together. My father got fed up of waiting and attacked the rack, putting it together one evening… and last night, we worked together on putting a back on the rack using 12mm plywood.

We also fitted the two switches, mounting the smaller one to the lid of the main switch using multiple layers of double-sided tape.

I wasn’t sure at first where the DIN rail would mount. I had intended to screw it to a piece of 2×4″ or similar, and screw that to the back plane. We couldn’t screw the DIN rail directly to the back plane because the nodes need to be introduced to the DIN rail at an angle, then brought level to attach them.

Considering the above, we initially thought we’d bolt it to the inner run of holes, but two problems presented themselves:

  1. The side panels actually covered over those holes: this was solved with a metal nibbling tool, cutting a slot where the hole is positioned.
  2. The DIN rail, when just mounted at each end, lacked the stability.

I measured the gap between the back panel and the DIN rail location: 45mm. We didn’t have anything of that width which we could use as a mounting. We considered fashioning a bracket out of some metal strip, but bending it right could be a challenge without the right tools (and my metalwork skills were never great).

45mm + 3mm is 48mm… or 4 pieces of the 12mm plywood. We had plenty of off-cut from the back panel.

Using 4 pieces of the plywood glued together and clamped overnight, I made a mounting to which I could mount the DIN rail for the nodes to sit on. This afternoon, I drilled the pilot holes and fitted the screws for mounting that block, and screwed the DIN rail to it.

At the far ends, I made spacers from 3mm aluminium metal strap. The result is not perfect, but is much better than what we had before.

I’ve wired up the network cables… checking the lengths of those in case I needed to get longer cables (they just fit… phew! $20 saved), and there is room down the bottom for the batteries to sit. I’ll make a small 10cm cable to link the management network up to the appropriate port on the main switch, then I just need to run cables to the upstairs and downstairs switches. (In fact, there’s one into the area already.)

On the power front… my earlier experiments had ascertained the suitability of the Xantrex charger that we had spare. The charger is a smart charger, and so does various equalisation and balancing cycles, thus gets mightily confused if you suddenly disconnect the battery from it by way of a MOSFET. A different solution presented itself though.

My father has a solar set-up in the back of his car… there’s a 12V 120W panel on the roof, and that provides power to a battery system which powers an amateur radio station and serves as an auxiliary battery. There’s a diode arrangement that allows charging from the vehicle battery system.

In an effort to try and upgrade it, he bought a Redarc BCDC1225 in-vehicle MPPT charger. This charger can accept power from either the 12V mains supply in a vehicle, or from a “12V” solar panel. The key here, is it relies on a changeover relay to switch between the two, and this is where it wasn’t quite suitable for my father’s needs: it assumed that if the vehicle ignition was on, you wanted to charge from the vehicle, not from solar.

He wanted it to switch to whichever source was more plentiful, and had thought the unit would drive the relay itself. Having read the manual, we now know the signal they tell you to connect to the relay coil is there to tell the charger which source it is plugged into, not for it to drive the relay.

The plan is therefore:

  • use a 240V→12V AC-DC switch-mode power supply to provide the “vehicle mains” DC input to the charger.
  • measure the voltage seen at the solar input with a comparator and switch over when it is above some pre-defined voltage (use hysteresis to ensure it doesn’t oscillate)
  • use the output to drive a P-channel MOSFET attached to the “vehicle mains”, which drives the relay.
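The reason for the hysteresis in that plan is easy to demonstrate numerically. The Python sketch below counts changeovers for a made-up sequence of panel voltages hovering around a threshold; the sample values and the 15.0V, 15.5V and 14.0V thresholds are invented purely for illustration and are not measurements or design values.

```python
def count_switches(samples, on_above: float, off_below: float) -> int:
    """Count source changeovers for a comparator with the given pair of thresholds."""
    on_solar = False
    switches = 0
    for volts in samples:
        if not on_solar and volts > on_above:
            on_solar = True
            switches += 1
        elif on_solar and volts < off_below:
            on_solar = False
            switches += 1
    return switches


# Hypothetical solar input voltage wandering around 15 V before the sun comes out properly.
samples = [14.8, 15.1, 14.9, 15.2, 14.9, 15.3, 16.0, 16.5]

no_hysteresis = count_switches(samples, on_above=15.0, off_below=15.0)
with_hysteresis = count_switches(samples, on_above=15.5, off_below=14.0)
```

With a single threshold the relay would chatter on every wobble of the input; separating the switch-over and switch-back points leaves a single clean changeover for the same data.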