Mar 112017
 

So, as promised, the re-design of the charge controller. … now under the the influence of a few glasses of wine, so this should be interesting…

As I mentioned in my last post, it was clear that the old logic just wasn’t playing nice with this controller, and that using this controller to maintain the voltage to the nodes below 13.6V was unrealistic.

The absolute limits I have to work with are 16V and 11.8V.

The 11.8V comes from the combination of regulator voltage drop and ATX PSU power range limits: they don’t operate below about 10.8V, if you add 700mV, you get 11.5V … you want to allow yourself some head room. Plus, cycling the battery that deep does it no good.

As for the 16V limit… this is again a limitation of the LDOs, they don’t operate above 16V. In any case, at 16V, the poor LDOs are dropping over 3V, and if the node is running flat chat, that equates to 15W of power dissipation in the LDO. Again, we want some headroom here.

The Xantrex charger likes pumping ~15.4V in at flat chat, so let’s go 15.7V as our peak.

Those are our “extreme” ranges.

At the lower end, we can’t disconnect the nodes, but something should be visible from the system firmware on the cluster nodes themselves, and we can thus do some proactive load shedding, hibernating virtual instances and preparing nodes for a blackout.

Maybe I can add a small 10Mbps Ethernet module to an AVR that can wake the nodes using WOL packets or IPMI requests. Perhaps we shut down two nodes, since the Ceph cluster will need 2/3 up, and we need at least one compute node.

At the high end, the controller has the ability to disconnect the charger.

So that’s worked out. Now, we really don’t want the battery getting that critically low. Thus the time to bring the charger in will be some voltage above the 11.8V minimum. Maybe about 12V… perhaps a little higher.

We want it at a point that when there’s a high load, there’s time to react before we hit the critical limit.

The charger needs to choose a charging source, switch that on, then wait … after a period check the voltage and see if the situation has improved. If there’s no improvement, then we switch sources and wait a bit longer. Wash, rinse, repeat. When the battery ceases to increase in voltage, we need to see if it’s still in need of a charge, or whether we just call it a day and run off the battery for a bit.

If the battery is around 14.5~15.5V, then that’s probably good enough and we should stop. The charger might decide this for us, and so we should just watch for that: if the battery stops charging, and it is at this higher level, just switch to discharge mode and watch for the battery hitting the low threshold.

Thus we can define four thresholds, subject to experimental adjustment:

Symbol Description Threshold
V_{CH} Critical high voltage 15.7V
V_H High voltage 15.5V
V_L Low voltage 12.0V
V_{CL} Critical low voltage 11.8V

Now, our next problem is the waiting… how long do we wait for the battery to change state? If things are in the critical bands, then we probably want to monitor things very closely, outside of this, we can be more relaxed.

For now, I’ll define two time-out settings… which we’ll use depending on circumstances:

Symbol Description Period
t_{LF} Low-frequency polling period 15 sec
t_{HF} High-frequency polling period 5 sec

In order to track the state, I need to define some variables… we shall describe the charger’s state in terms of the following variables:

Symbol Description Initial value
V_{BL} Last-known battery voltage, set at particular points. 0V
V_{BN} The current battery voltage, as read by the ADC using an interrupt service routine. 0V
t_d Timer delay… a timer used to count down until the next event. t_{HF}
S Charging source, an enumeration:

  • 0: No source selected
  • 1: Main charging source (e.g. solar)
  • 2: Back-up charging source (e.g. mains power)
0

The variable names in the actual code will be a little more verbose and I’ll probably use #defines for the enumeration.

Below is the part-state-machine part-flow-chart diagram that I came up with. It took a few iterations to try and describe this accurately, I was going to use a state machine syntax similar to what my workplace uses, but in the end, found the ye olde flow chart shows it best.

In this diagram, a filled in dot represents the entry point, a dot with an X represents an exit point, and a dot in a circle represents a point where the state machine re-enters the state and waits for the main loop to iterate once more.

You’ll note that for this controller, we only care about one voltage, the battery voltage. That said, the controller will still have temperature monitoring duties, so we still need some logic to switch the ADC channel, throw away dummy samples (as per the datasheet) and manage sample storage. The hardware design does not need to change.

We can use quiescent voltages to detect the presence of a charging source, but we do not need to, as we can just watch the battery voltage rise, or not, to decide whether we need to take further action.

Mar 112017
 

So, having knocked the regulation on the LDOs down a few pegs… I am closer to the point where I can leave the whole rig running unattended.

One thing I observed prior to the adjustment of the LDOs was that the controller would switch in the mains charger, see the voltage shoot up quickly to about 15.5V, before going HOLYCRAP and turning the charger off.

I had set a set point at about 13.6V based on two facts:

  • The IPMI BMCs complained when the voltage raised above this point
  • The battery is nominally 13.8V

As mentioned, I’m used to my very simple, slow charger, that trickle charges at constant voltage with maximum current output of 3A. The Xantrex charger I’m using is quite a bit more grunty than that. So re-visiting the LDOs was necessary, and there, I have some good results, albeit with a trade-off in efficiency.

Ahh well, can’t have it all.

I can run without the little controller, as right now, I have no panels. Well, I’ve got one, a 40W one, which puts out 3A on a good day. A good match for my homebrew charger to charge batteries in the field, but not a good match for a cluster that idles at 5A. I could just plug the charger directly into the battery and be done with it for now, defer this until I get the panels.

But I won’t.

I’ve been doing some thought about two things, the controller and the rack. On the rack, I found I can get a cheap one for $200. That is cheap enough to be considered disposable, and while sure it’s meant for DJ equipment, two thoughts come to mind:

  • AV equipment with all its audio transformers and linear power supplies, is generally not light
  • It’s on wheels, meant to be moved around… think roadies and such… not a use case that is gentle on equipment

Thus I figure it’ll be rugged enough to handle what I want, and is open enough to allow good airflow. I should be able to put up to 3 AGM batteries in the bottom, the 3-channel charger bolted to the side, with one charge controller per battery. There are some cheap 30A schottky diodes, which would let me parallel the batteries together to form one redundant power supply.

Downside being that would drop about 20-25W through the diode. Alternatively, I make another controller that just chooses the highest voltage source, with a beefy capacitor bank to handle the switch-over. Or I parallel the batteries together, something I am not keen to do.

I spent some time going back to the drawing board on the controller. The good news, the existing hardware will do the job, so no new board needed. I might even be able to simplify logic, since it’s the battery voltage that matters, not the source voltages. But, right now I need to run. So perhaps tomorrow I’ll go through the changes. 😉

Jan 292017
 

Well, I’ve finally dragged this project out and plugged everything in to test the new power modules out.

I’ll be hooking up the laptop and getting the nodes to do something strenuous in a moment, but for now, I have them just idling on the battery, with the battery charger being switched by the charge controller, built around an ATTiny24A and this time, a separate power module rather than it being integrated.

I’ve had it going for a few hours now… and so far, so good. The PSU is getting turned on and off more often than I’d like, but at least the smoke isn’t escaping now. The heatsink for the power modules is warm, but still not at the “burn your fingers off” stage.

That to me suggests the largish heatsink was the right one to use.

Two things I need to probably address though:

  • In spite of the LDOs, the acceptable voltage range of the computers is still rather narrow… I’m not sure if it’s just the IPMI BMC being fussy or if the LDOs need to be knocked down a peg to keep the voltage within limits. Perhaps I should use the same resistor values as I did for the Ethernet switch.
  • The thresholds seem to get reached very quickly which means the timeouts still need lengthening. Addressing the LDO settings should help with this, as it’ll mean I can bump my thresholds higher.

If I can nail those last two issues, then I might be at risk of having the hardware aspect of this project done and having a workable cluster to do the software side of the project. Shock horror!

Apr 162016
 

I figured, rather than letting these loose directly on the nodes themselves, I’d give them a try with a throw-away dummy load. For this, I grabbed an old Philips car cassette player that’s probably older than I am and hooked that up. I shoved some old cassette in.

The datasheet for the regulators defines the output voltage as: V_{OUT}=1.240 \big({R_1 \over R_2} + 1\big)

Playing with some numbers, I came out with R1 being a 2.7kΩ and 560Ω resistors in series, and R2 being a 330Ω. So I scratched around for those resistors, grabbed one of the MIC29172s and hooked it all up for a test.

The battery here is not the one I’ll use in the cluster ultimately, I have a 100Ah AGM cell battery for that. The charger seen there is what will be used, initially as a sole power source, then in combination with solar when I get the panels. It’s capable of producing 20A of current, and runs off mains power.

This is the power drain from the battery, with the charger turned on.

Not as much as I thought it’d be, but still a moderate amount.

This is what the output side of the regulator looked like:

So from 14.8V down to 13.1V. It also showed 13.1V when I had the charger unplugged, so it’s doing its job pretty well I think. That’s a drop of 1.7V, so dissipating about 600mW. Efficiency is therefore about 93%, not bad for linear regulators.

Apr 092016
 

So, I’ve been doing a bit of research about how I can stabilise the battery voltage which will drift between around 11V and 14.6V. It’s a deep-cycle type battery, so it’s actually capable of going down to 10V, but I really don’t want to push it that far.

Once I get below 12V, that’s the time to signal to the VM hosts to start hibernating the VMs and preparing for a blackout, until such time as the voltage picks back up again.

The rise above 13.5V is a challenge due to the PicoPSU limitations. @Vlad Conut rightly pointed out that the M3-ATX-HV PSUs sold by the same company would have been a better choice. For about $20 more, or an extra $100 for the cluster, I’d have something that’d work 6-30V. I’d still have to solve the problem with the switch, but it’d just be that one device, not 6.

Maybe it was because they were out of stock that I went the PicoPSU route, I also wasn’t sure about power demands, I knew the CPU needed 20W, but wasn’t sure about everything else. So I over-dimensioned everything. Hindsight is 20:20.

One option I considered was a regulator in front of each node. I had mentioned the LM7812 because I knew of it. I also knew it was a crap choice for this task, the 1.5V drop, with a 5A load would result in about 7.5W dissipated thermally. So 20W would jump to nearly 28W — not good.

That of course assumes a 7812 would handle 5A, which it won’t.

LDOs were the way to go for anything linear, otherwise I was looking at a switchmode power supply. The LM2576 has similar requirements to the LM7812, but is much more efficient being a buck converter. If 1.5V was fine, I’d be looking for a 5A-capable equivalent.

The other option would be to have one single power supply regulate power for all nodes. I mentioned in my previous log about the Redarc DC-DC power supply, and that is certainly still worthy of consideration. It is a single point of failure however, but then again, Redarc aren’t known for making crap, and the unit has a 2 year warranty anyway. I’d have downtime, but would not lose data if this went down.

@K.C. Lee pointed me to one LDO that would be a good candidate though, and is cheap enough to be worth the experiment: the Micrel MIC29750. 7.5A current, and comes in an adjustable form up to 16V. I’d imagine if I set this near 13.5V, it’d dissipate maybe 2.5W max at 5A, or 1W at 2A. Much better.

Not as good as Redarc’s solution of course, and that’s still an option, but cheap enough to try out.