Apr 30 2016
 

Well, not sure what went wrong, but the controller I built on Monday evening, dead-bug style, is one big fail.

There’s no output from the LM311s: even after adding pull-ups, they still don’t seem to respond to the battery voltage falling below the threshold. Add to that a faulty IRF540N MOSFET (drain-source resistance of ~40Ω), and you’ve got all the makings of things going wrong.

So, time for a U-turn. After deciding against a microcontroller-based solution before, on the grounds that I had the parts on hand to do an analogue comparator solution, I’ve decided I’ll do it with an ATTiny24A after all. I can get these for about $12 for a pack of 5 from a local supplier.

I have also placed an order for two 5V switchmode PSU modules and four P-channel MOSFETs: we’ll drop the relay as well and make it all solid-state.

The MCU doesn’t have to do much: just take an ADC reading of the battery voltage every 100msec, compare it to a threshold, then either turn the power on or off.
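As a rough illustration of that read-and-compare step, here is a small desktop model in Python (not the AVR C that would run on the chip). The 10-bit ADC range, the 1.1V reference, the divider ratio and the 11.5V cut-off are all assumptions picked for the sketch, and the function names are made up.

```python
# Desktop model of the "read ADC, compare, switch" step.
# Every constant below is an assumption for illustration only.
ADC_MAX = 1023          # full-scale count of a 10-bit ADC
VREF = 1.1              # assumed ADC reference voltage, in volts
DIVIDER_RATIO = 16.0    # assumed divider: battery volts per volt at the ADC pin
THRESHOLD_V = 11.5      # assumed cut-off voltage, in volts


def adc_to_battery_volts(raw):
    """Convert a raw ADC count into the battery voltage it represents."""
    return raw * VREF / ADC_MAX * DIVIDER_RATIO


def power_should_be_on(raw):
    """True when the sampled battery voltage is at or above the threshold."""
    return adc_to_battery_volts(raw) >= THRESHOLD_V


if __name__ == "__main__":
    # A few sample counts either side of the threshold.
    for raw in (600, 668, 669, 800):
        print(raw, round(adc_to_battery_volts(raw), 2), power_should_be_on(raw))
```

On the real part, the same comparison could be done directly on raw counts, which would avoid floating point altogether.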

The MCU has up to 6 ADC channels, embeds a small temperature sensor, has one PWM channel and a number of GPIOs. Reserving the reset and SPI lines for ISP work, that gives us 3 digital outputs and one PWM for controlling things and 4 ADC channels.

I can use the PWM channel to drive a MOSFET for the fans, one of the outputs to drive NPN transistors for controlling the ACPI power buttons on the nodes, and two MOSFETs for the mains and solar inputs. 3 ADCs can monitor the battery, mains and solar inputs, so decisions can be made on whether to switch between solar/mains or to turn off all inputs and let the battery drain for a bit.

The internal temperature sensor can be used for fan control; it mainly needs to tell the difference between hot and cold. If things are >25°C, then we should run the fans: the hotter it is, the faster they should run. The internal 8MHz oscillator will be “good enough”, I think.
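The "hotter means faster" rule could be sketched as a simple linear ramp. This is a Python model rather than firmware. Only the 25°C starting point comes from the text above; the full-speed temperature, the minimum duty and the 8-bit PWM range are assumptions.

```python
FAN_START_C = 25.0   # from the text: run the fans above 25 degrees C
FAN_FULL_C = 45.0    # assumed temperature at which the fans reach full speed
MIN_DUTY = 64        # assumed lowest 8-bit duty that still spins the fan
MAX_DUTY = 255       # full-scale 8-bit PWM duty


def fan_duty(temp_c):
    """Map a temperature in degrees C to an 8-bit PWM duty value."""
    if temp_c <= FAN_START_C:
        return 0                      # cool enough: fans off
    if temp_c >= FAN_FULL_C:
        return MAX_DUTY               # hot: fans flat out
    # Linear ramp between the start and full-speed temperatures.
    span = (temp_c - FAN_START_C) / (FAN_FULL_C - FAN_START_C)
    return int(MIN_DUTY + span * (MAX_DUTY - MIN_DUTY))


if __name__ == "__main__":
    for t in (20, 26, 35, 44, 50):
        print(t, fan_duty(t))
```

Starting at a non-zero minimum duty is a design choice: many fans stall at very low duty cycles, so the ramp jumps straight to a value that should keep them turning.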

This isn’t rocket science, and should be achievable via a simple while loop in C.

Apr 27 2016
 

It seems good old “common courtesy” is absent without leave, as is “common sense”. Some would say it’s been absent for most of my lifetime, but to me it seems particularly so of late.

In particular, where it comes to the safety of one’s self, and to others, people don’t seem to actually think or care about what they are doing, and how that might affect others. To say it annoys me is putting it mildly.

In February, I lost a close work colleague in a bicycle accident. I won’t mention his name, as I do not have his family’s permission to do so.

I remember arriving at my workplace early on Friday the 12th before 6AM, having my shower, and about 6:15 wandering upstairs to begin my work day. Reaching my desk, I recall looking down at an open TS-7670 industrial computer and saying out aloud, “It’s just you and me, no distractions, we’re going to get U-Boot working”, before sitting down and beginning my battle with the machine.

So much for the “no distractions” however. At 6:34AM, the office phone rings. I’m the only one there and so I answer. It was a social worker looking for “next of kin” details for a colleague of mine. Seems they found our office details via a Cab Charge card they happened to find in his wallet.

Well, first thing I do is start scrabbling for the office directory to get his home number so I can pass the bad news on to his wife, only to find he’s only listed his mobile number. Great. After getting in contact with our HR person, we later discover there aren’t any contact details in the employee records either. He was around before such paperwork existed in our company.

Common sense would have dictated that one carry an “in case of emergency” number on a card in one’s wallet! At the very least let your boss know!

We find out later that morning that the crash happened on a particularly sharp bend of the Go Between Bridge, where the offramp sweeps left to join the Bicentennial bikeway. It’s a rather sharp bend that narrows suddenly, with handlebar-height handrails running along its length and “Bicycle Only” signs clearly signposted at each end.

Common sense and common courtesy would suggest you slow down on that bridge as a cyclist. Common sense and common courtesy would suggest you use the other side as a pedestrian. Common sense would question the utility of hand rails on a cycle path.

In the meantime our colleague is still fighting for his life, and we’re all holding out hope for him as he’s one of our key members. As for me, I had a network to migrate that weekend. Two of us worked the Saturday and Sunday.

Sunday evening, emotions hit me like a freight train as I realised I was in denial, and realised the true horror of the situation.

We later find out on the Tuesday, our colleague is in a very bad way with worst-case scenario brain damage as a result of the crash. From shining light to vegetable, he’d never work for us again.

Wednesday I took a walk down to the crash site to try and understand what happened. I took a number of photographs, and managed to speak to a gentleman who saw our colleague being scraped off the pavement. Even today, some months later, the marks on the railings (possibly from handlebar grips) and a large blood smear on the path itself, can still be seen.

It was apparent that our colleague had hit this railing at some significant speed. He wasn’t obese, but he certainly wasn’t small, and a fully grown adult does not ricochet off a metal railing and slide face-first for over a metre without some serious kinetic energy involved.

Common sense seems to suggest the average cyclist goes much faster than the 20km/hr collision the typical bicycle helmet is designed for under AS/NZS 2063:2008.

I took the Thursday and Friday off as time-in-lieu for the previous weekend, as I was an emotional wreck. The following Tuesday I resumed cycling to work, and that morning I tried an experiment to reproduce the crash conditions. The bicycle I ride wasn’t that much different to his, both bikes having 29″ wheels.

From what I could gather that morning, it seemed he veered right just prior to the bend then lost control, listing to the right at what I estimated to be about a 30° angle. What caused that? We don’t know. It’s consistent with him dodging someone or something on the path — but this is pure speculation on my part.

Mechanical failure? The police apparently have ruled that out. There’s not much in the way of CCTV cameras in the area, plenty on the pedestrian side, not so much on the cycle side of the bridge.

Common sense would suggest relying on a cyclist to remember what happened to them in a crash is not a good plan.

In any case, common sense did not win out that day. Our colleague passed away from his injuries a little over a fortnight after his crash, aged 46. He is sadly missed.

I’ve since made a point of taking my breakfast down to that point where the bridge joins the cycleway. It’s the point where my colleague had his last conscious thoughts.

Over the course of the last few months, I’ve noticed a number of things.

Most cyclists sensibly slow down on that bend, but a few race past at ludicrous speed. One morning, I nearly thought there’d be an encore performance as two construction workers on City Cycle bikes, sans helmets, came careening around the corner, one almost losing it.

Then I see the pedestrians. There’s a well lit, covered walkway, on the opposite side of the bridge for pedestrian use. It has bench seats, drinking fountains, good lighting, everything you’d want as a pedestrian. Yet, some feel it is not worth the personal exertion to take the 100m extra distance to make use of it.

Instead, they show a lack of courtesy by using the bicycle path. Walking on a bicycle path isn’t just dangerous to the pedestrian like stepping out onto a road, it’s dangerous for the cyclist too!

If a car hits a pedestrian or cyclist, the damage to the occupants of the car is going to be minimal to nonexistent compared to what happens to the cyclist or pedestrian. If a cyclist or motorcyclist hits a pedestrian, however, the rider surrounds the frame rather than being surrounded by it, and thus hits the ground first, possibly at significant speed.

Yet, pedestrians think it is acceptable to play Russian roulette with their own lives and the lives of every cycle user by continuing to walk where it is not safe for them to go. They’d never do it on a motorway, but somehow a bicycle path is considered fair game.

Most pedestrians are understanding, I’ve politely asked a number to not walk on the bikeway, and most oblige after I point out how they get to the pedestrian walkway.

Common sense would suggest some signage on where the pedestrian can walk would be prudent.

However, I have had at least two that ignored me, one this morning telling me to “mind my own shit”. Yes mate, I am minding “my own shit” as you put it: I’m trying to stop the hypothetical me from possibly crashing into the hypothetical you!

It’s this sort of reaction that seems symbolic of the whole “lack of common courtesy” that abounds these days.

It’s the same attitude that seems to hint to people that it’s okay to park a car so that it blocks the footpath: newsflash, it’s not! I know of one friend of mine who frequently runs into this problem. He’s in a wheelchair — a vehicle not known for its off-road capabilities or ability to squeeze past the narrow gap left by a car.

It seems the drivers think it’s acceptable to force footpath users of all types, including the elderly, the young and the disabled, to “step out” onto the road to avoid the car that they so arrogantly parked there. It makes me wonder how many people subsequently become disabled as a result of a collision caused by them having to step around such obstacles. Would the owner of the parked car be liable?

I don’t know, I’m no lawyer, but I should think they should carry some responsibility!

In Queensland, pedestrians have right-of-way on the footpath. That includes cyclists: cyclists of all ages are allowed there subject to council laws and signage — but once again, they need to give way. In other words, don’t charge down the path like a lunatic, and don’t block it!

No doubt, the people who I’m trying to convince are too arrogant to care about the above, and what effect their actions might have on others. Still, I needed to get the above off my chest!

Nothing will bring my colleague back, a fact that truly pains me, and I’ve learned some valuable lessons about the sort of encouragement I give people. I regret not telling him to slow down, 5 minutes longer wouldn’t have killed him, and I certainly did not want a race! Was he trying to race me so he could keep an eye on me? I’ll never know.

He was a bright person; it is proof, though, that even the intelligent among us are prone to doing stupid things. With thrills come spills, and one might question whether one’s commute to work is the appropriate venue for such thrills, or whether those can wait for another time.

I for one have learned that it does not pay to be the hare, thus I intend to just enjoy the ride for what it is. No need to rush, common sense tells me it just isn’t worth it!

Apr 25 2016
 

Well, having gotten the output of the battery sorted out, now it’s time to turn my attention to the input side, namely managing the battery voltage and two possible charge sources.

Now, I have a second-hand Xantrex 20A charger kicking around that I plan to use for when the sun isn’t around and my battery is getting low. When the sun’s out though, I plan to let that charge the battery. I could do this with a small MCU, and I did briefly consider whether to use an ATTiny24A or one of my spare ATMega8Ls.

I have a beefy 30A relay that can connect and disconnect the charger as needed, it’s a matter of having a controller that decides when it’s needed. I’m not looking for PWM control, the charger will do that itself.

There are two thresholds I want to consider:

  • The low threshold: about 11.5V or so.
  • The high threshold: about 14V.

We want to not let the battery get much below 11.5V as the regulators on the compute nodes will drop up to 700mV and the IPMI BMCs will start to get grumpy. Likewise, they complain when they see more than 13.5V. The regulators should look after it, but let’s not stress them too hard.

I could use a single comparator with hysteresis to do the above, by selecting a reference voltage mid-way between 11.5V and 14V, and choosing resistors to set the threshold gap. I’ve decided to just use two comparators, so I can use a LM393, or I have a LM339 kicking around. I also dug around in the junk box and found a stack of MM74C76s and some MM74C221Ns.

Some tinkering with a breadboard, and I came up with this:

Now, the beauty of this set-up is that I’m using half of each IC, so I effectively have two independent controllers on the one board. Thresholds can be tweaked on each one so that one charger starts sooner than the other: maybe I kick the solar in when the battery drops below 12V and let it go to 14V, while the mains charger kicks in when we get to 11.5V and stops when we reach 13V.
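One way to reason about a two-comparator controller is as a set/reset latch: the low comparator switches the charger on, the high comparator switches it off, and in between the state is held. The circuit itself isn’t reproduced here, so the Python sketch below is only a behavioural model under that assumption, and the class name is made up. The threshold pairs are the ones floated above.

```python
class ChargerLatch:
    """Behavioural model of a two-threshold (set/reset) charger controller."""

    def __init__(self, low_v, high_v):
        if low_v >= high_v:
            raise ValueError("low threshold must be below high threshold")
        self.low_v = low_v
        self.high_v = high_v
        self.charging = False   # assumed power-on state: charger off

    def update(self, battery_v):
        """Feed in one battery voltage sample; return the charger state."""
        if battery_v <= self.low_v:
            self.charging = True      # "set": battery has run down
        elif battery_v >= self.high_v:
            self.charging = False     # "reset": battery is topped up
        # Between the thresholds the previous state is held.
        return self.charging


if __name__ == "__main__":
    solar = ChargerLatch(12.0, 14.0)
    mains = ChargerLatch(11.5, 13.0)
    for v in (12.5, 11.9, 11.4, 12.5, 13.2, 14.1):
        print(v, solar.update(v), mains.update(v))
```

Because each latch holds its state between its thresholds, the two chargers can overlap for part of the cycle, which is the behaviour described above when solar starts sooner and runs longer than mains.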

I haven’t decided on a regulator. Yes, I could use a LM78C05, the low-power version of the LM7805, as the power drain of this is going to be tiny and the headroom enormous for 5V. There are probably better options and I’ll have to shop around, although for a quick prototype, I might just use the LM78C05s since they’re on hand.

Apr 23 2016
 

Well, I finally got busy with the soldering iron again. This time, installing the regulators in the cluster nodes and in the 26-port switch.

I had a puzzle as to where to put the regulator. I didn’t want it exposed, as it’s a static-sensitive device, so better to keep it enclosed. It needed somewhere where the air would be flowing, and looking around, I found the perfect spot, just in behind the CPU heatsink. There’s a small gap where the air will be flowing past to cool the CPU, and it’s sufficiently near the ATX PSU to feed the power cabling past.

I found I was able to tap M3 threads into the tops of the heatsinks and fix them to the “front” of the case near where the DIN rail brackets fit in. So from the outside, it looks all neat and tidy.

After installing those, I turned my attention to the switch. Now I had an educated guess that the switch would be stepping down from 12V, so being close to that was not so critical, however going above it would stretch the friendship.

Rather than feeding it 13.1V like the compute nodes, I decided I’d find some alternate resistor values that’d be closer to 12V. Those wound up being R1=3.3kΩ and R2=390Ω, which gave about 11.8V. Close enough. It was then a matter of polarity. The wiring inside this switch uses a non-standard colour code, and as I suspected, the conductors are just paralleled, it’s the one feed of 12V.

Probing with a multimeter revealed the pin pairs were shorted, and removing the PSU confirmed this. I pulled out the switch mainboard and probed around the electrolytics which had their negative sides marked. Sure enough, it’s the Australian Olympic team colours that give away the 0V side.

I’ve shown the original colour code here as coloured dots, but essentially, green and yellow are the 0V side, and red and black are the +12V side. So I had everything necessary. I grabbed a bit of scrap PCB, used the old PSU as a template for drilling out the holes, used a hacksaw to divide the PCB surface up, then dead-bugged the rest. To position the heatsink, I drilled a 3mm hole in the bottom of the case and screwed a 10mm M3 stand-off there. Yes, this means there’s an annoying lump on the bottom, and I should use a countersunk M3 screw. I’ll fix that later if it bothers me; I’ll be rack-mounting it anyway.

On the input to the regulator, I have a 330uF electrolytic capacitor and a 100nF monolithic capacitor in parallel; on the output, it’s a 470uF and a 100nF. A third 100nF hooks the adjust pin to 0V to reduce noise. I de-soldered the original PSU’s socket and used that on the new board. It fits beautifully. 100-240V? Not any more, Linksys.

So now, the whole lot runs off a single 12V battery supply. The remainder of this project is the charging of that battery and the software configuration of the cluster.

At present, the whole cluster’s doing an `emerge @system`, with distcc running, and drawing about 7.5A with the battery sitting at 12.74V (~95W). Edit: Now that they’ve properly fired up, I’m seeing a drain of 10.3A (126W). Looks like that’s going to be the “worst case scenario”.

Apr 16 2016
 

I figured, rather than letting these loose directly on the nodes themselves, I’d give them a try with a throw-away dummy load. For this, I grabbed an old Philips car cassette player that’s probably older than I am and hooked that up. I shoved some old cassette in.

The datasheet for the regulators defines the output voltage as: $V_{OUT} = 1.240 \left(\frac{R_1}{R_2} + 1\right)$

Playing with some numbers, I came out with R1 being 2.7kΩ and 560Ω resistors in series, and R2 being 330Ω. So I scratched around for those resistors, grabbed one of the MIC29172s and hooked it all up for a test.
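For playing with resistor values, the datasheet formula is easy to put into a couple of Python helpers. The second helper shows how far the ideal figure can move with part tolerance. The 5% tolerance used in the example is an assumption, as the tolerance of the resistors actually used isn’t stated, and the function names are made up.

```python
V_REF = 1.240  # reference voltage from the datasheet formula, in volts


def vout(r1, r2):
    """Ideal output voltage for feedback resistors r1 and r2 (ohms)."""
    return V_REF * (r1 / r2 + 1)


def vout_range(r1, r2, tol):
    """Lowest and highest output if both resistors are off by up to tol."""
    lowest = V_REF * (r1 * (1 - tol) / (r2 * (1 + tol)) + 1)
    highest = V_REF * (r1 * (1 + tol) / (r2 * (1 - tol)) + 1)
    return lowest, highest


if __name__ == "__main__":
    r1 = 2700 + 560   # 2.7k and 560 ohm in series
    r2 = 330
    print("ideal:", round(vout(r1, r2), 2))
    # Assumed 5% parts; swap in 0.01 for 1% resistors.
    low, high = vout_range(r1, r2, 0.05)
    print("5% spread:", round(low, 2), "to", round(high, 2))
```

The ideal figure is only a starting point: with ordinary tolerance resistors the measured output can land some way either side of it, so it is worth checking with a meter before connecting anything fussy.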

The battery here is not the one I’ll use in the cluster ultimately, I have a 100Ah AGM cell battery for that. The charger seen there is what will be used, initially as a sole power source, then in combination with solar when I get the panels. It’s capable of producing 20A of current, and runs off mains power.

This is the power drain from the battery, with the charger turned on.

Not as much as I thought it’d be, but still a moderate amount.

This is what the output side of the regulator looked like:

So from 14.8V down to 13.1V. It also showed 13.1V when I had the charger unplugged, so it’s doing its job pretty well I think. That’s a drop of 1.7V, so dissipating about 600mW. Since a linear regulator passes the same current in as out, efficiency is roughly 13.1V/14.8V, or about 88%, which is not bad for a linear regulator.

Apr 10 2016
 

A couple of people have suggested I have a look at the ICD-10-AM codes, as this is how a lot of stats are actually recorded.

The trouble was getting hold of a copy of ICD-10-AM to look at in order to determine what is of interest. It’s a very heavily guarded secret apparently. The ICD-10-AM codes are derived from the ICD-10 standard codes.

As it turns out, there’s a site that provides the ICD-10-CM codes.

The codes appear to be similar for things like head injury, so perhaps this will be “close enough”? For privacy reasons, I might not get deep enough for the differences to be significant, but this is a starting point and will at least get us most of the way there.

The ones that appear to be of interest seem to be the following:

| Code | Description | Reasoning |
|------|-------------|-----------|
| S12 | Neck fractures (vertebrae) | Too-heavy protective gear may be a factor in whiplash injury, need to ensure we don’t make other problems worse. |
| S14 | Nerve injuries in neck | Again, related to whiplash and other neck-related injuries. |
| S16 | Tendons/muscles in neck | Once again, whiplash and similar conditions. |
| S02 | Skull fractures | This is the area helmets are supposed to protect! |
| S06 | Brain injuries | Again, what helmets are supposed to protect! |

I haven’t gone to the deeper levels for two reasons:

  • There may be privacy issues going too much deeper
  • The stuff I’ll be looking at will be according to the ICD-10-AM system, which may have slightly different designators at the lower levels.

Apr 09 2016
 

One elephant in the room is how I’m going to store the system whilst in operation.

The obvious solution is some sort of metal cabinet with provision for 19″ rack mounting and DIN rail equipment. Question is, how big?

A big consideration here is thermal matters. When going flat out, there will be 100W-150W worth of thermal energy being dissipated in there. So room for convection currents is a must!

Some decent fans on the top to suck the hot air out would also be a good idea. Blowing up so that dust doesn’t get sucked down into the works.

I figured I’d sit everything sort-of in situ. I figured out that the DIN rail mounts don’t have to go on the bottom: with these cases, if you remove the front panel there are four holes for mounting those same DIN rail mounts on the front. So that’s what I’ve done. I’ve now got a DIN rail spare for future expansion.

If I try to pack everything up as densely as possible (not wise), this is what it looks like:

There’s room there for possibly one more node to squeeze in there. I’d think that’d be pushing it however. 5 is probably a good number, meaning we can space the units out a bit to allow them to draw air in via the gaps.

On top of the units I have my two switches. The old Netcomm 24-port switch was retired from our network when a lightning strike to a neighbour’s tree took out an 8-port switch, my Yaesu FT-897D radio transceiver, some ports on a wireless 3G router/switch, and an ADSL router. It also damaged some ports on the big Netcomm switch, so in short, I know it has issues.

Replacing its 3.3V PSU with one that steps down from 12V would cost me the price of a 16-port 10/100Mbps switch brand new.

When we replaced the switch (paid for by insurance) we decided to buy an 8-port and a 16-port switch. The 16-port switch, retired due to an upgrade to gigabit, is sitting on top, and takes 12V 1A input. It’ll be perfect for the IPMI VLAN, where speed is not important. It also accepts the DC plugs I bought by mistake.

The 8-port one takes 7.5V 1A, so a little less convenient for this task, I’d need to make a DC-DC converter for it. Maybe later if this works.

So considering a cabinet for this, we have:

  • 5 nodes measuring 190mm in height: ~5 RU
  • A 24-port switch: 1 RU
  • A 16-port switch: 1 RU
  • Some power distribution electronics: 3RU

Yes, the battery and its charger are external to the cabinet.

Judging from this, the cabinet probably needs to be a 10RU or 12RU cabinet to give us space for mounting everything cleanly and to ensure good ventilation. Using 8-port IPMI switches and 24+2-port comms switches, that leaves us with sufficient port space for the 5 nodes and gives us one port left for a small in-chassis monitoring device and 4 ports left on the main switch for an uplink trunk.
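The rack-unit tally above can be checked with a few lines of Python. One rack unit is 44.45mm (1.75 inches); the item names and the helper function are just labels for this sketch.

```python
import math

RU_MM = 44.45  # height of one rack unit in millimetres (1.75 inches)


def rack_units(height_mm):
    """Whole rack units needed to fit an item of the given height."""
    return math.ceil(height_mm / RU_MM)


# Space budget from the list above; the nodes sit side by side,
# so their 190mm height is counted once.
budget = {
    "nodes (190mm tall)": rack_units(190),
    "24-port switch": 1,
    "16-port switch": 1,
    "power distribution": 3,
}

total = sum(budget.values())

if __name__ == "__main__":
    for name, ru in budget.items():
        print(f"{name}: {ru} RU")
    print("total:", total, "RU")
    for cabinet in (10, 12):
        print(f"{cabinet}RU cabinet leaves {cabinet - total} RU spare")
```

A 10RU cabinet would be filled exactly by this budget, which is why the larger size is worth considering if ventilation gaps are wanted.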

You could conceptually then consider these as homogeneous building blocks for larger networks, using Ceph’s CRUSH maps to ensure copies get distributed amongst these “cabinets”.

Apr 09 2016
 

So, I’ve been doing a bit of research about how I can stabilise the battery voltage which will drift between around 11V and 14.6V. It’s a deep-cycle type battery, so it’s actually capable of going down to 10V, but I really don’t want to push it that far.

Once I get below 12V, that’s the time to signal to the VM hosts to start hibernating the VMs and preparing for a blackout, until such time as the voltage picks back up again.

The rise above 13.5V is a challenge due to the PicoPSU limitations. @Vlad Conut rightly pointed out that the M3-ATX-HV PSUs sold by the same company would have been a better choice. For about $20 more, or an extra $100 for the cluster, I’d have something that’d work 6-30V. I’d still have to solve the problem with the switch, but it’d just be that one device, not 6.

Maybe it was because they were out of stock that I went the PicoPSU route. I also wasn’t sure about power demands: I knew the CPU needed 20W, but wasn’t sure about everything else, so I over-dimensioned everything. Hindsight is 20:20.

One option I considered was a regulator in front of each node. I had mentioned the LM7812 because I knew of it. I also knew it was a crap choice for this task: the 1.5V drop with a 5A load would result in about 7.5W dissipated thermally. So 20W would jump to nearly 28W, which is not good.

That of course assumes a 7812 would handle 5A, which it won’t.
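The arithmetic behind those figures is just voltage drop times load current. A small Python sketch makes it easy to try other drops and currents. The function names are made up for illustration, and quiescent current is ignored.

```python
def dissipation_w(drop_v, load_a):
    """Heat dissipated in a linear regulator: voltage drop times current."""
    return drop_v * load_a


def total_draw_w(load_w, drop_v, load_a):
    """Power drawn from the battery: the load plus the regulator's losses."""
    return load_w + dissipation_w(drop_v, load_a)


if __name__ == "__main__":
    # The 1.5V drop at a 5A load discussed above.
    print("loss at 5A:", dissipation_w(1.5, 5.0), "W")
    print("20W node plus loss:", total_draw_w(20.0, 1.5, 5.0), "W")
    # The same drop at the PSU's full rated 8A input current.
    print("loss at 8A:", dissipation_w(1.5, 8.0), "W")
```

The loss scales linearly with both the drop and the current, so halving the drop (as a low drop-out regulator would) halves the heat for the same load.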

LDOs were the way to go for anything linear, otherwise I was looking at a switchmode power supply. The LM2576 has similar requirements to the LM7812, but is much more efficient being a buck converter. If 1.5V was fine, I’d be looking for a 5A-capable equivalent.

The other option would be to have one single power supply regulate power for all nodes. I mentioned in my previous log about the Redarc DC-DC power supply, and that is certainly still worthy of consideration. It is a single point of failure however, but then again, Redarc aren’t known for making crap, and the unit has a 2 year warranty anyway. I’d have downtime, but would not lose data if this went down.

@K.C. Lee pointed me to one LDO that would be a good candidate though, and is cheap enough to be worth the experiment: the Micrel MIC29750. 7.5A current, and comes in an adjustable form up to 16V. I’d imagine if I set this near 13.5V, it’d dissipate maybe 2.5W max at 5A, or 1W at 2A. Much better.

Not as good as Redarc’s solution of course, and that’s still an option, but cheap enough to try out.

Apr 07 2016
 

Well, I’ve been researching the problem. I have a battery that could be floating anywhere from 10V to 14.6V, depending on the input from the charger.

I have a computer PSU, that is not happy with voltages outside of 10.5V—13.5V.

What are my options?

  • Linear regulator: the standard ones have a 1.5V drop across them, which at the full rated input current of the PSU (8A) is 12W per node.
  • Low drop-out regulators get a little lower, but I’d still lose a few watts per node.
  • Buck converters can do better, but a lot still need at least 1V difference.

So I really need to boost it first, then regulate down. One thing I was not looking forward to, was designing then winding the transformer/inductor needed. An off-the-shelf solution therefore seems attractive, even if I miss out on kudos points for a DIY solution.

Redarc make a couple, and this unit looks like it’ll do exactly what is needed. Not cheap, but it seems comparable to what I’ve seen elsewhere. I’ll have a look and see what else there is, but this might be the most time-economical way to solve the problem and the efficiency is pretty good.

Apr 03 2016
 

Of course, there’s always something there to throw a spanner in the works, and for me, it’s the PicoPSU.

It seems to work great, however, there is an Achilles heel with these things: they have a fairly narrow band of tolerable voltages they’ll operate at. Namely 10.5—13.5V.

Now, 10.5V is fair enough, but 13.5V? Typical lead acid batteries float at about 13.8V, and will get to 14.5V when charging. So I need some preregulator that will handle the voltage being up around 13.5V or above, and drop it down just a little, passing through up to 2A (5A to be safe).

It still has to be stable when the current changes, “turned off” on these computers means a drain of about 200mA for the IPMI. So our operating parameters are summed up as 10.5—13.5V and 200mA—5A.

It needs to continue operating when the battery gets to ~11.5V.

So what are my options?

  • LM2576 simple switcher? 12V in will produce 10.5V out.
  • LM7812 has the same problem, and will chew more power.

A series regulator built on a zener/NPN might work. The voltage drop across the NPN ordinarily is going to be fairly low, however there will still be a drop of about 0.7V or so. That’s possibly “good enough”, since at 11.5V input, we should still see about 10.8V out which is within range.

Two diodes in series, with a relay to short them out when the voltage drops below 12V would work too. That’d need a comparator and voltage reference to drive the relay. It’s a cheap solution too.
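To sanity-check these two ideas against the PicoPSU window, here is a rough Python model. The 0.7V figure per diode is an assumption (real power diodes may drop more under load), the 13.0V set-point for the series regulator is also assumed, and the function names are made up for the sketch.

```python
PSU_MIN_V = 10.5   # PicoPSU lower limit, from the text
PSU_MAX_V = 13.5   # PicoPSU upper limit, from the text


def series_pass(v_in, set_v=13.0, drop_v=0.7):
    """Zener/NPN series regulator: capped at set_v, always loses drop_v."""
    return min(set_v, v_in - drop_v)


def diode_dropper(v_in, bypass_below_v=12.0, diode_drop_v=0.7, diodes=2):
    """Series diodes, shorted out by a relay when the input is low."""
    if v_in < bypass_below_v:
        return v_in                       # relay closed: diodes bypassed
    return v_in - diodes * diode_drop_v   # relay open: diodes in circuit


def in_window(v):
    """True if the voltage is acceptable to the PSU."""
    return PSU_MIN_V <= v <= PSU_MAX_V


if __name__ == "__main__":
    # Sweep the battery from 11.5V to 14.5V in 0.1V steps.
    for tenths in range(115, 146):
        v = tenths / 10
        print(v, round(series_pass(v), 2), round(diode_dropper(v), 2),
              in_window(series_pass(v)), in_window(diode_dropper(v)))
```

With these assumed drops, both options stay inside the window across the sweep; a larger real-world diode drop would shift the relay option’s output lower, so the bypass threshold may need adjusting.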

Another prospect is a beefy DC-DC converter on the battery, so we don’t actually care what the battery voltage is, we boost it say to 15V then regulate it back down to 12V. A 30A-capable flyback or boost-buck converter would do it. This is more complex, and much more expensive to do off-the-shelf, so I think that’d be a method of last resort.