Redhatter (VK4MSL)

Jun 14 2018
 

So, last Sunday we did a trip up the Brisbane Valley to do a rekkie for the Yarraman to Wulkuraka bike ride that Brisbane WICEN will be assisting in at the end of next month.

The area is known to be quite patchy where phone reception is concerned, with Linville shown as highly unreliable… Telstra's coverage maps suggest external antennas are required to get any sort of service.  So it seemed a good place to take the Kite and try it out in a weak-signal area.

3G coverage in Linville, with external antenna.

4G coverage in Linville, with external antenna.

4GX coverage in Linville, with external antenna.

Sadly, I didn't get as much time as I would have liked to perform these tests, and it would have been great to compare against a few others.  We arrived in the afternoon with clouds gathering, so we had to press on to Moore.

In any case, Telstra seems to have pulled their socks up since those maps were updated, as I found I was getting reasonable coverage on the T83.  The Kite was in the car at the time; I didn't want it getting damaged if I came off the bike or if the heavens opened up.

I did manage to take some screenshots on the way up from all three phones, all on the same network (Telstra) and using their internal antennas (the small whip, in the case of the Kite).

This is not that scientific, and a bit crude, since I couldn't take the screenshots at exactly the same moment.  Plus, we were travelling at 100 km/h for much of the run.  There was one point where we stopped for breakfast at Fernvale; I can't recall exactly what time that was, or whether I got a screenshot from all three phones at that point.

The T84 is the only phone out of the three that can do the 4GX 700MHz band.

Time                 Screenshot  Notes
2018-06-10T06:08:16  ZTE T83     Leaving Brisbane
2018-06-10T06:09:24  Kite v1
2018-06-10T06:09:33  ZTE T83
2018-06-10T06:26:17  ZTE T83
2018-06-10T06:26:25  Kite v1
2018-06-10T07:30:27  ZTE T84     A rare moment where the T84 beats the others.  My guess is this is a 4GX (700MHz) cell.
2018-06-10T07:30:34  Kite v1
2018-06-10T07:30:39  ZTE T83
2018-06-10T07:41:48  Kite v1
2018-06-10T07:41:54  ZTE T84     HSPA coverage… one of the few times we see the T84 drop back to 3G.
2018-06-10T07:42:01  ZTE T83
2018-06-10T07:51:34  ZTE T83     Patchy coverage at times en route to Moore.
2018-06-10T07:51:45  Kite v1
2018-06-10T08:24:57  Kite v1     For grins, trying out Optus coverage on the Kite at Moore.  There's a tower at Benarkin, not sure if there's one closer to Moore.
2018-06-10T08:25:39  Kite v1
2018-06-10T08:54:28  ZTE T84
2018-06-10T08:54:35  Kite v1     En route to Benarkin, we lose contact with Telstra on all three devices.
2018-06-10T08:54:39  ZTE T83
2018-06-10T09:35:14  Kite v1     In Benarkin.
2018-06-10T09:35:22  ZTE T83
2018-06-10T10:25:27  Kite v1
2018-06-10T10:25:48  ZTE T83

So what does the above show?  Well, for starters, it is apparent that the T83 gets left in the dust by the other two devices.  This is interesting, as my T83 was definitely the more reliable on our last trip into the Snowy Mountains, regularly getting a signal in places where the T84 failed.

Two spots I’d love to take the Kite would be Dumboy Creek (4km outside Delungra on the Gwydir Highway) and Sawpit Creek (just outside Jindabyne), but both are a bit far for a day trip!  It’s unlikely I’ll be venturing that far south again this year.

On this trip up the Brisbane Valley though, I observed that when the signal got weak, the Kite was more willing to drop back to 3G, whereas the two ZTE phones hung onto that little scrap of 4G.  Yes, 4G might give clearer call quality and faster speeds in ideal conditions, but these conditions are not ideal; we're in fringe coverage.

The 4G standards use much denser forms of modulation (QPSK, 16-QAM or 64-QAM) than 3G (QPSK only), trading signal-to-noise margin for spectral efficiency, and thus lean more heavily on forward error correction to keep communicating in adverse conditions.  When a symbol is corrupted, more data is lost with these standards.  3G might be slower, but sometimes slow and steady wins the race; fast and flaky is a recipe for frustration.
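
To put a number on "denser" (my own arithmetic, nothing more): the bits carried by each symbol is just log2 of the constellation size, so one corrupted 64-QAM symbol takes three times as much data with it as a corrupted QPSK symbol.

import math

# Bits carried per modulation symbol: log2 of the constellation size.
for name, points in [("QPSK", 4), ("16-QAM", 16), ("64-QAM", 64)]:
    print(f"{name:7s} {int(math.log2(points))} bits/symbol")
# QPSK    2 bits/symbol
# 16-QAM  4 bits/symbol
# 64-QAM  6 bits/symbol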

A more scientific experiment, where we are stationary and can let each device "settle" before taking a reading, would be worthwhile.  Without a doubt, the Kite runs rings around the T83.  The comparison with the T84 is less clear-cut: the T84 and the Kite both run the same chipset, the Qualcomm MSM8916, while the T83 runs the older MSM8930.

By rights, the T84 and Kite should perform nearly identically, with the Kite having the advantage of a high-gain whip antenna instead of a more conventional patch antenna.  The only edge the T84 has is the 700MHz band, which isn't that heavily deployed here in Australia right now.

The T83 and T84 can take an external antenna, but the socket is designed for cradle use and isn’t as rugged or durable as the SMA connector used on the Kite.  It’s soldered to the PCB, and when a cable is plugged in, it disconnects the internal antenna.

Thus damage to this connector can render these phones useless.  The SMA connector on the Kite, however, is a pigtail to an IPX socket inside … a readily available off-the-shelf (mail-order) part.  People may not like the whip sticking out, though.

The Kite does ship with a patch antenna, which is about 75% efficient, so maybe 0dBi at best.  However, I think making the case another 10mm longer and incorporating the whip into the top of the phone, so the antenna can tuck away when not needed, is a better plan.  It would not be hard to make the case accommodate it so it's invisible and can fold out, or be replaced with a coax connection to an external antenna.

If there’s time, I’ll try to get some more conclusive tests done, but there’s no guarantees on that.

Jun 06 2018
 

Recently, a stoush erupted between NBN chief executive Bill Morrow and the gaming community over whether “gamers” were “causing” the congestion issues experienced on fixed-wireless broadband links.

The ABC published this chart, comparing the average transfer rate of various games to the average transfer rate seen watching various movies.  It's an interesting chart, but I think it completely misses the point.

One thing that raw download speeds miss, is latency.

Multimedia is hard real-time; however, unless you're doing a two-way video or voice call, a few seconds of latency is not going to bother you. Your playback device can buffer several seconds' worth of movie to feed to your video and sound devices and keep their buffers fed. No problem.

If those buffers aren't kept topped up, you get break-up in your audio and the video "freezes" momentarily, losing the illusion of animation. So long as the data is received over the Internet link, passed to the decoder to be converted to raw video frames and audio samples, and stuffed into the relevant buffers in time, it all runs smoothly. Pre-recorded material makes this dead easy (by comparison). Uni-directional live streams are a bit more tricky, but again you can put up with quite a bit of latency.

Radio stations often have about 300-500ms of latency … just listen to the echo effect when a caller rings up with a radio on in the background, if it were truly live, it would howl like a PA microphone!

It’s two-way traffic that’s the challenge.

Imagine if, when typing an email… it was 5 seconds before the letters you just typed showed up. Or if you moved the mouse, it took 3 seconds before it registered that you had moved. If someone were just observing the screen (unaware of when the keystrokes/mouse clicks had been entered), they’d think the user was drunk!

And yes, I have personally experienced such links… type something, then go wait 30 seconds before hitting the ENTER key, or if you spot a mistake, count up the number of backspaces or cursor movements you need to type, then wait for the cursor to reach that spot before you make your correction. It’s frustrating!

Now consider online gaming, where reaction time requirements are akin to driving a race car. One false move, and suddenly your opposition has shot you, or they’ve successfully dodged your virtual bullet.

Carrier pigeons carrying MicroSD cards (which reach 128GB capacity these days) could actually outperform NBN in many places for raw data throughput. However, if the results from the Bergen Linux User’s Group experiments are anything to go by, you can expect a latency measured in hours. (Their ping log shows the round-trip-time to be about 53 minutes in the best case.)
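
To put a rough figure on the throughput half of that comparison (my own back-of-envelope, assuming a one-hour flight):

# Back-of-envelope: effective throughput of one pigeon carrying a 128GB MicroSD
# card, assuming (purely for illustration) a one-hour flight.
card_bytes = 128e9                # 128 GB payload
flight_seconds = 3600             # assumed flight time
mbit_per_s = card_bytes * 8 / flight_seconds / 1e6
print(f"Average throughput: {mbit_per_s:.0f} Mbit/s")  # ~284 Mbit/s
# ...but the "ping time" is the whole flight, not a few tens of milliseconds.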

The movie stream will be sending many large packets at a mostly regular rate. The video game will be sending lots of tiny packets that Must Be Delivered Right Now!

I think it naïve to directly compare the two in the manner these graphs do, simply due to the nature of the types of traffic involved. Video/VoIP calling would be a better metric, since a 100ms delay in a telephone conversation will have both parties verbally tripping over each other.

Tele-medicine is touted as one of the up-and-coming technologies, but for a surgeon to remotely operate on a patient, they need that robotic arm to respond right now, not in 30 seconds' time.  It may not be a lot of data to say "rotate 2°" or "move forward 500µm", but it needs to get there quickly, and the feedback from said movement needs to arrive back quickly, if the patient is going to live.

The sooner we stop ignoring this elephant in the room, the better off we’ll all be.

Jun 04 2018
 

So, recently there was a task at my work to review enabling gzip compression on our nginx HTTP servers to compress the traffic.

Now, in principle it seemed like a good idea, but having been exposed to the security world a little bit, I was familiar with some of the issues with this, notably, CRIME, BEAST and BREACH.  Of these, only BREACH is unmitigated at the browser end.

The suggested mitigations, in order of effectiveness, are:

  1. Disabling HTTP compression
  2. Separating secrets from user input
  3. Randomizing secrets per request
  4. Masking secrets (effectively randomizing by XORing with a random secret per request)
  5. Protecting vulnerable pages with CSRF
  6. Length hiding (by adding random number of bytes to the responses)
  7. Rate-limiting the requests

Now, we've effectively been doing (1) by default… but (2), (3) and (4) make me wonder how protocols like OAuth2 are supposed to work.  That got me thinking about a little toy I was given for attending the 2011 linux.conf.au… it's a YubiKey, one of the early model ones.  The way it operates is that Yubico's servers, and your key, share a secret AES key (I think it's AES-128), some static data, and a counter.  Each time you generate a one-time password with the key, it increments its counter, encrypts the value along with the static data, then encodes the output as a hexdump-like string using a keyboard-layout-agnostic encoding scheme to be "typed" into the computer.

Yubico receive this token, decrypt it, then compare the counter value.  If it checks out, and is greater than the existing counter value at their end, they accept it, and store that new counter value.

That made me wonder whether the same could work for requests from a browser… that is, you agree on a shared secret over HTTPS, or using Diffie-Hellman.  You synchronise counters (either using your new shared secret, or over HTTPS at the same time as you make the shared key), and from then on, each request the browser makes to your API is accompanied by a one-time token, generated by encrypting the counter value and the static data, and sent in the HTTP headers.
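
A toy sketch of the idea is below.  The names are made up, and I've substituted an HMAC over the counter for the AES encryption described above, purely so it runs with nothing but the Python standard library; the replay-protection property is the same.

import hashlib
import hmac

SHARED_KEY = b"agreed over HTTPS or Diffie-Hellman"   # hypothetical
STATIC_DATA = b"client-id-1234"                        # hypothetical

def make_token(counter):
    # Browser side: MAC the static data plus the current counter value.
    msg = STATIC_DATA + counter.to_bytes(8, "big")
    mac = hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()
    return "%d:%s" % (counter, mac)    # sent in an HTTP header

def check_token(token, last_counter):
    # Server side: the MAC must verify AND the counter must have moved forward.
    counter_str, mac = token.split(":")
    counter = int(counter_str)
    msg = STATIC_DATA + counter.to_bytes(8, "big")
    expected = hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected) and counter > last_counter

token = make_token(42)
assert check_token(token, 41)        # counter advanced: accepted
assert not check_token(token, 42)    # replay of the same token: rejected

The AES variant works the same way, except the server decrypts the token and checks the recovered static data and counter instead of recomputing a MAC.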

There are a few libraries that do AES in the browser, such as JSAES (GPLv3) and aes-js (MIT).

This is going to be expensive to do, so a compromise might be to use this every N requests, where N is small enough that BREACH doesn’t have a sufficient number of requests from which it can derive a secret.  By the time it figures out that secret, the token is expired.  Or they could be bulk-generated at the browser end in the background so there’s a ready supply.

I haven't gone through the full ins and outs of this, and I'm no security expert, but that's just some initial thinking.

May 31 2018
 

So, recently I bit the bullet and decided to sign up for an account with AliExpress.

So far, what I’ve bought from there has been clothing (unbranded stuff, not counterfeit) … while there’s some very cheap electronics there, I’m leery about the quality of some of it, preferring instead to spend a little more to buy through a more reliable supplier.

Basically, it’s a supplier of last resort, if I can’t buy something anywhere else, I’ll look here.

So far the experience has been okay.  The sellers so far have been genuine, and while the slow boat from China takes a while, it's not that big a deal.

That said, it would appear the people who actually develop its back-end are a little clueless when it comes to matters on the Internet.

Naïve email address validation rules

Yes, they’re far from the first culprits, but it would seem perfectly compliant email addresses, such as foo+bar@gmail.com, are rejected as “invalid”.

News to you, AliExpress, and to anyone else: You Can Put Plus Signs In Your Email Address!

Lots of SMTP servers and webmail providers support it, to quote Wikipedia:

Addresses of this form, using various separators between the base name and the tag, are supported by several email services, including Runbox (plus), Gmail (plus),[11] Yahoo! Mail Plus (hyphen),[12] Apple’s iCloud (plus), Outlook.com (plus),[13] ProtonMail (plus),[14] FastMail (plus and Subdomain Addressing),[15] MMDF (equals), Qmail and Courier Mail Server (hyphen).[16][17] Postfix allows configuring an arbitrary separator from the legal character set.[18]

You'll note the ones that use other characters (e.g. MMDF, Yahoo, Qmail and Courier) are in the minority.  Postfix will let you pick nearly anything (within reason); all the others use the plus symbol.

Doing this means instead of using my regular email address, I can use user+secret@example.com — if I see a spoof email pretending to be from you sent to user@example.com, I know it is fake.  On the other hand, if I see someone else use user+secret@example.com, I know they got that email address from you.

Email validation is actually a lot more complex than most people realise… it's gotten simpler with the advent of SMTP, but years ago …server1!server2!server3!me was legitimate in the days of UUCP.  During the transition, server1!server2!server3!user@somesmtpserver.example.com was not unheard of either.  Or maybe user%innerhost@outerhost.net?  Again, within standards.
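
For what it's worth, if you must sanity-check an address in a sign-up form, a deliberately permissive check (a sketch of my own, not a full RFC 5321/5322 parser) at least avoids rejecting perfectly valid tagged addresses:

import re

# Deliberately loose: one "@", something either side, no whitespace.  Anything
# stricter (such as banning "+") will reject perfectly valid addresses; the
# only real test of an email address is sending mail to it.
def plausible_email(addr):
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", addr) is not None

assert plausible_email("foo+bar@gmail.com")
assert plausible_email("user%innerhost@outerhost.net")
assert not plausible_email("not an email address")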

Protocol-relative URIs don’t work outside web browsers

This, I’ve reported to them before, but basically the crux of the issue is their message notification emails.  The following is a screenshot of an actual email received from AliExpress.

Now, it would not matter what the email client was.  In this case, it’s Thunderbird, but the same problem would exist for Eudora, Outlook, Windows Mail, Apple Mail, The Bat!, Pegasus Mail … or any other email client you care to name.  If it runs outside the browser, that URI is invalid.  Protocol-relative means you use the same protocol as the page the hyperlink exists on.

In this case, the “protocol” used to retrieve that “page” was imap; imap://msg.aliexpress.com is wrong.  So is pop3://msg.aliexpress.com.  The only place I see this working, is on webmail sites.

Clearly, someone needs a clue-by-four to realise that not everybody uses a web browser to browse email.

Weak password requirements

When I signed up, boy, were they fussy about the password.  My standard passwords are gibberish with punctuation… something AliExpress did not like.  They do not allow anything except digits and letters, and you must choose between 6 and 20 characters.  Not even XKCD standards work here!

Again, they aren’t the only ones… Suncorp are another mob that come to mind (in fact, they’re even more “strict”, they only allow 8… this is for their Internet banking… in 2018).  Thankfully the one bank account I have Internet banking on, is a no-fee account that has bugger all cash in it… the one with my savings in it is a passbook account, and completely separate.  (To their credit though, they do allow + in an email address.  They at least got that right.)

I can understand the field having some limit… you don't want to receive two Blu-ray discs' worth of "password" every time a user authenticates themselves… but geez… would it kill you to allow 50 characters?  Does your salted hashing algorithm (you are using salted hashes, aren't you?) really care what characters you use?  Should you be using it if it does?  Once hashed, the output is going to be a fixed width, ideal for a database, and Bobby Tables is going to be hard pushed to pick a password that will hash to "'; drop table users; --".
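
To labour the point, here's a quick demonstration (Python's scrypt standing in for whatever salted scheme a site actually uses): whatever the user types, the stored value is a salt plus a fixed-width digest.

import hashlib
import os

# The password can be any length and any bytes; the stored hash is always the
# same size (64 bytes by default for hashlib.scrypt).
for password in (b"hunter2",
                 b"correct horse battery staple",
                 b"'; drop table users; --",
                 b"two Blu-ray discs worth of junk... " * 1000):
    salt = os.urandom(16)
    digest = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
    print("%8d byte password -> %d byte hash" % (len(password), len(digest)))
# Every line ends in "-> 64 byte hash", regardless of length or content.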

By requiring these silly rules, they've actually forced me to use a weaker password.  The passwords I would have used on each site, had I been given the opportunity to pick my own, would have featured a much richer choice of characters, and thus been harder to break.  Instead, you've hobbled your own security.  Go team!

Reporting website issues is more difficult than it needs to be

Reporting a website issue is nigh on impossible.  Hence the reason for this post.  Plenty is there if I want to pick a fight with a seller (I don't), or if I think there's an intellectual property issue (there isn't).  I eventually did find a form, and maybe they'll do something about it, but I'm not holding my breath.

Forget to whitelist a script, and you get sworn at, in Mandarin

This is a matter of "unhappy code paths" not receiving the attention that they need.  In fact, there are a few places where they haven't really debugged their l10n support properly, and so untranslated Alibaba text pops up.

Yeah, the way China is going with global domination, we might some day find ourselves having to brush up on our Mandarin, and maybe Cantonese too… but that day is not today.

Anyway, I think that more or less settles it for now.  I’ll probably find more to groan about, but I do need to get some sleep tonight and go to work tomorrow.

May 19 2018
 

Recently, a new project sprang up on the Hackaday.io site; it was for the KiteBoard, an open-source cellular development platform.  In a nutshell, this is a single-board-computer that embeds a full mobile system-on-chip and runs the Android operating system.  The project is seeking crowd funding for the second version of this platform.

With it, you can build smartphones (of course), tablets, tele-presence robots, or really, any project which can benefit from a beefy CPU with a built-in cellular modem.  It comes as a kit, which you then assemble yourself.  The level of difficulty in assembly is no greater than that of assembling a desktop PC: the circuit boards are pre-populated; you just need to connect them together.  In this version, some soldering of pushbuttons and wires is needed: all through-hole components.  No reflow ovens or solder paste are necessary here; an 8-year-old could do it.

The break-out board for the CPU card features, in addition to connections for all the usual cellular phone signals (e.g. earpiece, microphone, button inputs), a GPIO header that follows the de-facto standard "Raspberry Pi" interface, allowing many Raspberry Pi "hats" to plug directly into this board.

That lends itself greatly to expandability.  Want an eInk or OLED notification display on the back?  A scrolling LED display?  A piano?  A games console?  Knock yourself out!  You are the designer; you decide.  There are lots of options.

I, for one, would consider an amateur radio transceiver, an external antenna socket and a beefier battery.  Presently, I get around with the ZTE T83 ("Telstra Dave"), which works okay, but as it runs an old version of Android (4.1), running newer applications on it is a problem.  I believe it could run something newer, but ZTE seem to believe their job was finished in 2013 when the first one rolled off the production line.

The box did not include a copy of the kernel sources or any link to where that could be obtained.  (GNU GPL v2 section 2b?  What’s that?)

The successor, the T84, is a little better; in fact it has pretty much the same hardware that's in the Kite, but it struggles in rural areas.  On a recent trip into the Snowy Mountains, my phone would be working fine while my father's T84 would report "no service available".  Clearly, someone at Telstra/ZTE screwed up the firmware on it, and so it fails to switch networks correctly.  Without the sources, we are unable to fix that.  Even something as simple as replacing a battery is nigh on impossible: they're built like bombs, not designed to be taken apart.

I have no desire to spend money on a company that puts out poorly supported rubbish running pirated operating system kernels.  The story is similar elsewhere, and most devices while better in specs and operating system, lack the external antenna connection that I desire in a phone.

Kite represents a breath of fresh air in that regard.  It is to smart phones, what the Raspberry Pi is to single board computers in general.  It’s not only designed to be taken apart, it’s shipped to you as parts.  Apparently with Kite v2, there’ll be schematics available, so you’ll be able to look-up the datasheets of respective components and be able to make informed decisions about part substitutions.  All antenna connections are socketed, so you can substitute at will.

While the OS isn’t going to be as open as one might like (mobile chipset manufacturers like their black boxes), it’s a BIG step in the right direction.  There’s more scope for supporting this platform long-term, than contemporary ones.

As far as actually using Kite, Shree Kumar was generous enough to organise the loan of a Kite for me to test with the Australian networks.  The phone takes up to two micro-SIMs (about 15mm×12mm); one on the daughter card (this is SIM 1) and one on the CPU card (SIM 2).

For the sake of testing, I figured I’d try it out with the two major networks, Telstra and Optus.  As it happens, my Telstra SIM is too big (they call it a “full-size” SIM now; I remember full-size SIMs being credit-card sized), so rather than chopping up my existing SIM or getting it transferred, I bought and activated a prepaid service.  I also bought a SIM for Optus.  I bought $10 credit for each.

As it happens, the Optus one came with data, the Telstra did not.  No big deal in this case.  The phone does have a limitation in that it will talk to one 3G/4G network and one GSM (2G) network at a time.  Given both networks I chose have abandoned 2G, that pretty much means the dual-SIM functionality on this model is severely hobbled.  That said, either SIM can operate in 3G mode, and so it’s simple enough to switch one SIM into 2G mode then activate the other in 3G/4G mode.  So far, the Kite has spent most of its time on Optus.

Evidently Vodafone still have a 2G network… at least the Kite does see one 2G cell operated by them.  Long term, this is a problem that all dual-SIM phone chipset makers will have to deal with; a future Kite may well be able to do 3G simultaneously on both SIMs, but for me, this is not a show-stopper.

I’ve put together this review of the Kite.  It’s rare for me to be in front of a camera instead of behind it, and yes, the editing is very rough.  If there is time (there won’t be this weekend) I hope to take the phone out to a rural area and try it out with the more distant networks, but so far it seems happy enough to switch to 3G when I get home, and use 4G when I’m at work, so this I see as a promising sign.

The Kickstarter is lagging quite a way behind the funding goal, but alternate options are being considered for getting this project off the ground.  Here's hoping that the project does get up, and that we get to see Kite v2 being developed and made for real, as I think the mobile phone industry really does need a viable open competitor.

Mar 19 2018
 

So, on Friday, I had a job to update some documentation.  Specifically, I had to update the code examples on a Confluence document.

No problem… or so I thought.  The issue I faced was that it seems the Confluence application is getting too clever for its own good.  Honestly, I’d be happier with a plain textarea which took some Wiki syntax such as Markdown… or heck… plain HTML!  I use WordPress on this blog here, and while the editor here isn’t bad, I’m thankful that going to the source editor is just a click away, as there’s some things the WYSIWYG editor can’t do well (inline code), or even at all (tables).

The editor in Confluence is much less polished.  Navigating with the arrow keys is an unpredictable experience, sometimes it moves by single lines, sometimes it jumps a page.  Sometimes, starting several lines deep in a code block, a single up-arrow will move you to the line above, sometimes it moves you to some line in a paragraph above the code block.  It’s an exercise in frustration.

Fine, I thought, I'll just copy and paste the code into qvim.  Highlight… copy… paste… ohh brilliant, it's now all stuffed onto one line!  Thankfully what I was editing was JSON, so it's easy enough to re-format: vim makes it simple to pipe the buffer contents through an arbitrary external program such as python -m json.tool.  That lacked the flexibility to auto-format the JSON the way the code examples were formatted, though, so I made a work-alike that used Python's OrderedDict to sort the keys a bit more logically, and told json.dump to indent the code with 2-space indentation (this is how the existing examples were formatted).
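
The work-alike was only a few lines; here's a rough reconstruction (the PREFERRED list is hypothetical, the real ordering depended on the examples being documented):

#!/usr/bin/env python3
# Filter: read JSON on stdin, write it back out with 2-space indents, with
# "preferred" keys first and the remainder sorted alphabetically.
# From vim:  :%!./jsonfmt.py
import json
import sys
from collections import OrderedDict

PREFERRED = ["id", "name", "description"]    # hypothetical ordering

def reorder(obj):
    if isinstance(obj, dict):
        keys = sorted(obj, key=lambda k: (
            PREFERRED.index(k) if k in PREFERRED else len(PREFERRED), k))
        return OrderedDict((k, reorder(obj[k])) for k in keys)
    if isinstance(obj, list):
        return [reorder(item) for item in obj]
    return obj

data = json.load(sys.stdin, object_pairs_hook=OrderedDict)
json.dump(reorder(data), sys.stdout, indent=2)
sys.stdout.write("\n")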

Having done this, I thought I'd mention the issues with their editor to Atlassian.  I hit the Feedback link up the top of the page.  I pointed out the issues I was having.  In closing, I also pointed out how sluggish their system was.  The desktop PC at work is an 8-core AMD Ryzen 7 1700 with 16GB of DDR4.  Not a slow machine.  Maybe it's rose-coloured glasses, but I recall having a smoother editing experience with Microsoft Word for Windows 6.0 on my 33MHz 486DX, which sported a whopping 8MB of RAM.  Hot stuff back in 1994.  My present desktop does fine with LibreOffice, and this WordPress blog works fine in it, so I know it's not my browser or hardware.  Yet Confluence struggles, on a PC that has 8 times the CPU cores, each running at roughly 100 times the clock speed, and with 2048 times the amount of RAM to boot.

I composed my feedback and sent it Friday afternoon.  I left the browser window open while I submitted the feedback, and went home.  This morning, I get in, enter my password to unlock the workstation, and see this:

Atlassian feedback … *still* sending after a whole week-end!

Yep, about 2kB of plain text has taken more than 50 hours to make its way from my desktop to their back-end servers.  Did a feral cat interrupt their RFC-1149 based Internet link?

Feb 13 2018
 

So, over the last few years we’ve seen a big shift in the way websites operate.

Once upon a time, JavaScript was a nice-to-have, and you as a web developer had better be prepared for it to not be functional; the DOM was non-existent, and we were ooohing and ahhing over the de facto standard in Internet multimedia: Macromedia Flash.  The engine we now call WebKit was still a primitive and quite basic renderer called KHTML in a little-known browser called Konqueror.  Mozilla didn't exist as an open-source project yet; it was Netscape and Microsoft duelling it out.

Back then, XMLHTTPRequest was so new it wasn't a standard yet; Microsoft had implemented the idea as an ActiveX control in IE5, and no one else had it yet.  So if you wanted to update a page, you had to re-load the whole lot and render it server-side.  We had just shaken off our FONT tags for CSS (thank god!), but if you wanted to make an image change as the mouse cursor hovered over it, you still needed those onmouseover/onmouseout event handlers to swap the image.  Ohh, and scalable graphics?  Forget it.  Render as a GIF or JPEG and hope you picked the resolution right.

And bear in mind, the expectation was that a user running an 800×600 pixel screen resolution, and connected via a 28.8kbps dial-up modem, should be able to load your page within about 30 seconds, and navigate without needing to resort to horizontal scroll bars.  That meant images had to be compressed to be no bigger than 30kB.

That was 17 years ago.  Man I feel old!

This gets me thinking… today, the expectation is that your Internet connection is at least 256kbps.  Why then do websites take so long to load?

It seems our modern web designers have forgotten the art of how to pack down a website to minimise the amount of data needed to be transmitted so that the page is functional.  In this modern age of “pretty” web design, we’ve forgotten how to make a page practical.

Today, if you want to show an icon on a page, and have it fill the entire browser window, you can fire up Inkscape or Adobe Illustrator, let the creative juices flow and voilà, out pops a scalable vector graphic, which can be dropped straight into your HTML.  Turn on gzip compression on the web server, and that graphic will be on that 28.8kbps user's screen in under 3 seconds, and can still be as big as they want.

If you want to make a page interactive, there’s no need to reload the entire page; XMLHTTPRequest is now a W3C standard, and implemented in all the major browsers.  Websockets means an end to any kind of polling; you can get updates as they happen.

It seems silly, but in spite of all the advancements, website page loads are not getting faster, they’re getting slower.  The “everybody has broadband” and “everybody has full-HD screens” argument is being used as an excuse for bloat and sloppy design practices.

More than once I’ve had to point someone to the horizontal scroll bar because the web designer failed to test their website at the rather common 1366×768 screen resolution of a typical laptop.  If I had a dollar for every time that’s happened in the last 12 months, I’d be able to buy the offending companies out and sack the web designers responsible!

One of the most annoying, from a security perspective, is the proliferation of “content distribution networks”.  It seems they’ve realised these big bulky blobs of JavaScript take a long time to load even on fast links.  So, what do the bright sparks do?  “I know… instead of loading it from one server, I’ll put it on 10 and increase my upload capacity 10-fold!”  Yes, they might have 1Gbps on each host.  1Gbps × 10 = 10Gbps, so the page will load at 10Gbps, right?

Cue sad tuba sound effect.

At my workplace, we have a 20Mbps Ethernet (not ADSL[2], fibre or cable; Ethernet) link to the Internet.  On that link, I've been watching the web get slower and slower… and I do not think our ISP is completely to blame, as I see the same issue at home too.  One place where we feel the pain a lot is Atlassian's system, particularly Jira and Confluence.  To give you an idea of how deeply they drink the CDN Kool-Aid, check out the number of sites I have to whitelist in order to get the page functional:

Atlassian’s JIRA… failing in spite of a crapton of scripts being loaded.

That’s 17 different hosts my web browser must make contact with, and download content from, before the page will function.  17 separate HTTP connections, which must fight with all other IP traffic on that 20Mbps Ethernet link for bandwidth.  20Mbps is the maximum that any one connection will do, and I can guarantee it will not reach even half that!

Interestingly, despite allowing all those scripts to load, they still failed to come up with the goods after a pregnant pause.  So the extra thrashing of the link was for naught.  Then there's the security implications.

At least 3 of those are domains that Atlassian do not control.  If someone compromised ravenjs.com, for example, they could inject any JavaScript they want on the JIRA site, and take control of a user's account.  Atlassian are relying on these third parties' promises and security practices to ensure their site stays secure, and that those domains stay in the third parties' control.  Suppose someone forgets to renew a domain registration; the result could be highly embarrassing!

So, I'm left wondering what they teach these days.  For a multitude of reasons, sites should be blazingly quick to load: modern techniques ought to permit vastly improved efficiency of content representation and delivery, and network link speeds are steadily improving.  However, it seems the reverse is true… why are we failing so badly?

Jan 30 2018
 

So, today I had a problem… I needed to solve a race condition in a test case for my workplace’s WideSky system.  The test case was meant to ensure that, if the AMQP broker crashed or was restarted, it would re-connect and resume operations as quickly as possible.

On my desktop (an 8-core AMD Ryzen 7), the test case always passed.  On the CI server (a VM running on a dual-core Core i3), it failed.  I figured the desktop here was running too quickly for the problem to show up; I needed a machine that ran more like the CI server.

Looking around, I couldn’t see any way to reliably slow down QEMU, KVM or VirtualBox… but I do remember one old project from the mid-late 90s that could: Bochs.

Bochs in action… emulating a P4 Prescott on a Ryzen 7

Turns out, far from what it could do back in 1998 when it was strictly a 386 emulator (and a slow one at that!), it now has AMD64 emulation capabilities.  Thus, I can run the software stack inside this VM, and have it throttle the CPU speed down so that, hopefully, the problem arises.  The first problem I needed to solve was getting the network running.  We have a PXE boot server which can serve up Ubuntu, no problem; I just needed to bridge the Bochs VM onto the network somehow.

I already have bridge interfaces configured on my two physical network interfaces, and these work great with KVM.  Sadly, Bochs is rather primitive in what it supports… tap-mode networking just did not work; it complained that tap0 was not "running" even if created beforehand with iproute2.  I did find, however, that I could bind it directly to one of the enslaved network interfaces (enp36s0.200; yes, a VLAN interface).

e1000 worked for network booting, but then Ubuntu couldn’t retrieve an IP address for whatever reason. ne2k is working fine, and presently, I have the VM installing.  To make it network bootable, you need a boot ROM image, which you can download from the iPXE rom-o-matic service.  The magic PCI IDs you need are 10ec 8029 for ne2k, or (if it gets fixed) 8086 10de for e1000.

The following is my Bochs config file:

# configuration file generated by Bochs
plugin_ctrl: unmapped=1, biosdev=1, speaker=1, extfpuirq=1, parallel=1, serial=1, gameport=1, ne2k=1
config_interface: textconfig
display_library: x
debug: action=report
memory: host=2048, guest=2048
romimage: file="/usr/share/bochs/BIOS-bochs-latest", address=0x0, options=none
vgaromimage: file="/usr/share/bochs/VGABIOS-lgpl-latest"
boot: disk, network
floppy_bootsig_check: disabled=0
# no floppya
# no floppyb
ata0: enabled=1, ioaddr1=0x1f0, ioaddr2=0x3f0, irq=14
ata0-master: type=disk, path="/tmp/wstest.raw", mode=flat, cylinders=0, heads=0, spt=0, model="Generic 1234", biosdetect=auto, translation=auto
ata0-slave: type=none
ata1: enabled=1, ioaddr1=0x170, ioaddr2=0x370, irq=15
ata1-master: type=none
ata1-slave: type=none
ata2: enabled=0
ata3: enabled=0
optromimage1: file=none
optromimage2: file=none
optromimage3: file=none
optromimage4: file=none
optramimage1: file=none
optramimage2: file=none
optramimage3: file=none
optramimage4: file=none
pci: enabled=1, chipset=i440fx, slot1=ne2k, slot2=cirrus
vga: extension=cirrus, update_freq=5, realtime=1
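# ips = emulated instructions per second; lowering it throttles the guest CPU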
cpu: count=1:1:1, ips=40000000, quantum=16, model=p4_prescott_celeron_336, reset_on_triple_fault=1, cpuid_limit_winnt=0, ignore_bad_msrs=1, mwait_is_nop=0
print_timestamps: enabled=0
port_e9_hack: enabled=0
private_colormap: enabled=0
clock: sync=none, time0=local, rtc_sync=0
# no cmosimage
# no loader
log: -
logprefix: %t%e%d
debug: action=ignore
info: action=report
error: action=report
panic: action=ask
keyboard: type=mf, serial_delay=250, paste_delay=100000, user_shortcut=none
mouse: type=ps2, enabled=0, toggle=ctrl+mbutton
speaker: enabled=1, mode=system
parport1: enabled=1, file=none
parport2: enabled=0
com1: enabled=1, mode=null
com2: enabled=0
com3: enabled=0
com4: enabled=0
ne2k: enabled=1, mac=fe:fd:de:ad:be:ef, ethmod=linux, ethdev=enp36s0.200, script=/bin/true, bootrom="/tmp/10ec8029.rom"

Create your hard drive image using qemu-img, then run bochs -f yourfile.cfg and it should, hopefully, work.

Jan 13 2018
 

Part of my day job involves being the technical contact for my employer's website, which means we get lots of offers from people offering to put us on the "first page of Google".

Hmm, last time I checked, the first page of Google was, strangely, Google.  Somehow, I don’t think they outsource their SEO strategy to get there… they wrote the bloody code!

These emails go straight to Spamcop generally… and they send nastygrams to the people hosting the email servers they used.  In some cases, I’ve taken the extraordinary step of blocking frequently abused hosts.

# Block Centrilogic and SmartMailer because they don't act on spam reports.
-A INPUT -s 173.240.14.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 199.43.203.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
# Block OVH because they don't act on spam reports.
# List taken from https://mxtoolbox.com/SuperTool.aspx?action=asn%3aAS16276&run=toolpage
-A INPUT -s 5.39.0.0/17 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 5.135.0.0/16 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 5.196.0.0/16 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.7.244.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.128.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.136.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.172.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.20.110.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.21.41.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.24.8.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.26.94.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.29.224.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.30.208.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.33.96.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
…

That is not an exhaustive list.  Sorry to people who use OVH for hosting and were trying to contact VRT/CETA legitimately, but OVH have shown themselves to be grossly incompetent with regard to management of network abuse.  Centrilogic/SmartMailer are more recent additions.

Of course, they keep trying, and thankfully, it takes longer for them to write the email than it does for me to deal with it. This doesn’t stop them claiming little gems like this:

Note: We are not spammers and are against spamming of any kind. If you are not interested then you can reply with a simple “NO”.

Errm, hate to disagree (actually no, in this case, I love disagreement)… but a few points:

  1. You're sending me unsolicited content…
  2. … without my consent… (and no, a listing in a domain registration, or an address scraped from a website, is not consent)
  3. … that is advertising a paid-for service or otherwise something you’re hoping to make money from…
  4. … by electronic messaging.

That by definition is an Unsolicited Commercial Email… aka SPAM.  If you claim to be an Australian business, you better have a look at this.  If your ISP is complaining that you are abusing their services by sending spam, then perhaps you need to realise the people you are contacting are not interested!  You have your NO.

Nov 06 2017
 

So, I’m doing some development on a Cortex M3-based device with access to only one serial port, and that serial port is doing double-duty as serial console and polling a Modbus energy meter.  How do I get log messages out?

My code actually implements the stubs to direct stdout and stderr transparently to the serial port; however, this output has to go to /dev/null when the Modbus port is in use.  That said, _write_r still gets called, so in my code it is possible to set a breakpoint inside the _write_r function when traffic is destined for the console.

As it happens, gdb can be told to not only break there, but to perform a series of actions.  In my case serial.c:659 is the file and line number inside an if branch that handles the console code.  Setting up gdb to print this data out requires the following statements:

(gdb) break serial.c:659
(gdb) commands
Type commands for breakpoint(s) 3, one per line.
End with a line saying just "end".
>set ((char*)buf)[cnt] = 0
>print (char*)buf
>continue
>end
(gdb) c

The result:

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=78) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$51 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:226 Registration sent\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=46) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$52 = 0x200068c0 "Received NTP time is Mon Nov  6 04:13:41 2017\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=2) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$53 = 0x200068c0 "\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=89) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$54 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:115 Registration timeout: 30 sec\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=83) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$55 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:130 Select source address:\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=53) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$56 = 0x200068c0 " ? fdde:ad00:beef:0:0:ff:fe00:a400 Pref=Y Valid=Y\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=15) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$57 = 0x200068c0 " ? Selected\r\n"

---Type <return> to continue, or q <return> to quit---
Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=78) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$58 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:226 Registration sent\r\n"

Not as nice as having a dedicated port, but better than nothing.