Feb 26 2018
 

So, after a longish wait… my laptop finally coughed up an image with a C/C++ compiler and almost all the bits necessary to make Gentoo Portage tick.

Almost everything… wget built, but it segfaults on start-up.  No matter, it seems curl works.  We do have an issue though: Portage no longer seems to support customising the downloader like it used to, or at least I couldn’t see how to do it.  It used to be a setting in make.conf.

Thankfully, I know shell scripts, and can make my own wget using the working curl:

bash-4.4# cat > /usr/bin/wget
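#!/bin/bash
# Minimal wget stand-in for Portage, backed by curl.

OUT=
while [ $# -gt 0 ]; do
    case "$1" in
        -O) OUT="$2"; shift ;;   # output file: remember it, skip its value
        -t) shift ;;             # retry count: ignored
        -T) shift ;;             # timeout: ignored
        --passive-ftp) : ;;      # nothing to do, curl's FTP is passive by default
        *) break ;;              # first unrecognised argument is the URL
    esac
    shift
done

set -ex
# -f: exit non-zero on HTTP errors; -L: follow redirects, as wget does
curl -fL --progress-bar -o "${OUT}" "$1"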

Okay, it’s a little (a lot) braindead, but it beats downloading the lot by hand!
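For what it's worth, the handful of options handled above is not arbitrary: as far as I recall, Portage's stock fetch command calls wget with `-t`, `-T`, `--passive-ftp` and `-O`, followed by the URL.  The sketch below is a dry-run variant of the same parsing loop that prints the curl command instead of running it.  The function name `fake_wget_dry_run`, the URL and the path are all made up for illustration, and the exact invocation is an assumption worth checking against your own make.globals.

```shell
# Dry-run version of the wrapper's option parsing: prints the curl
# command it would run instead of executing it.
# fake_wget_dry_run is a hypothetical name used only for this sketch.
fake_wget_dry_run() {
    out=
    while [ $# -gt 0 ]; do
        case "$1" in
            -O) out="$2"; shift ;;   # keep the output path
            -t|-T) shift ;;          # skip the option's value
            --passive-ftp) : ;;      # nothing to translate
            *) break ;;              # first unknown argument is the URL
        esac
        shift
    done
    echo "curl --progress-bar -o ${out} $1"
}

# Assumed Portage-style invocation; the URL and path are placeholders.
fake_wget_dry_run -t 3 -T 60 --passive-ftp \
    -O /tmp/distfiles/foo-1.0.tar.gz \
    http://example.com/foo-1.0.tar.gz
```

Anything the loop does not recognise is treated as the URL, so an unexpected extra option would be passed to curl as if it were the address.  That is part of the "braindead" bit.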

I was able to get Gentoo installed by hand using these instructions.  I have an old 1TB HDD plugged into a USB dock, formatted with a 10GB swap partition and the rest btrfs.  Sure, it’s only USB 2.0, but I’d sooner just put up with some CPU overhead than wear out my eMMC.

Next step: ROOT=/tmp/seed emerge -ev system

Feb 13 2018
 

So, over the last few years we’ve seen a big shift in the way websites operate.

Once upon a time, JavaScript was a nice-to-have, and you as a web developer had better be prepared for it to not be functional; the DOM was non-existent, and we were ooohing and ahhing over the de facto standard in Internet multimedia: Macromedia Flash.  The engine we now call WebKit was still a quite basic renderer called KHTML in a little-known browser called Konqueror.  Mozilla didn’t exist as an open-source project yet; it was Netscape and Microsoft duelling it out.

Back then, XMLHttpRequest was so new, it wasn’t a standard yet; Microsoft had implemented the idea as an ActiveX control in IE5, and no one else had it yet.  So if you wanted to update a page, you had to re-load the whole lot and render it server-side.  We had just shaken off our FONT tags for CSS (thank god!), but if you wanted to make an image change as the mouse cursor hovered over it, you still needed those onmouseover/onmouseout event handlers to swap the image.  Oh, and scalable graphics?  Forget it.  Render as a GIF or JPEG and hope you picked the resolution right.

And bear in mind, the expectation was that a user running an 800×600 pixel screen resolution, and connected via a 28.8kbps dial-up modem, should be able to load your page within about 30 seconds and navigate without needing to resort to horizontal scroll bars.  That meant images had to be compressed to be no bigger than 30kB.
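To put rough numbers on that, here is a back-of-envelope sketch.  It assumes the modem sustains its full 28.8kbps with no protocol overhead or latency, which real dial-up never managed, so treat the result as an upper bound.  The function name `page_budget_bytes` is invented for this example.

```shell
# page_budget_bytes BITS_PER_SECOND SECONDS
# Upper bound on the bytes transferable in the given time, ignoring overhead.
page_budget_bytes() {
    echo $(( $1 * $2 / 8 ))
}

page_budget_bytes 28800 30   # prints 108000
```

So the whole page, HTML and images together, had to fit in roughly 108kB in the ideal case, which is why a single image could not be allowed much more than 30kB.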

That was 17 years ago.  Man I feel old!

This gets me thinking… today, the expectation is that your Internet connection is at least 256kbps.  Why then do websites take so long to load?

It seems our modern web designers have forgotten the art of how to pack down a website to minimise the amount of data needed to be transmitted so that the page is functional.  In this modern age of “pretty” web design, we’ve forgotten how to make a page practical.

Today, if you want to show an icon on a page, and have it fill the entire browser window, you can fire up Inkscape or Adobe Illustrator, let the creative juices flow and voilà, out pops a scalable vector graphic, which can be dropped straight into your HTML.  Turn on gzip compression on the web server, and that graphic will be on that 28.8kbps user’s screen in under 3 seconds, and can still be as big as they want.
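A quick way to sanity-check that sort of claim is to gzip the file yourself and work out the ideal transfer time.  The sketch below generates a throwaway SVG to stand in for a real one; the helper names `gzipped_size` and `transfer_ms` are made up here, and the timing ignores latency and protocol overhead.

```shell
# gzipped_size FILE: size in bytes after gzip -9, roughly what a server
# with gzip compression enabled would send.
gzipped_size() {
    gzip -9 -c "$1" | wc -c | tr -d ' '
}

# transfer_ms BYTES BITS_PER_SECOND: ideal transfer time in milliseconds.
transfer_ms() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

# Build a throwaway SVG of 50 concentric circles as a stand-in graphic.
svg=$(mktemp)
{
    echo '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    i=0
    while [ "$i" -lt 50 ]; do
        echo "  <circle cx=\"50\" cy=\"50\" r=\"$i\" fill=\"none\" stroke=\"black\"/>"
        i=$((i + 1))
    done
    echo '</svg>'
} > "$svg"

raw=$(wc -c < "$svg" | tr -d ' ')
packed=$(gzipped_size "$svg")
echo "raw: $raw bytes, gzipped: $packed bytes," \
     "$(transfer_ms "$packed" 28800) ms at 28.8kbps"
rm -f "$svg"
```

By the same arithmetic, "under 3 seconds" at 28.8kbps means the compressed graphic has to come in below about 10,800 bytes.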

If you want to make a page interactive, there’s no need to reload the entire page; XMLHttpRequest is now a W3C standard, and implemented in all the major browsers.  WebSockets mean an end to any kind of polling; you can get updates as they happen.

It seems silly, but in spite of all the advancements, website page loads are not getting faster, they’re getting slower.  The “everybody has broadband” and “everybody has full-HD screens” argument is being used as an excuse for bloat and sloppy design practices.

More than once I’ve had to point someone to the horizontal scroll bar because the web designer failed to test their website at the rather common 1366×768 screen resolution of a typical laptop.  If I had a dollar for every time that’s happened in the last 12 months, I’d be able to buy the offending companies out and sack the web designers responsible!

One of the most annoying trends, from a security perspective, is the proliferation of “content distribution networks”.  It seems they’ve realised these big bulky blobs of JavaScript take a long time to load even on fast links.  So, what do the bright sparks do?  “I know… instead of loading it from one server, I’ll put it on 10 and increase my upload capacity 10-fold!”  Yes, they might have 1Gbps on each host.  1Gbps × 10 = 10Gbps, so the page will load at 10Gbps, right?

Cue sad tuba sound effect.

At my workplace, we have a 20Mbps Ethernet (not ADSL[2], fibre or cable; Ethernet) link to the Internet.  On that link, I’ve been watching the web get slower and slower… and I do not think our ISP is completely to blame, as I see the same issue at home too.  One place where we feel the pain a lot is Atlassian’s system, particularly Jira and Confluence.  To give you an idea of how badly they drink the CDN Kool-Aid, check out the number of sites I have to whitelist in order to get the page functional:

Atlassian’s JIRA… failing in spite of a crapton of scripts being loaded.

That’s 17 different hosts my web browser must make contact with, and download content from, before the page will function.  17 separate HTTP connections, which must fight with all other IP traffic on that 20Mbps Ethernet link for bandwidth.  20Mbps is the maximum that any one connection will do, and I can guarantee it will not reach even half that!
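The arithmetic behind that is simple enough to sketch.  This is a deliberately crude model: it assumes throughput is capped by whichever is smaller, the client's link or the combined capacity of the servers, and it ignores latency, TCP slow start and competing traffic, all of which only make things worse.  The function name `effective_mbps` is made up for the example.

```shell
# effective_mbps CLIENT_LINK_MBPS SERVER_COUNT PER_SERVER_MBPS
# Best-case aggregate download rate: the client's link or the servers'
# combined capacity, whichever is smaller.
effective_mbps() {
    combined=$(( $2 * $3 ))
    if [ "$1" -lt "$combined" ]; then
        echo "$1"
    else
        echo "$combined"
    fi
}

effective_mbps 20 10 1000   # prints 20
```

Ten servers at 1Gbps each still cannot push more than 20Mbps down a 20Mbps link; spreading the load over more hosts only adds more connections to set up.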

Interestingly, despite allowing all those scripts to load, they still failed to come up with the goods after a pregnant pause.  So the extra thrashing of the link was for naught.  Then there are the security implications.

At least 3 of those are hosts that Atlassian do not control.  If someone compromised ravenjs.com, for example, they could inject any JavaScript they want on the JIRA site, and take control of a user’s account.  Atlassian are relying on these third parties’ promises and security practices to ensure their site stays secure, and stays in their (the third parties’) control.  If someone forgets to renew a domain registration, the result could be highly embarrassing!

So, I’m left wondering what they teach these days.  For a multitude of reasons, sites should be blazingly quick to load, partly because modern techniques ought to permit vastly more efficient content representation and delivery, and partly because network link speeds are steadily improving.  However, it seems the reverse is true… why are we failing so badly?