March 2023

Generating ball tickets/programmes using LaTeX

My father does a lot of Scottish Country dancing; he was treasurer for the Clan MacKenzie association for quite a while, and president there for about 10 years too. He was given the task of making some ball tickets, each one uniquely numbered.

After hearing him swear at LibreOffice for a bit, then at Avery’s label making software, I decided to take matters into my own hands.

The first step was to come up with a template. The programmes were to be A6-size booklets, made up of A5 pages folded in half. For ease of manufacture, they would be printed two to a page on A4 sheets.

This template would serve as the outer and inner pages. The outer page would have a placeholder that we’d substitute.

The outer pages of the programme/ticket booklet… yes there is a typo in the last line of the “back” page.
\usepackage[top=1cm, left=0cm, right=0cm, bottom=1cm]{geometry}


We thank you for your company today\linebreak
and helping to celebrate 50 years of friendship\linebreak
fun and learning in the Redlands.

May the road rise to greet you,\linebreak
may the wind always be at your back,\linebreak
may the sun shine warm upon your face,\linebreak
the rains fall soft upon your fields\linebreak
and until we meet again,\linebreak
may God gold you in the palm of his hand.


1973 - 2023\linebreak
Saturday 20th May 2023\linebreak
1.00pm for 1.30pm - 5pm\linebreak
Redlands Memorial Hall\linebreak
South Street\linebreak

Live Music by Emma Nixon \& Iain Mckenzie\linebreak
Black Bear Duo


Cost \$25 per person, non-dancer \$15\linebreak
\textbf{Ticket No \${NUM}}

The inner pages were the same for all booklets, so we just came up with one file that was used for all. I won’t put the code here, but suffice to say, it was similar to the above.

The inner pages, no placeholders needed here.

So we had two files; ticket-outer.tex and ticket-inner.tex. What next? Well, we needed to make 100 versions of ticket-outer.tex, each with a different number substituted for $NUM, and rendered as PDF. Similarly, we needed the inner pages rendered as a PDF (which we can do just once, since they’re all the same).


set -ex

NUM_TICKETS=100

pdflatex ticket-inner.tex
for n in $( seq 1 ${NUM_TICKETS} ); do
	sed -e 's:\\\${NUM}:'${n}':' \
            < ticket-outer.tex \
            > ticket-outer-${n}.tex
	pdflatex ticket-outer-${n}.tex
done

This gives us a single ticket-inner.pdf, and 100 different ticket-outer-NN.pdf files that look like this:

A ticket outer pages document with substituted placeholder

Now, we just need to put everything together. The final document should have no margins, and should just import the relevant PDF files in-place. So naturally, we script it too; this time stepping every 2 tickets, so we can assemble the A4 PDF document from our A5 tickets: outer pages of the odd-numbered ticket, outer pages of the even-numbered ticket, followed by two copies of the inner pages. Repeat for all tickets. We also need to ensure that initial paragraph lines are not indented; setting \parindent to zero solves that.

This is the rest of my quick-and-dirty shell script:

cat > tickets.tex <<EOF
\documentclass[a4paper]{article}
\usepackage[top=0cm, left=0cm, right=0cm, bottom=0cm]{geometry}
\usepackage{pdfpages}
\setlength{\parindent}{0pt}
\begin{document}
EOF
for n in $( seq 1 2 ${NUM_TICKETS} ); do
	m=$(( ${n} + 1 ))
	# Odd-numbered ticket's outer pages, even-numbered ticket's outer
	# pages, then two copies of the inner pages.  (The exact \includepdf
	# options used in the original script may have differed.)
	cat >> tickets.tex <<EOF
\includepdf{ticket-outer-${n}.pdf}
\includepdf{ticket-outer-${m}.pdf}
\includepdf{ticket-inner.pdf}
\includepdf{ticket-inner.pdf}
EOF
done
cat >> tickets.tex <<EOF
\end{document}
EOF
pdflatex tickets.tex

The result is a 100-page PDF, which when printed double-sided, will yield a stack of tickets that are uniquely numbered and serve as programmes.

Mastodon experiment: a few months in

A little while back I decided to try out Mastodon, deploying my own instance running as a VM on my own hardware. This was primarily done to act as a testing ground for experimenting with integrating with it, but also as a means of keeping up with the news.

The latter is particularly important, as I no longer have the radio on all the time. I might hear a news item in the morning, but after the radio switches off, I’m unlikely to turn it back on until the next working day. A lot of news outlets moved to Twitter over the past decade, but with that site in its death throes, the ActivityPub ecosystem is looking like a safer bet.

Not many outlets are officially using this newer system yet. There are a few outlets that do publish directly to Mastodon/ActivityPub, examples being Rolling Stone, The Markup (who run their own instance), STAT, The Conversation AU/NZ and OSNews. Some outlets aren’t officially posting on ActivityPub, but are nonetheless visible via bridges from RSS (e.g. Ars Technica) and others are proxies of these outlets’ Twitter accounts (e.g. Reuters, Al Jazeera, The Guardian, Slashdot). Others are there, but it’s not clear how the material is being mirrored or if they’re official.

There’s also a decent dose of satire if you want it, including satirical news outlets The Chaser and The Shovel, and cartoonists such as Christopher Downes, David Rowe, Fiona Katauskas, Jon Kudelka, David Pope, Cathy Wilcox and Glen Le Lievre.

As you can gather, a big chunk of who I follow is actually news outlets, or humour. There are a few people on my “follow” list who are known for posting various humour pieces from elsewhere, and I often “boost” (re-post) their content.

Meta (who run Facebook) have made noises about joining in with their own Twitter-clone in the ActivityPub fediverse. I wouldn’t mind this so much — the alternatives to them doing this are: (1) the rest of us needing dozens of social media accounts to keep in touch with everybody, (2) relying on the good will of some mega-site to connect us all, or (3) forgoing being in touch altogether.

I tried option (1) in the early part of this century, and frankly I’m over it. Elon Musk dreams of Twitter becoming option (2), but I think the chances of that happening are Buckley’s and none. Option (3) is not realistic; we’re social beings.

Some of these instances will be ad supported, and I guess that’s a compromise we may have to live with. Servers need electricity and Internet to run, and these are not free. A bigger cost to running big social networks is actually managing the meat-ware side of the network — moderation, legal teams to decide how moderation should be applied, handling take-down notices… etc.

ActivityPub actually supports flagging a post so that it is not “listed” (indexed by instances’ search engines), making a post private (it cannot be boosted, and is visible to followers only), and even restricting a post to just those mentioned specifically. I guess there’s room for one more: “non-commercial use only” — commercial instances could then decide whether to forgo the advertising on that post, or to filter the post out.

ActivityPub posting privacy settings on Mastodon

I did hear rumblings that the EU was likely to pass some laws requiring a certain level of interoperability between social networks, which ActivityPub could in fact be the basis of.

Some worry about another Eternal September moment — a repeat of the time when AOL disgorged its gaggle of novice Internet users on an unsuspecting Usenet system. Usenet users prior to AOL opening up in 1993 only had to deal with similar shenanigans once a year around September when each new batch of first year uni students would receive their Internet access accounts.

I’m not sure the linking-up of a site like Facebook or Tumblr (who have also mentioned joining the Fediverse) is all that big a deal — Mastodon lets you block whole domains if you so choose, and who says everybody on a given site is going to cause trouble?

Email is a federated system, always has been, and while participation as a small player is more complicated than it used to be, it is still doable. Big players like Microsoft and Google haven’t killed off email (even with the former doing their best to do so with sub-par email clients and servers). Yes, we have a bigger spam problem than we had back in the 90s, but keeping the signal-to-noise ratio up to useful levels is not impossible, even for mere mortals.

We do have to be mindful of the embrace-extend-break game that big businesses like to play with open protocols; I think Mastodon gGmbH’s status as a not-for-profit, and its position as a reference implementation, should help here.

I’d rather throw my support behind a system that allows us all to interoperate, managing any misbehaviour that arises on a case-by-case basis, than see us develop our own little private islands. The info-sec press seem to have been quick to jump ship from Twitter to Mastodon. The IT press is taking a little longer, but there’s a small but growing group. I think the journalism world is going to be key to making this work and ensuring there’s good-quality content to drown out the low-quality noise. If big players like Meta joining in helps push this along, I think this is worth encouraging.

A crude attempt at memory management

The other day I had a bit of a challenge to deal with. My workplace makes embedded data collection devices which are built around the Texas Instruments CC2538 SoC (internal photos visible here) and run OpenThread. To date, everything we’ve made has been an externally-powered device, running off either DC power (9-30V) or mains (120/240V 50/60Hz AC). CC2592 range extender support was added to OpenThread for this device.

The CC2538, although very light on RAM (32KiB), gets the job done with some constraints. Necessity threw us a curve-ball the other day: we wanted a device that ran off a battery. That meant going into sleep mode periodically. Deep sleep! The CC2538 has a number of operating modes:

  1. running mode (pretty much everything turned on)
  2. light sleep mode (clocks, CPU and power stay on, but we pause a few peripherals)
  3. deep sleep mode — this comes in four flavours
    • PM0: Much like light-sleep, but we’ve got the option to pause clocks to more peripherals
    • PM1: PM0, plus we halt the main system clock (32MHz crystal or 16MHz RC), halting the CPU
    • PM2: PM1 plus we power down the bottom 16KiB of RAM and some other internal peripherals
    • PM3: PM2 plus we turn off the 32kHz crystal used by the sleep timer and watchdog.

We wanted PM2, which meant while we could use the bottom 16KiB of RAM during run-time, the moment we went to sleep, we had to forget about whatever was kept in that bottom 16KiB RAM — since without power it would lose its state anyway.

The challenge

Managing RAM in a device like this is always a challenge. malloc() is generally frowned upon; however, in some cases it’s a necessary evil. OpenThread internally uses mbedTLS, and that relies on having a heap. It can use one implemented by OpenThread, or one provided by you. Our code also uses malloc for some things, notably short-term tasks like downloading a new configuration file or for buffering serial traffic.

The big challenge is that OpenThread itself uses a little over 9KiB of RAM. We have a 4KiB stack. We’ve got under 3KiB left — and that’s bare-bones OpenThread. If you want JOINER support, for joining a mesh network, that pulls in DTLS, which by default will tell OpenThread to statically allocate a 6KiB buffer.

9KiB becomes about 15KiB; plus the stack, that’s 19KiB. This is bigger than 16KiB — the linker gives up.

Using heap memory

There is a work-around that gets things linking: you can build OpenThread with the option OPENTHREAD_CONFIG_HEAP_EXTERNAL_ENABLE — if you set this to 1, OpenThread forgoes its own heap and just uses malloc / free instead, as implemented by your toolchain.

OpenThread builds and links in 16KiB RAM, hooray… but then you try joining, and NoBufs is the response. We’re out of RAM. Moving things to the heap just kicked the can down the road; we still need that 6KiB, but we only have under 3KiB to give it. Not enough.

The problem is that the toolchain we use is built on newlib, and while it implements malloc / free / realloc, it does so with a primitive called _sbrk(). We define a pointer initialised at the top of our .bss, and whenever malloc needs more memory for the heap, it calls _sbrk(N); we grab the value of our pointer, add N to it, and return the old value. Easy.
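That contract can be sketched like so. Here a static array stands in for the RAM above .bss; in the real firmware the limits come from linker symbols, and the exact signature newlib expects varies by port:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the RAM above .bss; on the device this would be
 * bounded by linker symbols, not a static array. */
static char heap_area[1024];
static char *heap_ptr = heap_area;

void *_sbrk(ptrdiff_t incr)
{
    char *prev = heap_ptr;
    if (heap_ptr + incr > heap_area + sizeof(heap_area)) {
        errno = ENOMEM;      /* out of heap: malloc() sees a failure */
        return (void *)-1;
    }
    heap_ptr += incr;
    return prev;             /* old break = start of the newly grown region */
}
```

Each successful call simply hands back the previous break and bumps the pointer; there is no way to give memory back out of order, which is exactly the limitation that bites below.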

Except… we don’t just have one memory pool now, we have two, one of which we cannot use all the time. OpenThread, via mbedTLS, also winds up calling malloc() very early in the initialisation (as early as the otInstanceInitSingle() call that initialises OpenThread). We need that block of RAM to wind up in the upper 16KiB that stays powered on — so we can’t start at address 0x2000:0000 and just skip over .data/.bss when we run out.

malloc() will also get mighty confused if we suddenly hand it an address that’s lower than the one we handed out previously. We can’t go backwards.

I looked at replacing malloc() with a dual-pool-aware version, but newlib is hard-coded in a few places to use its own malloc() and not a third-party one. picolibc might let us swap it out, but getting that integrated looked like a lot of work.

So we’re stuck with newlib's malloc() for better or worse.

The hybrid approach

We can’t control which malloc the newlib functions use, so newlib's malloc with _sbrk() manages the upper heap. We then wrap that malloc with our own creation that we pass to OpenThread: we implement otPlatCAlloc and otPlatFree — which are, essentially, calloc and free wrappers.

The strategy is simple: first try the normal calloc; if that returns NULL, then use our own.
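In code, the wrapper pair might look something like this. The lowheap_* names are hypothetical; a trivial bump allocator stands in for the lower heap so the sketch is self-contained, while the real lower-heap allocator is the subject of the sections below:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Trivial stand-ins for the lower-heap routines (illustrative only). */
static uint8_t lowheap[256];
static size_t lowheap_used;

static bool lowheap_contains(const void *p)
{
    return (const uint8_t *)p >= lowheap &&
           (const uint8_t *)p < lowheap + sizeof(lowheap);
}

static void *lowheap_calloc(size_t num, size_t size)
{
    size_t total = num * size;
    if (lowheap_used + total > sizeof(lowheap))
        return NULL;
    void *p = &lowheap[lowheap_used];
    lowheap_used += total;
    return memset(p, 0, total);   /* calloc semantics: zeroed memory */
}

static void lowheap_free(void *p) { (void)p; /* bump stand-in: no-op */ }

/* The wrappers handed to OpenThread: try newlib's heap first, and
 * fall back to the lower heap only when the upper heap is exhausted. */
void *otPlatCAlloc(size_t num, size_t size)
{
    void *ptr = calloc(num, size);
    if (ptr == NULL)
        ptr = lowheap_calloc(num, size);
    return ptr;
}

void otPlatFree(void *ptr)
{
    if (ptr == NULL)
        return;
    if (lowheap_contains(ptr))
        lowheap_free(ptr);
    else
        free(ptr);
}
```

otPlatFree decides which pool a pointer belongs to purely by its address, which is what makes the two heaps coexist without newlib ever knowing about the second one.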

Re-purposing an existing allocator

The first rule of software engineering: don’t write code you don’t have to. So naturally I went looking for options.

Page upon page of “No man don’t do it!!!”

jemalloc looked promising at first (it is the FreeBSD malloc()), but therein lies a problem: it’s a pretty complicated piece of code aimed at x86 computers with megabytes of RAM at minimum. It uses uint64_ts in a lot of places and seemed like it would have a pretty high overhead on a little CC2538.

I tried avr-libc's malloc — it’s far simpler, and actually is a free-list implementation like newlib's version, but there is a snag. See, AVR microcontrollers are 8-bit beasts; they don’t care about memory alignment. But the Cortex M3 does! avrlibc_malloc did its job and handed back a pointer, but then I wound up in a HARDFAULT condition because mbedTLS tried to access a 32-bit word that was offset by a few bytes.

A simple memory allocator

The approach I took was a crude one. I would allocate memory in fixed-sized “blocks”. I first ran the OpenThread code under a debugger and set a break-point on malloc to see what sizes it was asking for — mostly blocks around the 128 byte mark, sometimes bigger, sometimes smaller. 64-byte blocks would work pretty well, although for initial testing, I went the lazy route and used 8-byte blocks: uint64_ts.

In my .bss, I made an array of uint8_ts; size equal to the number of 8-byte blocks in the lower heap divided by 4. This would be my usage bitmap — each block was allocated two bits, which I accessed using bit-banding: one bit I called used, and that simply reported the block was being used. The second was called chained, and that indicated that the data stored in this block spilled over to the next block.

To malloc some memory, I’d simply look for a string of free blocks big enough. When it came to freeing memory, I simply started at the block referenced, and cleared bits until I got to a block whose chained bit was already cleared. Because I was using 8-byte blocks, everything was guaranteed to be aligned.

8-byte blocks in 16KiB (2048 blocks) wound up with 512 bytes of usage data. As I say, using 64-byte blocks would be better (only 256 blocks, which fits in 64 bytes), but this was a quick test. The other trick would be to use the very first few blocks to store that bitmap (for 64-byte blocks, we only need to reserve the first block).

The scheme is somewhat inspired by the buddy allocator scheme, but simpler.

Bit banding was simple enough; I defined my struct for accessing the bits:

struct lowheap_usage_t {
        uint32_t used;
        uint32_t chained;
};

and in my code, I used a C macro to do the arithmetic:

#define LOWHEAP_USAGE                                                   \
        ((struct lowheap_usage_t*)(((((uint32_t)&lowheap_usage_bytes)   \
                                     - 0x20000000)                      \
                                    * 32)                               \
                                   + 0x22000000))

The magic numbers here are:

  • 0x20000000: the start of SRAM on the CC2538
  • 0x22000000: the start of the SRAM bit-band region
  • 32: the width of each word in the CC2538
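As a worked example of that arithmetic (the bitmap address used here is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* The bit-band mapping: each bit of SRAM gets its own 32-bit word in
 * the alias region, so a byte's offset is multiplied by 32. */
static uint32_t bitband_alias(uint32_t sram_addr)
{
    return ((sram_addr - 0x20000000u) * 32u) + 0x22000000u;
}
```

If lowheap_usage_bytes happened to land at 0x2000:4000, bitband_alias() would return 0x2208:0000; and since each lowheap_usage_t entry is two uint32_ts (8 bytes in the alias region, 2 bits in real SRAM), the usage data for block n sits 8 × n bytes beyond that.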

Then, in my malloc, I could simply call…

struct lowheap_usage_t* usage = LOWHEAP_USAGE;

…and treat usage like an array; where element 0 was the usage data for the very first block down the bottom of SRAM.

To implement a memory allocator, I needed five routines:

  • one that scanned through, and told me where the first free block was after a given block number (returning the block number) — static uint16_t lowheap_first_free(uint16_t block)
  • one that, given the start of a run of free blocks, told me how many blocks following it were free — static uint16_t lowheap_chunk_free_length(uint16_t block, uint16_t required)
  • one that, given the start of a run of chained used blocks, told me how many blocks were chained together — static uint16_t lowheap_chunk_used_length(uint16_t block)
  • one that, given a block number and count, would claim that number of blocks starting at the given starting point — static void lowheap_chunk_claim(uint16_t block, uint16_t length)
  • one that, given a starting block, would clear the used bit for that block, and if chained was set; clear it and repeat the step on the following block (and keep going until all blocks were freed) — static void lowheap_chunk_release(uint16_t block)

From here, implementing calloc was simple:

  1. first, try the newlib calloc and see if that succeeded. Return the pointer we’re given if it’s not NULL.
  2. if we’re still looking for memory, round up the memory requirement to the block size.
  3. initialise our starting block number (start_nr) by calling lowheap_first_free(0) to find the first free block; then, in a loop:
    • find the size of the free block (chunk_len) by calling lowheap_chunk_free_length(start_nr, required_blocks).
    • If the returned size is big enough, break out of the loop.
    • If not big enough, advance start_nr past the too-small free run and the used chunk that follows it: add chunk_len plus the return value of lowheap_chunk_used_length(start_nr + chunk_len).
    • Stop iterating if start_nr is equal to or greater than the total number of blocks in the heap.
  4. If start_nr winds up being past the end of the heap, fail with errno = ENOMEM and return NULL.
  5. Otherwise, we’re safe, call lowheap_chunk_claim(start_nr, required_blocks); to reserve our space, zero out the actual blocks allocated, then return the address of the first block cast to void*.
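Tying those steps together, here is a condensed sketch of the lower-heap side of that calloc. Plain flag arrays stand in for the bit-banded bitmap, the heap is scaled down to 64 blocks, all names are illustrative, and step 1 (trying newlib's calloc first) is left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8u
#define NUM_BLOCKS 64u          /* scaled down from 2048 for illustration */

static uint8_t lowheap[NUM_BLOCKS * BLOCK_SIZE];
static bool used_bit[NUM_BLOCKS];     /* stand-ins for the bit-band bitmap */
static bool chained_bit[NUM_BLOCKS];

/* First free block at or after 'block' (== NUM_BLOCKS if none). */
static uint16_t first_free(uint16_t block)
{
    while (block < NUM_BLOCKS && used_bit[block])
        block++;
    return block;
}

/* Length of the free run starting at 'block', capped at 'required'. */
static uint16_t free_run(uint16_t block, uint16_t required)
{
    uint16_t len = 0;
    while (len < required && block + len < NUM_BLOCKS && !used_bit[block + len])
        len++;
    return len;
}

/* Number of blocks chained together starting at used block 'block'. */
static uint16_t used_run(uint16_t block)
{
    uint16_t len = 0;
    while (block + len < NUM_BLOCKS && used_bit[block + len]) {
        len++;
        if (!chained_bit[block + len - 1])
            break;
    }
    return len;
}

/* Claim 'length' blocks; all but the last get their chained bit set. */
static void claim(uint16_t block, uint16_t length)
{
    for (uint16_t i = 0; i < length; i++) {
        used_bit[block + i] = true;
        chained_bit[block + i] = (i + 1 < length);
    }
}

void *lowheap_calloc(size_t num, size_t size)
{
    /* Step 2: round the request up to whole blocks. */
    size_t bytes = num * size;
    uint16_t required = (uint16_t)((bytes + BLOCK_SIZE - 1) / BLOCK_SIZE);

    /* Step 3: walk the free runs looking for one that is big enough. */
    uint16_t start_nr = first_free(0);
    while (start_nr < NUM_BLOCKS) {
        uint16_t chunk_len = free_run(start_nr, required);
        if (chunk_len >= required)
            break;
        /* Skip this too-small free run and the used chunk after it. */
        start_nr += chunk_len + used_run(start_nr + chunk_len);
    }

    /* Step 4: nothing big enough, give up. */
    if (start_nr >= NUM_BLOCKS) {
        errno = ENOMEM;
        return NULL;
    }

    /* Step 5: claim the blocks, zero them, hand back the address. */
    claim(start_nr, required);
    memset(&lowheap[start_nr * BLOCK_SIZE], 0, (size_t)required * BLOCK_SIZE);
    return &lowheap[start_nr * BLOCK_SIZE];
}
```

With 64-byte blocks the same structure would shrink the bitmap considerably, at the cost of more internal fragmentation per allocation.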

Implementing free was not a challenge either: if the pointer was above our heap, we simply passed it to newlib's free; if it was in our heap space, we did some arithmetic to figure out which block the address fell in, and passed that block number to lowheap_chunk_release().

I won’t publish the code because I didn’t get it working properly in the end, but I figured I’d put the notes here on how I put it together to re-visit in the future. Maybe the thoughts might inspire someone else. 🙂