Oct 222018
 

Today it seems, the IT gremlins have been out to get me.  At my work I have a desktop computer (personal hardware) consisting of a Rysen 7 1700, 16GB RAM, a 240GB Intel M.2 SATA SSD (540 series) and a 4TB Western Digital HDD.

The machine has been, pretty reliable, not rock-solid, in particular, compiling gcc sometimes segfaulted for reasons unknown (the RAM checks out okay according to memtest86), but for what I was doing, it mostly ran fine.  I put up with the minor niggles with the view of solving those another day.  Today though, I come in and find X has crashed.

Okay, no big deal, re-start the display manager, except that crashed too.

Hmm, okay, log in under my regular user account and try startx:  No dice, there’s no room on /.

Ahh, that might explain a few things, we clean up some log files, truncate a 500MB file, manage to free up 50GB (!).

The machine dual-boots two OSes: Debian 9 and Gentoo.  It’s been running the latter for about 12 months now, I used Debian 9 to get things rolling so I could use the machine at work (did try Ubuntu 16.04, but it didn’t like my machine), and later, used that to get Gentoo running before switching over.  So there was a 40GB partition on the SSD that had a year-old install of Debian that wasn’t being used.  I figured I’d ditch it, and re-locate my Gentoo partition to occupy that space.

So I pull out an Ubuntu 18.04 disc, boot that up, and get gparted going.  It’s happily copying, until WHAM, I was hit with an I/O error:

Failed re-location of partition (click to enlarge)

Clicking any of the three buttons resulted in the same message.  Brilliant.  I had just copied over the first 15GB of the partition, so the Debian install would be hosed (I was deleting it anyway), but my Gentoo root partition should still be there intact at its old location.  Of course the partition table was updated, so no rolling back there.  At this point, I couldn’t do anything with the SSD, it had completely stalled, and I just had to cut my losses and kill gparted.

I managed to make some room on the 4TB drive shuffling some partitions around so I could install Ubuntu 18.04 there.  My /home partition was btrfs on the 4TB drive (first partition), the rest of that drive was LVM.  I just shrank my /home down by 40GB and slipped it in there.  The boot-loader didn’t install (no EFI partition), but who cares, I know enough of grub to boot from the DVD and bootstrap the machine that way.  At first it wouldn’t boot because in their wisdom, they created the root partition with a @ subvolume.  I worked around that by making the @ subvolume the default.

Then there was momentary panic when the /home partition I had specified lacked my usual files.  Turned out, they had created a @home subvolume on my existing /home partition.  Why? Who knows?  Debian/Ubuntu seem to do strange things with btrfs which do nothing but complicate matters and I do not understand the reasoning.  Editing /etc/fstab to remove the subvolume argument for /home and re-booting fixed that.

I set up a LVM volume that would receive a DD dump of the mangled partition to see what could be saved.  GNU’s ddrescue managed to recover most of the raw partition, and so now I just had to find where the start was.  If I had the output of fdisk -l before I started, I’d be right, but I didn’t have that foresight.  (Maybe if I had just formatted a LVM volume and DD’d the root fs before involving gparted?  Never mind!)

I figured there’d be some kind of magic bytes I could “grep” for.  Something that would tell me “BTRFS was here”.  Sure enough, the information is stashed in the superblock.  At 0x00010040 from the start of the partition, I should see the magic bytes 5f 42 47 52 66 53 5f 4d.  I just needed to grep for these.  To speed things up I made an educated guess on the start-location.  The screenshot says the old partition was about 37.25GB in size, so that was a hint to maybe try skipping that bit and see what could be seen.

Sure enough, I found what looked to be the superblock:

root@vk4msl-ws:~# dd if=/dev/mapper/scratch-rootbackup skip=38100 count=200 bs=1M | hexdump -C | grep '5f 42 48 52 66 53 5f 4d'
02e10040  5f 42 48 52 66 53 5f 4d  9d 30 0d 02 00 00 00 00  |_BHRfS_M.0......|
06e00040  5f 42 48 52 66 53 5f 4d  9d 30 0d 02 00 00 00 00  |_BHRfS_M.0......|
200+0 records in
200+0 records out

Some other probes seem to confirm this, my quarry seemed to start 38146MB into the now-merged partition.  I start copying that to a new LVM volume with the hope of being able to mount it:

root@vk4msl-ws:~# dd if=/dev/mapper/scratch-rootbackup of=/dev/mapper/scratch-gentoo--root bs=1M skip=38146

Whilst waiting for this to complete, I double-checked my findings, by inspecting the other fields. From the screenshot, I know my filesystem UUID was 6513-682e-7182-4474-89e6-c0d1c71866ad. Looking at the superblock, sure enough I see that listed:

root@vk4msl-ws:~# dd if=/dev/scratch/gentoo-root bs=$(( 0x10000 )) skip=1 count=1 | hexdump -C
1+0 records in
1+0 records out
00000000  5f f9 98 90 00 00 00 00  00 00 00 00 00 00 00 00  |_...............|
65536 bytes (66 kB, 64 KiB) copied, 0.000116268 s, 564 MB/s
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  65 13 68 2e 71 82 44 74  89 e6 c0 d1 c7 19 66 ad  |e.h.q.Dt......f.|
00000030  00 00 01 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000040  5f 42 48 52 66 53 5f 4d  9d 30 0d 02 00 00 00 00  |_BHRfS_M.0......|
00000050  00 00 32 da 32 00 00 00  00 00 02 00 00 00 00 00  |..2.2...........|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

Looks promising! After an agonising wait, the dd finishes. I can check the filesystem:

root@vk4msl-ws:~# btrfsck /dev/scratch/gentoo-root 
Checking filesystem on /dev/scratch/gentoo-root
UUID: 6513682e-7182-4474-89e6-c0d1c71966ad
checking extents
checking free space cache
block group 111690121216 has wrong amount of free space
failed to load free space cache for block group 111690121216
block group 161082245120 has wrong amount of free space
failed to load free space cache for block group 161082245120
checking fs roots
checking csums
checking root refs
found 107544387643 bytes used, no error found
total csum bytes: 99132872
total tree bytes: 6008504320
total fs tree bytes: 5592694784
total extent tree bytes: 271663104
btree space waste bytes: 1142962475
file data blocks allocated: 195274670080
 referenced 162067775488

Okay, it complained that the free space was wrong (which I’ll blame on gparted prematurely growing the partition), but the data is there!  This is confirmed by mounting the volume and doing a ls:

root@vk4msl-ws:~# mount /dev/scratch/gentoo-root /mnt/
root@vk4msl-ws:~# ls /mnt/ -l
total 4
drwxr-xr-x 1 root root 1020 Oct  7 14:13 bin
drwxr-xr-x 1 root root   18 Jul 21  2017 boot
drwxr-xr-x 1 root root   16 May 28 10:29 dbus-1
drwxr-xr-x 1 root root 1686 May 31  2017 dev
drwxr-xr-x 1 root root 3620 Oct 19 18:53 etc
drwxr-xr-x 1 root root    0 Jul 14  2017 home
lrwxrwxrwx 1 root root    5 Sep 17 09:20 lib -> lib64
drwxr-xr-x 1 root root 1156 Oct  7 13:59 lib32
drwxr-xr-x 1 root root 4926 Oct 13 05:13 lib64
drwxr-xr-x 1 root root   70 Oct 19 11:52 media
drwxr-xr-x 1 root root   28 Apr 23 13:18 mnt
drwxr-xr-x 1 root root  336 Oct  9 07:27 opt
drwxr-xr-x 1 root root    0 May 31  2017 proc
drwx------ 1 root root  390 Oct 22 06:07 root
drwxr-xr-x 1 root root   10 Jul  6  2017 run
drwxr-xr-x 1 root root 4170 Oct  9 07:57 sbin
drwxr-xr-x 1 root root   10 May 31  2017 sys
drwxrwxrwt 1 root root 6140 Oct 22 06:07 tmp
drwxr-xr-x 1 root root  304 Oct 19 18:20 usr
drwxr-xr-x 1 root root  142 May 17 12:36 var
root@vk4msl-ws:~# cat /mnt/etc/gentoo-release 
Gentoo Base System release 2.4.1

Yes, I’ll be backing this up properly RIGHT NOW. But, my data is back, and I’ll be storing this little data recovery technique for next time.

The real lesson here is:

  1. KEEP RELIABLE BACKUPS! You never know when something will fail.
  2. Catch the copy process before it starts overwriting your source data! If there’s no overlap between the old and new locations, you’re fine, but if there is and it starts overwriting the start of your original volume, it’s all over red rover! You might be lucky with a superblock back-up, but don’t bet on it!
  3. Make note of the filesystem type and its approximate location. The fact that I knew roughly where to look, and what sort of filesystem I was looking for meant I could look for magic bytes that say “I’m a BTRFS filesystem”. The magic bytes for EXT4, XFS, etc will differ, but the same concepts are there, you just have to look up the documentation on how your particular filesystem structures its data.
Aug 302018
 

So, I’m happy enough with the driver now that I’ll collapse down the commits and throw it up onto the Github repository.  I might take another look at kernel 4.18, but for now, you’ll find them on the ts7670-4.14.67 branch.

Two things I observe about this voltage monitor:

  1. The voltage output is not what you’d call, accurate.  I think it’s only a 10-bit ADC, which is still plenty good enough for this application, but the reading I think is “high” by about 50mV.
  2. There’s significant noise on the reading, with noticeable quantisation steps.

Owing to these, and to thwart the possibility of using this data in side-channel attacks using power analysis, I’ve put a 40-sample moving-average filter on the “public” data.

Never the less, it’s a handy party trick, and not one I expected these devices to be able to do.  My workplace manages a big fleet of these single-board computers in the residential towers at Barangaroo where they spend all day polling Modbus and M-Bus meters.  In the event we’re at all suspicious about DC power supplies though, it’s a simple matter to load this kernel tree (they already run U-Boot) and configure collectd (which is also installed).

I also tried briefly switching off the mains power to see that I was indeed reading the battery voltage and not just a random number that looked like the voltage.  That yielded an interesting effect:

You can see where I switched the mains supply off, and back on again.  From about 8:19PM the battery voltage predictably fell until about 8:28PM where it was at around 12.6V.

Then it did something strange, it rose about 100mV before settling at 12.7V.  I suspect if I kept it off all night it’d steadily decrease: the sun has long set.  I’ve turned the mains charger back on now, as you can see by the step-rise shortly after 8:44PM.

The bands on the above chart are the alert zones.  I’ll get an email if the battery voltage strays outside of that safe region of 12-14.6V.  Below 12V, and I run the risk of deep-cycling the batteries.  Above 14.6V, and I’ll cook them!

The IPMI BMCs on the nodes already sent me angry emails when the battery got low, so in that sense, Grafana duplicates that, but does so with pretty charts.  The BMCs don’t see when the battery gets too high though, for the simple matter that what they see is regulated by LDOs.

Aug 302018
 

I’ve succeeded in getting a working battery monitor kernel module. This is basically taking the application note by Technologic Systems and spinning that into a power supply class driver that reports the voltage via sysfs.

As it happens, the battery module in collectd does not see this as a “battery”, something I’ll look at later. For now the exec plug-in works well enough. This feeds through eventually to an InfluxDB database with Grafana sitting on top.

https://netmon.longlandclan.id.au/d/IyZP-V2mk/battery-voltage?orgId=1

Aug 282018
 

So, I successfully last night, parted the core bits out of ts_wdt.c and make ts-mcu-core.c.  This is a multi-function device, and serves to provide a shared channel for the two drivers that’ll use it.

Tonight, I took a stab at writing the PSU part of it.  Suffice to say, I’ve got work to do:

[  158.712960] Unable to handle kernel NULL pointer dereference at virtual address 00000005
[  158.721328] pgd = c3854000
[  158.724089] [00000005] *pgd=4384f831, *pte=00000000, *ppte=00000000
[  158.730629] Internal error: Oops: 1 [#3] ARM
[  158.734947] Modules linked in: 8021q garp mrp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables flexcan can_dev
[  158.755812] CPU: 0 PID: 2059 Comm: cat Tainted: G      D         4.14.67-vrt-ts7670+ #3
[  158.763840] Hardware name: Freescale MXS (Device Tree)
[  158.769008] task: c68f3a20 task.stack: c3846000
[  158.773598] PC is at ts_mcu_transfer+0x1c/0x48
[  158.778073] LR is at 0x3
[  158.780630] pc : []    lr : [<00000003>]    psr: 60000013
[  158.786918] sp : c3847e44  ip : 00000000  fp : 014000c0
[  158.792165] r10: c5035000  r9 : c5305900  r8 : c777b428
[  158.797412] r7 : c0a7fa80  r6 : c777b400  r5 : c5035000  r4 : c3847e6c
[  158.803961] r3 : c3847e58  r2 : 00000001  r1 : c3847e4c  r0 : c07b8c68
[  158.810512] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  158.817671] Control: 0005317f  Table: 43854000  DAC: 00000051
[  158.823440] Process cat (pid: 2059, stack limit = 0xc3846190)
[  158.829212] Stack: (0xc3847e44 to 0xc3848000)
[  158.833611] 7e40:          c05bd150 00000001 00010000 00000004 c3847e48 00000150 c5035000
[  158.841833] 7e60: c777b420 c05bcaa4 c4f70c60 c777e070 c0a7fa80 c05bca20 00000fff c07ad4b0
[  158.850051] 7e80: c777b428 c04e46dc c4f52980 00001000 00000fff c01b1994 c4f52980 c4f70c60
[  158.858268] 7ea0: c3847ec8 ffffe000 00000000 c3847f88 00000001 c0164eac c3847fb0 c4f529b0
[  158.866487] 7ec0: 00020000 b6e3d000 00000000 00000000 c4f73f70 00000800 00000000 c01b0f60
[  158.874703] 7ee0: 00020000 c4f70c60 ffffe000 c3847f88 00000000 00000000 00000000 c013eb84
[  158.882918] 7f00: 000291ac 00000000 00000000 c0009344 00000077 b6e3c000 00000022 00000022
[  158.891135] 7f20: c686bdc0 c0117838 000b6e3c c3847f80 00022000 c686be14 b6e3c000 00000000
[  158.899354] 7f40: 00000000 00022000 b6e3d000 00020000 c4f70c60 ffffe000 c3847f88 c013ed0c
[  158.907571] 7f60: 00000022 00000000 000b6e3c c4f70c60 c4f70c60 b6e3d000 00020000 c000a9e4
[  158.915786] 7f80: c3846000 c013f2d8 00000000 00000000 00000000 00000000 00000000 00000000
[  158.924002] 7fa0: 00000003 c000a820 00000000 00000000 00000003 b6e3d000 00020000 00000000
[  158.932217] 7fc0: 00000000 00000000 00000000 00000003 00020000 00000000 00000001 00000000
[  158.940434] 7fe0: be8a62c0 be8a62ac b6eb77c4 b6eb6b9c 60000010 00000003 00000000 00000000
[  158.948691] [] (ts_mcu_transfer) from [] (ts_psu_get_prop+0x38/0xb0)
[  158.956847] [] (ts_psu_get_prop) from [] (power_supply_show_property+0x84/0x220)
[  158.966036] [] (power_supply_show_property) from [] (dev_attr_show+0x1c/0x48)
[  158.974974] [] (dev_attr_show) from [] (sysfs_kf_seq_show+0x84/0xf0)
[  158.983129] [] (sysfs_kf_seq_show) from [] (seq_read+0xcc/0x4f4)
[  158.990930] [] (seq_read) from [] (__vfs_read+0x1c/0x11c)
[  158.998117] [] (__vfs_read) from [] (vfs_read+0x88/0x158)
[  159.005304] [] (vfs_read) from [] (SyS_read+0x3c/0x90)
[  159.012232] [] (SyS_read) from [] (ret_fast_syscall+0x0/0x28)
[  159.019766] Code: e52de004 e281300c e590e004 e25cc001 (e1dee0b2) 
[  159.026278] ---[ end trace 2807dc313991fd87 ]---

The good news is the machine didn’t crash.c

Aug 262018
 

So, I had a brief look after getting kernel 4.18.5 booting… sure enough the problem was I had forgotten the watchdog, although I did see btrfs trigger a deadlock warning, so I may not be out of the woods yet.  I’ve posted the relevant kernel output to the linux-btrfs list.

Anyway, as it happens, that watchdog driver looks like it’ll need some re-factoring as a multi-function device.  At the moment, ts-wdt.c claims it based on this binding.

If I try to add a second driver, they’ll clash, and I expect the same if I try to access it via userspace.  So the sensible thing to do here, is to add a ts-companion.c MFD driver here, then re-factor ts-wdt.c to use it.  From there, I can write a ts-psu.c module which will go right here.

I think I’ll definitely be digging into those older sources to remind myself how that all worked.

Aug 252018
 

So, after some argument, and a bit of sitting on a concrete floor with the netbook, I managed to get Gentoo loaded onto the TS-7670.  Right now it’s running off the MicroSD card, I’ll get things right, then shift it across to eMMC.

ts7670 ~ # emerge --info
Portage 2.3.40 (python 3.5.5-final-0, default/linux/musl/arm/armv7a, gcc-6.4.0, musl-1.1.19, 4.14.15-vrt-ts7670-00031-g1a006273f907-dirty armv5tejl)
=================================================================
System uname: Linux-4.14.15-vrt-ts7670-00031-g1a006273f907-dirty-armv5tejl-ARM926EJ-S_rev_5_-v5l-with-gentoo-2.4.1
KiB Mem:      111532 total,     13136 free
KiB Swap:    4194300 total,   4191228 free
Timestamp of repository gentoo: Fri, 17 Aug 2018 16:45:01 +0000
Head commit of repository gentoo: 563622899f514c21f5b7808cb50f6e88dbd7d7de
sh bash 4.4_p12
ld GNU ld (Gentoo 2.30 p2) 2.30.0
app-shells/bash:          4.4_p12::gentoo
dev-lang/perl:            5.24.3-r1::gentoo
dev-lang/python:          2.7.14-r1::gentoo, 3.5.5::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.11::gentoo
sys-apps/sandbox:         2.13::musl
sys-devel/autoconf:       2.69-r4::gentoo
sys-devel/automake:       1.15.1-r2::gentoo
sys-devel/binutils:       2.30-r2::gentoo
sys-devel/gcc:            6.4.0-r1::musl
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.13::musl (virtual/os-headers)
sys-libs/musl:            1.1.19::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://virtatomos.longlandclan.id.au/gentoo-portage
    priority: -1000
    sync-rsync-verify-jobs: 1
    sync-rsync-extra-opts: 
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24

ACCEPT_KEYWORDS="arm"
ACCEPT_LICENSE="* -@EULA"
CBUILD="arm-unknown-linux-musleabi"
CFLAGS="-Os -pipe -march=armv5te -mtune=arm926ej-s -mfloat-abi=soft"
CHOST="arm-unknown-linux-musleabi"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-Os -pipe -march=armv5te -mtune=arm926ej-s -mfloat-abi=soft"
DISTDIR="/home/portage/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=hard"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=hard"
GENTOO_MIRRORS=" http://virtatomos.longlandclan.id.au/portage http://mirror.internode.on.net/pub/gentoo http://ftp.swin.edu.au/gentoo http://mirror.aarnet.edu.au/pub/gentoo"
INSTALL_MASK="charset.alias"
LANG="en_AU.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="arm bindist cli crypt cxx dri fortran iconv ipv6 modules ncurses nls nptl openmp pam pcre readline seccomp ssl tcpd unicode xattr zlib" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon plan sheets stage words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="musl" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6 php7-0" POSTGRES_TARGETS="postgres9_5 postgres10" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby23" USERLAND="GNU" VIDEO_CARDS="dummy fbdev v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, LC_ALL, LINGUAS, MAKEOPTS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

I still have to update the kernel.  I actually did get kernel 4.18 to boot, but I forgot to add in support for the watchdog, so U-Boot tickled it, then the watchdog got hungry and kicked the reset half way through the boot sequence.

Rolling back to my older 4.14 kernel works.  I’ll try again with 4.18.5 in a moment.  Failing that, I have also brought the 4.14 patches up to 4.14.69 which is the latest LTS release of the kernel.

I’ve started looking at the power supply sysfs device class, with a view to exposing the supply voltage via sysfs.  The thinking here is that collectd supports reading this via the “battery” module (and realistically, it is a battery that is being measured: two 105Ah AGMs).

Worst case is I do something a little proprietary and deal with it in user space.  I’ll have to dig up the Linux kernel tree I did for Jacques Electronics all those years ago, as that had some examples of interfacing sysfs to a Cypress PSOC device that was acting as an I²C slave.  Rather than using an off-the-shelf solution, they programmed up a MCU that did power management, touchscreen sensing, keypad sensing, RGB LED control and others, all in one chip.  (Fun to try and interface that to the Linux kernel.)

Technologic Systems appear to have done something similar.  The device ID 0x78 implies a 10-bit device, but I think they’re just squatting on that 7-bit address.  They hail 0x78 then read out 4 bytes, which the last two bytes are the supply voltage ADC readings.  They do their own byte swapping before scaling the value to get mV.

Aug 222018
 

It’s taken several months and had a few false starts, but at long last I have some stage tarballs for Gentoo Linux MUSL for ARMv5 processors.  I’m not the only one wanting such a port, even looking for my earlier thread on the matter, I stumbled on this post.  (Google translate is hopeless with Russian, but I can get the gist of what’s being said.)

This was natively built on the TS-7670 using an external hard drive connected over USB 2.0 as both swap and the chroot.  It took two passes to clean everything up and get something that’s in a “release-able” state.

I think my next steps now will be:

  • Build an updated kernel … I might see if I can expose that I²C register via a sysfs file or something that collectd can pick up whilst I’m at it.  I have the kernel sources and bootloader sources.
  • Prepare the 32GB MicroSD card I bought a few weeks back with the needed partitions and load Gentoo onto that.
  • Install the MicroSD card and boot off it.
  • Back up the eMMC
  • Re-format the eMMC and copy the MicroSD card to it.

It’s supposed to be wet this weekend, so it sounds like a good project for indoors.

Aug 212018
 

I have a bad habit where it comes to updating systems, I tend to do it less frequently than I should, and that can sometimes snowball like it has for my mail server.  Even if it’s a fresh install, sometimes there’s a large number of packages that need installing.

Now Portage does report where it’s up to, but often that has long scrolled past the buffer on your terminal.  You can look at /var/log/emerge.log for this information, but sometimes it’s nice to just see a percentage progress and a pseudo graphical representation.

With this in mind, I cooked up a little script which just tails /var/log/emerge.log and displays a progress bar along with the last message reported. The script is quite short:

#!/bin/bash

shopt -s checkwinsize

stdbuf -o L tail -n 0 -F /var/log/emerge.log | while read line; do
	changed=0
	eval $( echo ${line} | \
		sed -ne '/[0-9]\+ of [0-9]\+/ { s:^.*(\([0-9]\+\) of \([0-9]\+\)).*$:done=\1 total=\2 changed=1:; p; }' )

	if [ "${changed}" = 1 ]; then
		case "${line}" in
			*"::: completed emerge"*)
				;;
			*)
				done=$(( ${done} - 1 ))
				;;
		esac

		percent=$(( ( ${done}*100 ) / ${total} ))
		width=$(( ${COLUMNS:-80} - 8 ))
		progress=$(( ( ${done}*${width} ) / ${total} ))
		remain=$(( ${width} - ${progress} ))

		progressbar="$( for n in $( seq 1 ${progress} ); do echo -n '#'; done )"
		remainbar="$( for n in $( seq 1 ${remain} ); do echo -n ':'; done )"

		printf '\033[2A\033[2K%s\n\033[2K\033[1G[\033[1m%s\033[0m%s] \033[1m%3d%%\033[0m\n' \
			"${line:0:${COLUMNS:-80}}" "$progressbar" "$remainbar" "$percent"
	else
		printf '\033[2A\033[2K%s\n\n' "${line:0:${COLUMNS:-80}}"
	fi

	if echo "${line}" | grep -q '*** terminating.'; then
		exit
	fi
done

What’s it look like?

It works well with GNU Screen as seen above.

Aug 192018
 

So, I was just updating the project details for this project, and I happened to see this blog post about reading the DC voltage input on the TS-7670v2.

I haven’t yet gotten around to finishing the power meters that I was building which would otherwise be reading these values directly, but they were basically going to connect via Modbus to the TS-7670v2 anyway.  One of its roles, aside from routing between the physical management network (IPMI and switch console access), was to monitor the battery.

I will have to explore this.  Collectd doesn’t have a general-purpose I²C module, but it does have one for barometer modules, so with a bit of work, I could make one to measure the voltage input which would tell me what the battery is doing.

May 192018
 

Recently, a new project sprang up on the Hackaday.io site; it was for the KiteBoard, an open-source cellular development platform.  In a nutshell, this is a single-board-computer that embeds a full mobile system-on-chip and runs the Android operating system.  The project is seeking crowd funding for the second version of this platform.

With it, you can build smartphones (of course), tablets, tele-presence robots, or really, any project which can benefit from a beefy CPU with a built-in cellular modem.  It comes as a kit, which you then assemble yourself.  The level of difficulty in assembly is no greater than that of assembling a desktop PC: the circuit boards are pre-populated, you just need to connect them together.  In this version, some soldering of pushbuttons and wires is needed: all through-hole components.  No reflow ovens or solder paste is necessary here, an 8-year-old could do it.

The break-out board for the CPU card features in addition to connections for all the usual cellular phone signals (e.g. earpiece, microphone, button inputs) a GPIO header that follows the de-facto standard “Raspberry Pi” interface, allowing many Raspberry Pi “hats” to plug directly into this board.

That lends itself greatly to expandability.  Want a eInk or OLED notification display on the back?  A scrolling LED display?  A piano?  A games console?  Knock yourself out!  You, are the designer, you decide.  There are lots of options.

I for one, would consider an amateur radio transceiver, an external antenna socket and a beefier battery.  Presently, I get around with the ZTE T83 (“Telstra Dave”), which works okay, but as it runs an old version of Android (4.1), running newer applications on it is a problem.  I believe it could run something newer, but ZTE believe that their job was finished in 2013 when the first one rolled off the production line.

The box did not include a copy of the kernel sources or any link to where that could be obtained.  (GNU GPL v2 section 2b?  What’s that?)

The successor, the T84 is a little better, in fact it has pretty much the same hardware that’s in Kite, but it struggles in rural areas.  On a recent trip into the Snowy Mountains, my phone would be working fine, when my father’s T84 would report “no service available”.  Clearly, someone at Telstra/ZTE screwed up the firmware on it, and so it fails to switch networks correctly.  Without the sources, we are unable to fix that.  Even something as simple as replacing a battery is neigh on impossible, they’re built like bombs: not designed to be taken apart.

I have no desire to spend money on a company that puts out poorly supported rubbish running pirated operating system kernels.  The story is similar elsewhere, and most devices while better in specs and operating system, lack the external antenna connection that I desire in a phone.

Kite represents a breath of fresh air in that regard.  It is to smart phones, what the Raspberry Pi is to single board computers in general.  It’s not only designed to be taken apart, it’s shipped to you as parts.  Apparently with Kite v2, there’ll be schematics available, so you’ll be able to look-up the datasheets of respective components and be able to make informed decisions about part substitutions.  All antenna connections are socketed, so you can substitute at will.

While the OS isn’t going to be as open as one might like (mobile chipset manufacturers like their black boxes), it’s a BIG step in the right direction.  There’s more scope for supporting this platform long-term, than contemporary ones.

As far as actually using Kite, Shree Kumar was generous enough to organise the loan of a Kite for me to test with the Australian networks.  The phone takes up to two micro-SIMs (about 15mm×12mm); one on the daughter card (this is SIM 1) and one on the CPU card (SIM 2).

For the sake of testing, I figured I’d try it out with the two major networks, Telstra and Optus.  As it happens, my Telstra SIM is too big (they call it a “full-size” SIM now; I remember full-size SIMs being credit-card sized), so rather than chopping up my existing SIM or getting it transferred, I bought and activated a prepaid service.  I also bought a SIM for Optus.  I bought $10 credit for each.

As it happens, the Optus one came with data, the Telstra did not.  No big deal in this case.  The phone does have a limitation in that it will talk to one 3G/4G network and one GSM (2G) network at a time.  Given both networks I chose have abandoned 2G, that pretty much means the dual-SIM functionality on this model is severely hobbled.  That said, either SIM can operate in 3G mode, and so it’s simple enough to switch one SIM into 2G mode then activate the other in 3G/4G mode.  So far, the Kite has spent most of its time on Optus.

Evidently Vodaphone still have a 2G network… at least the Kite does see one 2G cell operated by them.  Long term, this is a problem that all dual-SIM phone chipset makers will have to deal with, a future Kite may well be able to do 3G simultaneously on both SIMs, but for me, this is not a show-stopper.

I’ve put together this review of the Kite.  It’s rare for me to be in front of a camera instead of behind it, and yes, the editing is very rough.  If there is time (there won’t be this weekend) I hope to take the phone out to a rural area and try it out with the more distant networks, but so far it seems happy enough to switch to 3G when I get home, and use 4G when I’m at work, so this I see as a promising sign.

The KickStarter is lagging behind quite a way in the funding goal, but alternate options are being considered for getting this project off-the-ground.  Here’s hoping that the project does get up, and that we get to see Kite v2 being developed and made for real, as I think the mobile phone industry really does need a viable open competitor.