
Phone patching to Zoom

The Brisbane Area WICEN Group (Inc) has lately been caught up in this whole COVID-19 situation, unable to meet face-to-face for business meetings. Like a lot of groups, we’ve had to turn to doing things online.

Initially, Cisco WebEx was trialled; however, it had significant compatibility issues, most notably under Linux, where it just plain didn’t work. Zoom, however, has proven fairly simple to operate and seems to work, so we’ve been using that for a number of “social” meetings and at least one business meeting so far.

A challenge we have, though, is that one of our members does not have a computer or smart-phone. Mobile telephony is unreliable in his area (Kelvin Grove), so ye olde PSTN is the most reliable service. For him to attend meetings, we need some way of patching that PSTN line into the meeting.

The first step is to get something you can patch to. In my case, that was a soft-phone and a SIP VoIP service. I used Twinkle to provide that link, but you could also use others like baresip, Linphone or anything else of your choosing. The soft-phone connects to your sound card at one end and to a voice service provider at the other; in my case, that’s my Asterisk server, which connects out through Internode NodePhone.

The problem, though, is that while you can certainly make an outbound call whilst in a conference, the person on the phone won’t be able to hear the conference, nor will the conference attendees be able to hear the person on the phone.

Enter JACK

JACK is an audio routing framework for Unix-like operating systems that allows audio to be routed between applications. It is geared towards multimedia production and professional audio, but since there’s a JACK plug-in for the ALSA framework, it is very handy for linking audio between applications that would otherwise be incompatible.
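
To get a feel for what “routing between applications” means in practice, JACK ships with small command-line clients for listing and connecting ports. A quick sketch (the client and port names here are placeholders; jack_lsp will show the real ones on your system):

# List every JACK port and its current connections
jack_lsp -c

# Patch one client's output into another client's input
# (the client:port names here are placeholders)
jack_connect "some_app:output_1" "system:playback_1"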

For this to work, one application has to talk either directly to JACK, or via the ALSA plug-in. Many applications instead support, and will use, an alternative framework called PulseAudio; conference applications like Zoom and Jitsi almost universally rely on it as their sound card interface on Linux.

PulseAudio unfortunately is not able to route audio with the same flexibility, but it can route audio to JACK. In particular, JACK v2 with its jackdbus interface is the path of least resistance: once JACK starts, PulseAudio detects its presence and loads a module that connects PulseAudio as a client of JACK.

A limitation with this is PulseAudio will pre-mix all audio streams it receives from its clients into one single monolithic (stereo) feed before presenting that to JACK. I haven’t figured out a work-around for this, but thankfully for this use case, it doesn’t matter. For our purposes, we have just one PulseAudio application: Zoom (or Jitsi), and so long as we keep it that way, things will work.
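
Normally PulseAudio’s jackdbus detection handles the bridging for you, but if it doesn’t, the bridge modules can be checked and loaded by hand with pactl. A sketch, assuming the PulseAudio JACK modules are installed (jack_out and jack_in are the default names those modules create):

# Check whether the JACK bridge modules are loaded
pactl list short modules | grep -i jack

# Load them manually if need be
pactl load-module module-jack-sink
pactl load-module module-jack-source

# Point new PulseAudio clients at the JACK bridge by default
pactl set-default-sink jack_out
pactl set-default-source jack_in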

Software tools

  • jack2: The audio routing daemon.
  • qjackctl: This is a front-end for controlling JACK. It is optional, but if you’re not familiar with JACK, it’s the path of least resistance. It allows you to configure, start and stop JACK, and to control patch-bay configuration.
  • SIP Client, in my case, Twinkle.
  • ALSA JACK Plug-in, part of alsa-plugins.
  • PulseAudio JACK plug-in, part of PulseAudio.
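
Package names vary between distributions; on a Debian or Ubuntu system, the above roughly corresponds to something like this (adjust to suit your distribution):

sudo apt install jackd2 qjackctl twinkle libasound2-plugins pulseaudio-module-jack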

Setting up the JACK ALSA plug-in

To expose JACK to ALSA applications, you’ll need to configure your ${HOME}/.asoundrc file. If your SIP client happens to support JACK natively, you can skip this step: just set it up to talk to JACK and you’re set.

Otherwise, have a look at guides such as this one from the ArchLinux team.

I have the following in my .asoundrc:

pcm.!default {
        type plug
        slave { pcm "jack" }
}

pcm.jack {
        type jack
        playback_ports {
                0 system:playback_1
                1 system:playback_2
        }
        capture_ports {
                # My device only has a mono capture channel, so both
                # ALSA capture channels map to the same JACK port
                0 system:capture_1
                1 system:capture_1
        }
}

The first block sets my default ALSA device to jack; the second block defines what jack is. You could possibly skip the first block, in which case your SIP client will need to be told to use jack (or maybe plug:jack) as the ALSA audio device for input/output.
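
Once JACK is up and running (see the next section), a quick sanity check is to aim one of the ALSA test tools at the plug-in. This assumes the .asoundrc above is in place:

# Play a short test tone through the JACK plug-in (JACK must already be running)
speaker-test -D plug:jack -c 2 -t wav -l 1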

Configuring qjackctl

At this point, to test this we need a JACK audio server running, so start qjackctl. You’ll see a window like this:

qjackctl in operation

This shows it actually running; most likely, for you this will not be the case. Over on the right you’ll see Setup…; click that, and you’ll get something like this:

Parameters screen

The first tab is the parameters screen. Here, you’ll want to point JACK at the audio device your speakers and microphone are connected to.

The sample rate may be limited by your audio device. In my experience, JACK hates devices that can’t do the same sample rate for input and output.

My audio device is a Logitech G930 wireless USB headset, and it definitely has this limitation: it can play audio right up to 48kHz, but will only do a meagre 16kHz on capture. JACK thus limits me to 16kHz in both directions. If your device can do 48kHz, that’d be better if you intend to use it for tasks other than audio conferencing. (If your device is also wireless, I’d be interested in knowing where you got it!)
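
If you’re not sure what rates your own device supports, ALSA can tell you. The hw:1,0 device name below is a placeholder; use whatever arecord -l and aplay -l report for your hardware:

# List capture devices, then dump what the hardware actually supports
arecord -l
arecord -D hw:1,0 --dump-hw-params -d 1 /dev/null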

JACK literature seems to recommend 3 periods/buffer for USB devices. The rest is a matter of experiment. 1024 samples/period seems to work fine on my hardware most of the time. Your mileage may vary. Good setups may get away with less, which will decrease latency (mine is 192ms… good enough for me).

The other tab has more settings:

Advanced settings

The things I’ve changed here are:

  • Force 16-bit: since my audio device cannot do anything but 16-bit linear PCM, I force 16-bit mode (rather than the default of 32-bit).
  • Channels I/O: output is stereo but input is mono, so I set 1 channel in, 2 channels out.

Once all is set, click Apply, then OK.
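
For the record, qjackctl is ultimately driving jackdbus, so the same settings can be pushed from a shell with jack_control. This is only a sketch; hw:Headset is a placeholder for your own device, and the values mirror the ones discussed above:

jack_control ds alsa                  # select the ALSA back-end
jack_control dps device hw:Headset    # placeholder: your audio device
jack_control dps rate 16000           # sample rate
jack_control dps period 1024          # frames per period
jack_control dps nperiods 3           # periods per buffer (3 recommended for USB)
jack_control dps shorts True          # force 16-bit samples
jack_control dps inchannels 1         # mono capture
jack_control dps outchannels 2        # stereo playback
jack_control start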

Now, on qjackctl itself, click the “Start” button. It should report that it has started; you don’t need to click any play buttons to make it work from here. You may notice that PulseAudio detects the JACK server and connects to it. If you click “Graph”, you’ll see something like this:

qjackctl’s Graph window

This is the thing you’ll use in qjackctl the most. Here, you can see the “system” boxes represent your audio device, and “PulseAudio JACK Sink”/”PulseAudio JACK Source” represent everything that’s connected to PulseAudio.

You should now be able to play sound through PulseAudio, and direct applications there to use the JACK virtual sound card. pavucontrol (normally shipped with PulseAudio) may be handy for moving things onto the JACK virtual device.
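
pavucontrol is the easy way, but the same move can be done from a shell if you prefer. The stream indices below are whatever pactl reports on your system:

# Find the application's playback stream, then move it onto the JACK bridge
pactl list short sink-inputs
pactl move-sink-input 42 jack_out

# Likewise for its recording stream
pactl list short source-outputs
pactl move-source-output 17 jack_in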

Configuring your telephony client

I’ll use Twinkle as the example here. In the preferences, look for a section called Audio. You should see this:

Twinkle audio settings

Here, I’ve set my ringing device to pulse so that the ring tone goes via PulseAudio. This allows me to direct that audio to my laptop’s on-board sound card so I can hear the phone ring without the headset on.

Since jack was made my default device, I can leave the others as “Default Device”. Otherwise, you’d specify jack or plug:jack as the audio device; this should be set for both the Speaker and Microphone settings.

Click OK once you’re done.

Configuring Zoom

I’ll use Zoom here, but the process is similar for Jitsi. In the settings, look for the Audio section.

Zoom audio settings

Set both Speaker and Microphone to JACK (sink and source respectively). Use the “Test Speaker” function to ensure it’s all working.

The patch up

Now, it doesn’t matter whether you call first, then join the meeting, or vice versa. You can even have the PSTN caller call you. The thing is, you want to establish a link to both your PSTN caller and your conference.

The assumption is that you now have a session active in both programs: you’re hearing both the PSTN caller and the conference in your headset, and when you speak, both parties hear you. To let them hear each other, do this:

Go to qjackctl’s Graph window. You’ll see PulseAudio is there, but you’ll also see an instance of the ALSA plug-in connected to JACK; that’s your telephony client. Both will be connected to the system boxes. You need to draw new links between those two new boxes and the PulseAudio boxes, like this:

qjackctl patching Twinkle to Zoom

Here, Zoom is represented by the PulseAudio boxes (since it is using PulseAudio to talk to JACK), and Twinkle is represented by the boxes named alsa-jack… (tip: the number is the PID of the ALSA application if you’re not sure).

Once you draw the connections, the parties should be able to hear each other. You’ll need to monitor this from time to time: if either PulseAudio or the phone client disconnects from JACK momentarily, the connections will need to be re-made. Twinkle will do this if you do a three-way conference and then one person hangs up.
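
If you get sick of re-drawing the links, the same patching can be scripted with the JACK command-line clients. The client and port names below are illustrative only (the alsa-jack client name includes the PID, so it changes every run); check jack_lsp -c for the real names on your system:

# Phone caller -> conference: Twinkle's playback ports into the PulseAudio JACK Source
jack_connect "alsa-jack.jackP.12345.0:out_000" "PulseAudio JACK Source:front-left"

# Conference -> phone caller: PulseAudio JACK Sink into Twinkle's capture ports
jack_connect "PulseAudio JACK Sink:front-left" "alsa-jack.jackC.12345.0:in_000"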

Anyway, that’s the basics covered. There’s more that can be done: recording the audio, for example, or piping in audio from something else (e.g. a media player) is just a case of directing it at JACK, either directly or via the ALSA plug-in, and drawing connections where you need them.
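
For instance, assuming the JACK example clients are installed, something like jack_rec can capture a couple of ports straight to a WAV file (again, the port names are illustrative; pick them off jack_lsp):

# Record 60 seconds of the conference audio plus the phone caller to a stereo WAV
jack_rec -f meeting.wav -d 60 "PulseAudio JACK Sink:front-left" "alsa-jack.jackP.12345.0:out_000"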