(These notes are being posted in two parts to make the length more manageable; part 1 is here.)
Continuing from where we left off, about topics discussed at the PipeWire hackfest in Nice…
DSP features
We discussed a number of features related to digital signal processing blocks which are typically realised on specialised hardware (often a DSP core that can directly interface with physical audio inputs and outputs on your laptop/phone/…).
There is currently no standard way for the firmware running on these DSPs to signal what features can be realised directly on the DSP. We would also want such features, if exposed from PipeWire, to be realisable on the CPU.
Now we do have a way to hide away signal processing in a specific node, which is the filter-graph parameter on the audioconvert node that wraps all audio nodes.
We could extend this mechanism to allow the internal node (say the ALSA node implementation) to expose what filtering it can perform “in hardware” (i.e. in the software running on the DSP). This would allow audioconvert to delegate some or all processing to the internal node, with fallbacks available on the CPU.
We would need a number of pieces to do this, including:
- Some standard definition of filters and associated parameters, so different implementations could have a standard “API” to express any given filter.
- The DSP block would need to expose what features it has and how they might be used. We could imagine extending the ALSA UCM configuration to do that.
- The audioconvert node would need to have a way to push down filter-graph params to the internal node, and negotiate what work it is doing vs. what is being delegated.
This is a non-trivial effort, but gives us some sketch of what might be possible.
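To make the negotiation in the last item a little more concrete, here is a minimal sketch of how a filter chain might be split between the internal node and the CPU. Everything in it is hypothetical: the Filter type, the split_filter_graph function and the filter names are made up for illustration and are not PipeWire API. It also assumes a playback path where the hardware runs after the CPU stages, so only a contiguous tail of the chain can be delegated without reordering filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    """One stage of a filter chain (hypothetical, not a PipeWire type)."""
    kind: str           # e.g. "eq", "drc", "limiter" (made-up names)
    params: tuple = ()  # opaque parameters for this stage


def split_filter_graph(chain, hw_capabilities):
    """Split a playback chain into (cpu_part, hw_part).

    Assumption: the hardware processes audio after the CPU, so we can only
    hand over the longest tail of the chain that the hardware supports.
    """
    split = len(chain)
    # Walk backwards while the hardware claims support for each stage.
    while split > 0 and chain[split - 1].kind in hw_capabilities:
        split -= 1
    return chain[:split], chain[split:]


chain = [Filter("eq"), Filter("drc"), Filter("limiter")]

# Hardware that implements DRC and a limiter takes the last two stages.
cpu, hw = split_filter_graph(chain, {"drc", "limiter"})
print([f.kind for f in cpu], [f.kind for f in hw])

# With no DSP support at all, everything falls back to the CPU.
cpu, hw = split_filter_graph(chain, set())
print([f.kind for f in cpu], [f.kind for f in hw])
```

A real implementation would also have to compare parameter ranges rather than just filter names, which is where the standard filter definitions from the first item would come in.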
More DSP features
In addition to standard filters, we spoke about two topics that have come up commonly in the past.
The first is some way to expose the processing graph in the DSP, so PipeWire and other userspace daemons have a better view of what is happening on the DSP. With the ability to push dynamic topologies to DSP, there was some renewed interest in exposing and using the ASoC DAPM widget graph. As always, the devil is in the details.
The second thing that came up is speaker calibration. There is a lot of processing and tuning that goes into driving speakers on modern devices as much as possible without destroying them. Some of these are one-time parameters decided at product design time, and some of these translate to runtime parameters based on voltage and current feedback from the speaker amplifier.
For some systems (like Qualcomm platforms), speaker calibration might be run on each system start to perform dynamic tuning. We had some discussion of how this might tie in with the rest of the system for both determining the parameters (separate startup daemon vs. in-process initialisation), as well as uploading parameters to the speaker (some ALSA UCM extensions to load parameters on PCM open but before start, or preloading parameters into ALSA kernel controls and having the driver feed them in at the right point).
Volume limits
A way to set a limit on the maximum volume for a given device has been a common user request ([1] [2]). We discussed the possibility of creating a per-route property (with a fallback to the node, if there are no routes), which WirePlumber could manage to provide users a simple interface to control.
Since the hackfest, Wim has already done some work on this, and we need to bubble this up as a more user-accessible setting.
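The route-then-node fallback could look roughly like the sketch below. This is only an illustration of the lookup order: the property key example.volume-limit and both function names are invented here, and do not reflect what the actual implementation uses.

```python
# Hypothetical property key, used only for this example.
LIMIT_KEY = "example.volume-limit"


def effective_volume_limit(route_props, node_props, key=LIMIT_KEY):
    """Return the limit from the route if set, else from the node, else None.

    Pass route_props=None when the device has no routes.
    """
    for props in (route_props, node_props):
        if props and key in props:
            return float(props[key])
    return None


def clamp_volume(requested, limit):
    """Clamp a linear volume to [0, limit]; unbounded above if limit is None."""
    requested = max(0.0, requested)
    return requested if limit is None else min(requested, limit)


route = {LIMIT_KEY: 0.8}
node = {LIMIT_KEY: 1.0}

# The route's limit takes precedence over the node's.
print(clamp_volume(1.5, effective_volume_limit(route, node)))

# No routes: fall back to the limit set on the node.
print(clamp_volume(1.5, effective_volume_limit(None, node)))
```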
Performance
A number of performance-related topics were discussed.
The first was an option of a combined DSP mode, where instead of one port per channel, a node would expose one port for all the channels of the stream (but continue to run in the configured “DSP” format/rate). This would improve stream performance for non-JACK-like use-cases, especially in resource-constrained environments.
On the WirePlumber side, there was a discussion about using LuaJIT instead of standard Lua. There are some compatibility issues to be determined there (such as language version supported, etc.), but there might be some quick performance wins to be made if this is feasible.
There is a plan to move some of the WirePlumber core to Rust. That might be a good time to also port some of the more standard, rarely changing functionality from Lua to Rust (though that could equally happen as a Lua->C transition, and does not really need to wait on a Rust port).
Declarative Session Management
Another interesting, and broader, thread is the imperative nature of WirePlumber scripts: that is, policy decisions and the associated actions are often interwoven. It might be helpful to be able to make a clearer split where all policy decisions are first run, and then the decisions are translated into actions in one go.
There are some historical choices that make this hard – for example, changing the profile of a device might create and destroy nodes, which makes it hard to be able to make decisions that are independent of the action. There were some ideas around redoing the profile concept such that all nodes are always exposed, but nodes could get a new state to signal availability (and profiles that would allow availability to change). That might make a declarative system possible to implement.
We also discussed the possibility of a “transaction” system. Something that would allow a client to submit a set of objects (think links between nodes), and then “commit” that transaction. This would also help reduce the number of roundtrips between PipeWire and WirePlumber, and generally help performance.
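As a toy illustration of the decide-then-act split combined with a transaction, the sketch below computes the desired links from a snapshot of nodes and then diffs them against the current links into a single batch. The node dictionaries, the policy and the function names are all assumptions made for this example; none of this is existing PipeWire or WirePlumber API.

```python
def decide_links(nodes):
    """Pure policy step: map a snapshot of nodes to the set of desired links.

    Nodes are always present in the snapshot; an "available" flag stands in
    for the availability state discussed above.
    """
    sinks = [n for n in nodes if n["kind"] == "sink" and n["available"]]
    if not sinks:
        return set()
    # Toy policy: link every stream to the highest-priority available sink.
    target = max(sinks, key=lambda n: n["priority"])
    return {(n["id"], target["id"]) for n in nodes if n["kind"] == "stream"}


def build_transaction(current_links, desired_links):
    """Diff current vs. desired state into one batch of changes."""
    return {
        "destroy": sorted(current_links - desired_links),
        "create": sorted(desired_links - current_links),
    }


nodes = [
    {"id": 1, "kind": "stream", "available": True, "priority": 0},
    {"id": 10, "kind": "sink", "available": False, "priority": 200},
    {"id": 11, "kind": "sink", "available": True, "priority": 100},
]

# The stream is currently linked to sink 10, which has become unavailable.
txn = build_transaction({(1, 10)}, decide_links(nodes))
print(txn)
```

Because decide_links never touches the graph, it can be run and tested on its own, and the whole batch can then be committed in a single roundtrip.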
Bluetooth
Being colocated with the BlueZ face-to-face meeting, we had representation from the BlueZ community, so we were able to dive into a number of topics related to Bluetooth, primarily LE Audio.
The first topic was Auracast, the LE Audio system for broadcast audio, allowing listeners to tune into public broadcasts in a space, or to have a device stream audio to multiple headsets concurrently for shared listening. George had a demo system showing an implementation of Auracast with PipeWire, WirePlumber and BlueZ.
We had some discussion of where this feature should live, and the consensus was that we would probably want a separate daemon to manage Auracast settings and loading up the appropriate nodes (either for receiving or sending) based on users’ preferences.
This led to a more general discussion about the current split of the Bluetooth implementation: SPA modules in PipeWire, which include streaming and some policy, and a lot more policy living inside WirePlumber. We could, and likely should, move all of this into higher level PipeWire modules instead, which could make these easier to work with overall.
There was also a discussion about the complexities of LE Audio, and the state of the current user experience with actual devices:
- Device interop is not always great, as the spec is new, the BlueZ implementation is still being completed, and device implementations seem of variable quality
- Reliable pairing/feature detection is hard, partly due to how BlueZ exposes the ability to talk to devices in Bluetooth Classic or Bluetooth LE modes
- Pairing left/right pairs currently needs individual pairing, which does not seem to be needed by other implementations (Android for example)
- Inter-device synchronisation might need some work as well
While there is much work to be done here, the pieces are coming together for first-class LE Audio support on Linux-based systems.
Audio analytics
We also spoke about “analytics” – using local neural networks to implement things like text-to-speech, speech-to-text, language translation, or other forms of processing.
These pose an interesting problem, because they look like a standard-ish audio stream on one side, but are effectively a sparse stream on the other side if we are talking about text. Even conversion between languages does not look like a standard filter, because the underlying model might consume a varying amount of data before generating an output, and the input and output lengths are not tightly correlated.
While it should be possible to implement such a system with PipeWire, it is not quite clear whether we should. As the application space in this area becomes more mature, it may become clearer what the right place in the stack is for these features.
Click detection and elimination
We spoke about detecting and eliminating clicks at the stop or start of a stream.
If an application is playing back audio, and suddenly stops (i.e. feeds silence, or just nothing), then the sudden drop in the signal might cause a click to be output. If you think of the corresponding waveform as representing the physical displacement of the speaker, then the drop to zero is like a sudden brake to a halt, which isn’t possible, and manifests as a jolt that you hear as a clicky noise. The same analogy holds for resuming from a pause, but in the opposite direction.
The solution is usually to smooth out the end of the sound by fading out, but most applications do not do this, so this problem manifests quite clearly for most browser or application streams if you listen closely.
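For reference, the application-side fix is small. The sketch below fades out the tail of a block of samples using a raised-cosine gain curve; the function name and the choice of curve are mine, and this is not one of the detection approaches tried in audioconvert.

```python
import math


def fade_out(samples, fade_len):
    """Return a copy of samples whose last fade_len values ramp to zero.

    Uses a raised-cosine gain curve, which avoids the sharp corner that a
    linear ramp has where the fade begins.
    """
    out = list(samples)
    fade_len = min(fade_len, len(out))
    start = len(out) - fade_len
    for i in range(fade_len):
        # Gain starts just below 1.0 and reaches 0.0 on the last sample.
        gain = 0.5 * (1.0 + math.cos(math.pi * (i + 1) / fade_len))
        out[start + i] *= gain
    return out


# A constant signal that would otherwise drop abruptly from 1.0 to silence.
block = [1.0] * 8
print(fade_out(block, 4))
```

In practice the fade would span a few milliseconds of audio (a few hundred samples at 48 kHz), and resuming from a pause would use the mirrored fade-in.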
Wim described a number of experiments he has done for detecting such abrupt changes in audioconvert, but he was not happy with the results. We discussed some of these approaches, and what might work as acceptable tradeoffs to capture the most common cases while still trying to respect the integrity of the signal being sent by the application.
(sorry about the vagueness here, I missed taking more detailed notes)
Miscellanea
The rest of the discussion covered disparate topics that I don’t have long form notes on:
- Hardware profiles: Shipping hardware-specific configuration for PipeWire and WirePlumber is hard. We discussed some approaches using context properties and conditions, but this is an area that needs more work.
- Data loop management: PipeWire allows splitting work across data loops so different nodes in a graph can be assigned to different threads. This is currently an all-or-nothing system, where either all nodes go to a single data loop, or every node must be manually assigned a specific data loop. There was some desire for a default data loop, to make the manual management less cumbersome.
- ACP -> UCM: PipeWire inherits the ALSA card profile configuration from PulseAudio, which has been helpful in making the migration path smoother on most hardware. There has always been some desire to have a single configuration system (probably ALSA UCM) for all hardware. This likely needs some work on what we can express in UCM configuration, and we also need to clean up how we translate our UCM handling code (George has an RFC for this).
Thanks
That’s it, thank you for reading if you made it this far, and a shout out to George, Mark, and others organising the event!
It was great to see continued interest and so much exciting work that is yet to come. I hope to see more of the community in the next edition of the hackfest.