Teleconferencing Windows audio utility for sound quality enhancement and audio effect generation via a WASAN for the Master of Electronics and ICT course of R&D at KU Leuven 2020-2021
The utility aggregates an arbitrary number of devices, each with an arbitrary number of channels and each sampled at its own
arbitrary sample rate. Devices may be hardware, software, or networked: hardware and software devices must be enumerable by the OS,
and a networked device must be reachable over a WiFi-Direct connection.
The aggregator uses 2 threads for capture.
It converts all of the streams to the user-chosen DSP sample rate (e.g. 48 kHz), using the
Flexible Sample Rate Conversion algorithm by Julius O. Smith, and places each converted stream
into the equisampled ring buffer.
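The conversion step can be illustrated with a minimal sketch. It assumes a mono stream held in a `std::vector<float>` and uses plain linear interpolation as a stand-in, which is not the Julius O. Smith algorithm the project uses. The function name `resampleLinear` and its parameters are hypothetical and do not come from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: resample a mono stream from inRate Hz to outRate Hz.
// Linear interpolation keeps the sketch short; it is NOT the bandlimited
// interpolation described by Julius O. Smith and used by the project.
std::vector<float> resampleLinear(const std::vector<float>& in,
                                  double inRate, double outRate)
{
    assert(inRate > 0.0 && outRate > 0.0);
    if (in.empty()) return {};

    // Number of output frames covering the same time span as the input.
    const std::size_t outLen = static_cast<std::size_t>(
        static_cast<double>(in.size()) * outRate / inRate);
    const double step = inRate / outRate;  // input samples per output sample

    std::vector<float> out(outLen);
    for (std::size_t n = 0; n < outLen; ++n) {
        const double pos = static_cast<double>(n) * step;  // fractional read position
        const std::size_t i =
            std::min(static_cast<std::size_t>(pos), in.size() - 1);
        const std::size_t j = std::min(i + 1, in.size() - 1);  // clamp at last sample
        const float frac = static_cast<float>(pos - static_cast<double>(i));
        out[n] = in[i] + frac * (in[j] - in[i]);
    }
    return out;
}
```

In this sketch, a stream captured at 24 kHz and converted to a 48 kHz DSP rate doubles in length, and the converted samples would then be written into the ring buffer channel that belongs to that stream.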
The data is then available for consumption by the polyphonic DSP threads, each of which applies the desired
audio effects to one, several, or all ring buffer channels concurrently.
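As a rough illustration of applying an effect to a chosen subset of ring buffer channels in place, consider the sketch below. The buffer layout (`ChannelBuffer`), the `Effect` signature, and the helpers `applyEffect` and `makeGain` are assumptions made for this example only. Threading is left out: in a threaded design, each DSP thread could call `applyEffect` on its own disjoint set of channels.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical layout: one contiguous block of samples per channel.
using ChannelBuffer = std::vector<std::vector<float>>;
// Hypothetical effect signature: processes `count` samples in place.
using Effect = std::function<void(float* samples, std::size_t count)>;

// Apply `effect` in place to every channel index listed in `channels`.
void applyEffect(ChannelBuffer& ring,
                 const std::vector<std::size_t>& channels,
                 const Effect& effect)
{
    for (std::size_t ch : channels) {
        assert(ch < ring.size());
        effect(ring[ch].data(), ring[ch].size());
    }
}

// Example effect: constant gain applied to each sample.
Effect makeGain(float gain)
{
    return [gain](float* s, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) s[i] *= gain;
    };
}
```

Because the effect writes back into the same memory it reads from, no intermediate copy of the channel data is made.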
After processing each batch of frames, the DSP block optionally mixes, scales, or combines channels into completely new
channels, and then places the resulting data into the output ring buffer. Hence the number of output ring buffer channels depends
directly on the user’s set of applied audio effects and mixing options. It is possible, and expected, that the system produces
more ring buffer channels than the output devices consume, which allows multiplexing across playback devices.
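Combining channels into a new one can be sketched as a weighted sum. The `MixSpec` type and the `mixChannels` function below are hypothetical names chosen for this example, and the per-channel vector layout is an assumption rather than the project's actual ring buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical mix description: (input channel index, weight) pairs.
using MixSpec = std::vector<std::pair<std::size_t, float>>;

// Build one new output channel as a weighted sum of the selected input channels.
std::vector<float> mixChannels(const std::vector<std::vector<float>>& in,
                               const MixSpec& spec, std::size_t frames)
{
    std::vector<float> out(frames, 0.0f);
    for (const auto& term : spec) {
        const std::size_t ch = term.first;
        const float weight = term.second;
        assert(ch < in.size() && in[ch].size() >= frames);
        for (std::size_t i = 0; i < frames; ++i)
            out[i] += weight * in[ch][i];
    }
    return out;
}
```

Each call yields one additional output channel, so the output ring buffer can end up with more channels than the input side provided.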
Once data is available, the render thread routes each of the output ring buffer channels to playback devices according to the
channel selection (the channel mask) chosen by the user. The render thread pushes data to each playback device through the final
SRC block, which works identically to the one in the capture process above: it operates in place, without copying, and writes the
filtered result directly into the memory provided by the kernel for playback.
Storing the original equisampled, filtered data in the output ring buffer guarantees instant channel multiplexing and makes it
possible to route the same data to multiple distinct devices simultaneously.
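Routing by channel mask can be sketched as follows, assuming that bit k of a 32-bit mask selects output ring buffer channel k and that the device expects interleaved frames. The function `routeToDevice` is a hypothetical name. For simplicity the sketch copies into a new vector and skips the final SRC step, whereas the text above describes writing directly into kernel-provided memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical routing: bit k of `mask` selects output ring buffer channel k.
// Selected channels are interleaved frame by frame into the device buffer.
std::vector<float> routeToDevice(const std::vector<std::vector<float>>& ring,
                                 std::uint32_t mask, std::size_t frames)
{
    assert(ring.size() <= 32);

    // Collect the channel indices enabled in the mask.
    std::vector<std::size_t> selected;
    for (std::size_t ch = 0; ch < ring.size(); ++ch) {
        if (mask & (std::uint32_t{1} << ch)) {
            assert(ring[ch].size() >= frames);
            selected.push_back(ch);
        }
    }

    // Interleave: frame 0 of every selected channel, then frame 1, and so on.
    std::vector<float> device(frames * selected.size());
    for (std::size_t f = 0; f < frames; ++f)
        for (std::size_t k = 0; k < selected.size(); ++k)
            device[f * selected.size() + k] = ring[selected[k]][f];
    return device;
}
```

Since the ring buffer itself is only read, two devices whose masks share a bit receive the same channel data without any extra bookkeeping.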
System architecture vision:
The tool communicates with the user via a CLI to interactively update, tune, or select settings. A library by Daniele Pallastrelli
(daniele77@github) is used for this purpose.
The picture below shows how data flows internally through the system, starting at the capture side and ending at the render side.
The system does all the processing in place, without unnecessary, time-consuming, and memory-consuming copy-and-move operations: