During the summer of 2023, I participated in the Google Summer of Code program as a contributor to Chromium, Google’s open-source browser project. The program lasted for a total of 15 weeks, as I had a two-week extension.
When screen sharing or casting a screen from Chrome, you can also stream system audio alongside video. However, system audio streaming has so far only been available on Windows. My project was focused on adding support for system audio capture on macOS and Linux. The initial goals of the project were:
- Create a system audio capture implementation for macOS using the ScreenCaptureKit framework.
- Enable system audio capture on Linux using the existing PulseAudio implementation.
- Create a system audio capture implementation for Linux using PipeWire.
- Enable screen casting to audio-only devices.
System Audio Capture on macOS
I began the project with the macOS implementation. In 2021, Apple released ScreenCaptureKit, a framework that allows an application to capture screens, windows, and applications. With macOS 13, support for audio capture was added. ScreenCaptureKit has already found its place in OBS, WebKit, and even Chromium, though only for screen capture.
After going through several design proposals and evaluating their complexity and feasibility, we settled on an implementation in the form of an AudioInputStream in the audio service. What initially seemed like a pretty straightforward design became more and more complex each week, thanks to the design of ScreenCaptureKit. I faced numerous challenges, such as:
- Figuring out which system services the API requires access to, as the audio service sandbox had to be adjusted.
  - Determined by first disabling the sandbox, implementing the API functions, re-enabling the sandbox, and monitoring the sandbox logs for denied access.
- How to circumvent the preliminary TCC permission checks done by the API, as we did not want to expose the sensitive com.apple.tccd.systemservice to the sandboxed process.
  - Solved by swizzling the ScreenCaptureKit method responsible for the permission check.
- TCC permissions not being correctly inherited from the parent browser process.
  - The responsible process app bundle has to be code signed in order for permissions to be inherited.
- How to safely make the asynchronous API functions synchronous.
  - Using various synchronization mechanisms in Chromium, I was able to build a safe, albeit rather complex design, which should hold up even if bugs show up in the API.
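To give a rough idea of what turning an asynchronous call into a synchronous one involves, here is a minimal sketch in plain standard C++. It is not the Chromium code: `Completion`, `AsyncCall`, and `RunSynchronously` are hypothetical names, and `std::promise`/`std::future` stand in for Chromium's own synchronization primitives. The sketch waits with a timeout and tolerates a callback that never fires, fires late, or fires twice.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <optional>

// Hypothetical stand-in for an asynchronous API that reports its result
// through a completion callback, possibly on another thread.
using Completion = std::function<void(int)>;
using AsyncCall = std::function<void(Completion)>;

// Starts `call` and blocks until its completion runs or `timeout` elapses.
// Returns std::nullopt if no result arrived in time.
std::optional<int> RunSynchronously(const AsyncCall& call,
                                    std::chrono::milliseconds timeout) {
  // The promise is shared with the callback, so a completion that arrives
  // after the caller has given up still writes to valid memory.
  auto promise = std::make_shared<std::promise<int>>();
  std::future<int> future = promise->get_future();

  call([promise](int result) {
    try {
      promise->set_value(result);
    } catch (const std::future_error&) {
      // The completion ran more than once; keep the first result.
    }
  });

  if (future.wait_for(timeout) != std::future_status::ready) {
    return std::nullopt;  // The API never called back in time.
  }
  return future.get();
}
```

A caller would pass a lambda that starts the real asynchronous operation and invokes the supplied completion from its handler. The timeout is what keeps a misbehaving API from hanging the calling thread forever.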
SCKAudioInputStream was accompanied by a unit test, which was tricky to develop, as I had to mock the API using OCMock and encountered a plethora of issues along the way.
To make system audio capture accessible, I added two new feature flags: MacLoopbackAudioForScreenShare for enabling support for system audio capture exclusively for casting and screen sharing, respectively. Media requests go through DesktopMediaPicker, which had to be modified by adding a RequestSource to each request to differentiate between different components that are requesting screen capture. The flags were also made accessible through the
Expanding the PulseAudio Implementation
PulseAudio does not offer a separate API designed for system audio capture. However, by capturing audio from monitor devices, we can achieve that exact goal. Each sink, by default, has a monitor device, which is a source that simply loops back the audio from the sink so that it can be recorded. The existing PulseAudio implementation was already capable of opening monitor devices, as with any other source. However, whenever the default sink changes, we need to open the monitor of the new default sink to keep on capturing system audio.
To achieve this, I created a PulseLoopbackAudioStream. The manager is used to notify the stream whenever the default sink changes. PulseLoopbackAudioStream is a wrapper around PulseAudioInputStream that allows the source to be switched.
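The idea of following the default sink can be sketched in a few lines of standalone C++. This is only an illustration under assumptions, not the Chromium classes: `FakeInputStream`, `LoopbackStream`, and `LoopbackManager` are hypothetical names, and no real PulseAudio calls are made. The one PulseAudio detail it relies on is the default naming convention, where the monitor source of a sink is called `<sink name>.monitor`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// By PulseAudio's default naming convention, a sink's monitor source is
// named "<sink name>.monitor".
std::string MonitorSourceFor(const std::string& sink_name) {
  return sink_name + ".monitor";
}

// Hypothetical stand-in for an input stream bound to one source.
class FakeInputStream {
 public:
  explicit FakeInputStream(std::string source) : source_(std::move(source)) {}
  const std::string& source() const { return source_; }

 private:
  std::string source_;
};

// Wrapper that always captures from the monitor of the current default sink.
class LoopbackStream {
 public:
  explicit LoopbackStream(const std::string& default_sink) {
    OnDefaultSinkChanged(default_sink);
  }

  // Replaces the wrapped stream when the default sink actually changes.
  void OnDefaultSinkChanged(const std::string& new_sink) {
    const std::string monitor = MonitorSourceFor(new_sink);
    if (stream_ && stream_->source() == monitor) {
      return;  // Already capturing from the right monitor.
    }
    stream_ = std::make_unique<FakeInputStream>(monitor);
  }

  const std::string& current_source() const { return stream_->source(); }

 private:
  std::unique_ptr<FakeInputStream> stream_;
};

// Hypothetical manager that forwards default-sink changes to its streams.
// Registered streams must outlive the manager in this sketch.
class LoopbackManager {
 public:
  void AddStream(LoopbackStream* stream) { streams_.push_back(stream); }

  void NotifyDefaultSinkChanged(const std::string& sink) {
    for (LoopbackStream* stream : streams_) {
      stream->OnDefaultSinkChanged(sink);
    }
  }

 private:
  std::vector<LoopbackStream*> streams_;
};
```

In a real implementation the manager would learn about the new default sink from PulseAudio's server events, and the wrapper would have to stop the old stream and start the new one without interrupting the consumer.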
During implementation, I encountered a bug in PulseAudioInputStream: the capture time was calculated incorrectly, which I had to fix before proceeding.
State of the Project
At the time of writing, only the PulseAudio capture time bugfix has been merged. The macOS and PulseAudio system audio capture implementations have been completed, but are awaiting final approval.
We have concluded that a PipeWire implementation for system audio capture would be beneficial, as it would allow for application audio capture. However, since there is no existing support for PipeWire, the idea was out of the scope of this project. The design would likely require an AudioManager for PipeWire, alongside input and output streams.
Feature flags for system audio capture on Linux are yet to be added.
Support for streaming to audio-only devices has not been added yet. This CL includes the required changes.
In general, the primary goals of the project have been achieved. The PipeWire implementation and casting to audio-only devices were secondary goals, which were unfortunately not met.
System audio capture for macOS has proven to be quite a bit more difficult than I initially thought. The API documentation is not the greatest, and there were multiple issues due to the design of Chrome, which increased design complexity. System audio capture for Linux was more straightforward to implement.
I learned a lot throughout the project, from macOS internals, Objective-C, and synchronization techniques, to the audio subsystem of Linux. Overall, I am happy with the results, as system audio capture is now supported on all major platforms.
I would like to thank everyone who helped me throughout different stages of the project. Special thanks go to my mentor Mark Foltz; Olga Sharonova, who helped review most of the code and provided valuable design suggestions; Robert Sesek, who saved the project when I encountered security issues; Elad Alon, who helped me resolve git issues; and the rest of Chrome’s media team.