Learn how Linux/FFmpeg C partial codebase is organized to be extensible and act as if it were meant to have “polymorphism”. Specifically, we’re going to briefly explore how Linux concept of everything is a file works at the source code level as well as how FFmpeg can add support fast and easy for new formats and codecs.
Good software design – Introduction
To write useful and long term maintainable software we tend to look out for patterns and group them into abstractions and it seems that’s the case for devs behind Linux and FFmpeg too.
When we’re creating software, we’re building data structures and defining their behaviors and dependencies. The way we create and link them can be seen as the design/architecture of the software.
Let’s say we’re building a media framework that encodes/decodes video and audio. The codecs AV1, H264, HEVC, and AAC all do some common operations and if we can provide a generic abstraction that holds these common operations and data we can use this concept instead of relying on the concrete idea of what a specific codec does.
Through the years many developers noticed that software with a good design is a good idea that pays off as software grows in complexity.
This is one of the ideas behind the good design for software, to rely on components that are weakly linked and with boundaries around what it should do.
Maybe it’s easier to see all these concepts in practice. Let’s code a quick pseudo media stream framework that provides encoding and decoding for several codecs.
This pseudo-code in ruby tries to recreate what we’re discussing above, there is an implicit concept here of what operations a codec must have, in this case, the operations are encode and decode. Since ruby is a dynamically typed language any class can present these two operations and act as a codec for us.
Developers sometimes may use the words: contract, API, interface, behavior and operations as synonyms.
This design might be considered good because if we want to add a new codec we just need to provide an implementation and add it to the list, even the list could be built in a dynamic way but the idea is that this code seems easy to extend and maintain because it tries to keep link between the components weak (low coupling) and each component does only what it should do (cohese).
Rails framework even enforce some way to organize the code, it adopts the model-view-controller (MVC) architecture
When we go (no pun intended) to a statically typed language like golang we need to be more formal, describing the required types but it’s still doable.
The interface type in golang is much more powerful than Java’s similar construct because its definition is totally disconnected from the implementation and vice versa. We could even make each codec a ReadWriter and use it all around.
In the C language we still can create the same behavior but it’s a little bit different.
Code inspired by https://www.bottomupcs.com/abstration.xhtml
We first define the abstract operations (functions in this case) in a generic struct and then we fill it with the concrete code, like the av1 decoder and encoder real code.
Many other languages have somewhat similar mechanisms to dispatch methods or functions as if they were part of an agreed protocol and then the system integration code can deal only with this high-level abstractions.
Linux Kernel – Everything is a file
Have you ever heard the expression everything is a file in Linux? The idea is to have a common interface for all kinds of resources in Linux, for instance, Linux handles network socket, special files (like /proc/cpuinfo) or even USB devices as files.
This is a powerful idea that can make easy to write or use programs for linux since we can rely in a set of well known operations from this abstraction called file. Let’s see this in action:
This only is possible because the concept of a file (data structure and operations) was design to be one of the main way to communicate among sub-systems. Here’s a gist of the file_operations’ API.
The struct file_operations define what one should expect from a concept of what file can do.
Here we can see the directory implementation of these operations for the ext4 file system.
And even the cpuinfo proc files is done over this abstraction. When you’re operating files under linux you’re actually dealing with the VFS system, this system delegates to the proper implementation file implemenation.
FFmpeg – Formats
Here’s an overview of FFmpeg flow/architecture that shows that the internal componets are linked mostly to the abstract concepts like AVCodec but not directly to their implemenation, H264, AV1 or etc.
For the input files, FFmpeg creates a struct called AVInputFormat that is implemented by any format (video container) that wants to be used as an input. MKV files fill this structure with its implementation as the MP4 format too.
This design allows new codecs, formats, and protocols to be integrated and released easier. DAV1d (an av1 open-source implementation) was integrated into FFmpeg May this year and you can follow along the commit diff to see how easy it was. In the end, it needs to register itself as an available codec and follow the expected operations.