Hardware accelerated encoding and decoding in Jami
Jami gives you the ability to make video calls in HD with your friends, wherever they are in the world. This feature may seem simple, but a lot of processing happens between the raw output of your camera and your friend’s screen to keep everything running smoothly. To reduce bandwidth usage, video streams are encoded into a more efficient format by compressing the data; they are then decoded back into RGB format at the other end of the transmission for display. The standard algorithms for encoding and decoding video streams are called “codecs”, and Jami is compatible with many of them, including H264, VP8, MP4V-ES and H263.
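As a rough illustration of how these codecs are referenced in practice, the sketch below queries libavcodec (the ffmpeg library Jami relies on) for the corresponding decoders. The codec names used here are the usual libavcodec identifiers (“mpeg4” covers MP4V-ES), not symbols taken from Jami’s source code.

```c
/* Minimal sketch: looking up the decoders for the codecs Jami supports.
 * The names are standard libavcodec identifiers, not Jami-specific. */
#include <stdio.h>
#include <libavcodec/avcodec.h>

int main(void) {
    const char *names[] = { "h264", "vp8", "mpeg4", "h263" };
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        const AVCodec *dec = avcodec_find_decoder_by_name(names[i]);
        printf("%-6s -> %s\n", names[i],
               dec ? dec->long_name : "not available in this build");
    }
    return 0;
}
```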
This solves the bandwidth problem but creates another one, as video processing requires a lot of computing power. CPUs can do this work, but they are not optimized for it, and a large share of their capacity can be consumed, leaving them unavailable for other tasks and slowing down the whole device. This is why most computers and smartphones can delegate the process to dedicated hardware specialized for it: the GPU. This is called hardware accelerated encoding/decoding, and it greatly improves the speed of execution while freeing the CPU for its main duties.
The implementation of hardware acceleration in Jami was a challenging task, mainly because of the great variety of supported platforms and devices, each with its own particular configuration. Luckily, the latest version of ffmpeg provides a high-level hardware acceleration API that abstracts the platform-specific interfaces, greatly reducing the complexity of handling all of them correctly.
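To give an idea of what this API looks like, here is a minimal sketch of wiring a decoder to a hardware device through it. The choice of VAAPI as the device type and the lack of error reporting are illustrative simplifications, not a reflection of how Jami actually configures its decoders.

```c
/* Sketch: attaching a hardware device to an ffmpeg decoder through the
 * generic hwaccel API. AV_HWDEVICE_TYPE_VAAPI is just an example; the
 * same calls work with VideoToolbox, D3D11VA, MediaCodec, etc. */
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

static enum AVPixelFormat hw_pix_fmt;

/* Called back by the decoder to choose an output format; we pick the
 * hardware surface format so decoded frames stay in GPU memory. */
static enum AVPixelFormat get_hw_format(AVCodecContext *ctx,
                                        const enum AVPixelFormat *fmts) {
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++)
        if (*p == hw_pix_fmt)
            return *p;
    /* Hardware surface not offered; a real implementation would fall
     * back to software decoding here. */
    return AV_PIX_FMT_NONE;
}

int setup_hw_decoder(AVCodecContext *dec_ctx, const AVCodec *decoder) {
    enum AVHWDeviceType type = AV_HWDEVICE_TYPE_VAAPI; /* example device */

    /* Find the pixel format this decoder produces for that device type. */
    for (int i = 0;; i++) {
        const AVCodecHWConfig *cfg = avcodec_get_hw_config(decoder, i);
        if (!cfg)
            return -1; /* no hardware support for this codec */
        if ((cfg->methods & AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX) &&
            cfg->device_type == type) {
            hw_pix_fmt = cfg->pix_fmt;
            break;
        }
    }

    /* Open the hardware device and hand it to the decoder. */
    AVBufferRef *hw_device_ctx = NULL;
    if (av_hwdevice_ctx_create(&hw_device_ctx, type, NULL, NULL, 0) < 0)
        return -1;
    dec_ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
    dec_ctx->get_format = get_hw_format;
    av_buffer_unref(&hw_device_ctx);
    return 0;
}
```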
We strive to comply with the zero-copy principle for media encoding and decoding on all platforms. This drastically improves performance by never making an extra copy of the data in memory, since each copy would require iterating over every single pixel of every single frame. The video content travels directly from the sensor to the screen, or to the network for transfer. However, this is very difficult to implement because of the sheer variety of system architectures that must be supported. Currently, Jami only satisfies the zero-copy principle on iOS, because that platform has fewer variations and is fully supported by ffmpeg. We aim to bring this optimization to the other platforms as well, but it will take some time.
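As a concrete illustration of what is at stake, the fragment below sketches the hand-off of a decoded hardware frame. The zero-copy path passes the GPU surface straight to a renderer, while the fallback path downloads it with av_hwframe_transfer_data(), which is precisely the per-pixel copy the zero-copy principle eliminates. The display_gpu_frame and display_cpu_frame hooks are assumptions made for the sake of the example.

```c
#include <libavcodec/avcodec.h>
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>

/* Hypothetical renderer hooks, assumed for this example. */
void display_gpu_frame(const AVFrame *frame);
void display_cpu_frame(const AVFrame *frame);

int handle_decoded_frame(AVFrame *hw_frame, int zero_copy) {
    if (zero_copy) {
        /* Zero-copy path: the GPU surface is handed straight to the
         * renderer without ever leaving video memory. */
        display_gpu_frame(hw_frame);
        return 0;
    }

    /* Fallback path: download the frame into system memory. This is
     * the per-pixel copy that the zero-copy principle eliminates. */
    AVFrame *sw_frame = av_frame_alloc();
    if (!sw_frame)
        return AVERROR(ENOMEM);
    int ret = av_hwframe_transfer_data(sw_frame, hw_frame, 0);
    if (ret >= 0)
        display_cpu_frame(sw_frame);
    av_frame_free(&sw_frame);
    return ret;
}
```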
In conclusion, hardware acceleration makes encoding and decoding video much more efficient, reducing the bandwidth needed for transmission while preserving as much quality as possible and minimizing the load on the CPU. The balance between bandwidth and quality can be controlled in ffmpeg when encoding video by adjusting the bitrate, as in the sketch below. The next article will discuss a new feature we are currently implementing that will automatically adapt the bitrate to the transfer capacity of the device’s Internet connection.
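With libavcodec, the bitrate is a property of the encoder context; the values shown here (a 1 Mbit/s target and a one-second rate-control buffer) are purely illustrative, not Jami’s defaults.

```c
/* Sketch: trading quality for bandwidth by constraining the encoder's
 * bitrate. The figures are illustrative, not Jami's defaults. */
#include <libavcodec/avcodec.h>

void configure_bitrate(AVCodecContext *enc_ctx, int64_t target_bps) {
    enc_ctx->bit_rate = target_bps;            /* average target, e.g. 1000000 */
    enc_ctx->rc_max_rate = target_bps;         /* cap the bitrate peaks */
    enc_ctx->rc_buffer_size = (int)target_bps; /* ~1 second rate-control buffer */
}
```

Lowering the target reduces the bandwidth the stream consumes at the cost of more visible compression artifacts, and raising it does the opposite; the adaptive bitrate feature described in the next article tunes this trade-off automatically.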
By Kateryna Kostiuk and François Naggar-Tremblay