Artificial intelligence has the potential to greatly improve both video and audio quality in real-time communications. By altering video and audio dynamically during a call, it can raise the call's quality and reliability and so enrich the user experience. Three applications stand out: video and audio regeneration, audio noise reduction, and dynamic video codecs.
Video and Audio Regeneration
Automatic regeneration of images and audio is a promising machine learning application for real-time communications. This class of algorithms generates higher-quality images from lower-quality ones. For example, RAISR (Rapid and Accurate Image Super-Resolution) is a technique that uses machine learning to create high-quality versions of low-resolution images. Algorithms of this type could let the sender transmit lower-quality video, which the receiver then regenerates at higher quality. This would improve not only bandwidth utilization but also the reliability of transmission, since lower-quality video requires less data and is therefore easier to transmit.
With the privacy and security implications kept in mind, a similar technique could be applied to audio. An artificial intelligence technique could cut out superfluous audio on the sending side, and the receiving side could then predict the missing audio and add it back in.
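For the video case, the send-small, regenerate-large idea can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, frames are plain grayscale grids, and simple pixel repetition stands in for a learned super-resolution model such as RAISR, which would reconstruct far more detail.

```python
# Illustrative sketch only. Pixel repetition is a stand-in (an assumption)
# for a learned super-resolution model; all names here are hypothetical.

def downscale(frame, factor):
    """Sender side: average each factor x factor block into one pixel."""
    height, width = len(frame), len(frame[0])
    small = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [frame[yy][xx]
                     for yy in range(y, min(y + factor, height))
                     for xx in range(x, min(x + factor, width))]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def upscale(small, factor):
    """Receiver side: placeholder for the learned model; repeats each pixel."""
    big = []
    for row in small:
        wide = [value for value in row for _ in range(factor)]
        # Emit `factor` independent copies of the widened row.
        big.extend(list(wide) for _ in range(factor))
    return big


def pixel_count(frame):
    """Number of pixels in a frame, a rough proxy for the data to transmit."""
    return sum(len(row) for row in frame)


# An 8x8 grayscale test frame, halved in each dimension before "transmission".
frame = [[(x + y) % 256 for x in range(8)] for y in range(8)]
sent = downscale(frame, 2)
restored = upscale(sent, 2)
```

With a factor of 2, the sender transmits a quarter of the pixels, and the receiver gets a frame of the original dimensions back. How much of the original detail returns depends entirely on the quality of the model that replaces `upscale`.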
Audio Noise Reduction and Echo Cancellation
Noisy environments are the scourge of any phone call, whether the cause is wind, loud cars, or any number of other situations we experience in daily life. Machine learning algorithms can detect and suppress background noise, separating your conversation from the unintelligible and unimportant sounds around it. Apple and Mozilla have experimented with, and in some cases implemented, noise suppression that differentiates between voice and background noise. For example, Mozilla’s RNNoise (“RNNoise: Learning Noise Suppression”) distinguishes voice from noise and drastically reduces the background noise.
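To make the detect-and-suppress step concrete, here is a minimal sketch of an energy-based noise gate. It is not how RNNoise works: a learned model classifies voice and noise from spectral features, whereas this example assumes that noise is simply quieter than speech. The function names, frame size, threshold, and attenuation values are all hypothetical.

```python
import math

# Illustrative sketch only. A fixed energy threshold stands in (as an
# assumption) for the voice/noise decision a trained model would make.

def rms(frame):
    """Root-mean-square energy of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def suppress_noise(samples, frame_size=4, threshold=0.1, attenuation=0.1):
    """Attenuate frames whose RMS energy falls below the threshold."""
    cleaned = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        # Frames at or above the threshold are treated as speech and kept.
        gain = 1.0 if rms(frame) >= threshold else attenuation
        cleaned.extend(s * gain for s in frame)
    return cleaned


noise = [0.02, -0.01, 0.015, -0.02]   # quiet background frame
speech = [0.5, -0.4, 0.45, -0.5]      # louder voiced frame
out = suppress_noise(noise + speech)
```

The quiet frame is scaled down to a tenth of its level while the speech frame passes through unchanged. A gate like this fails as soon as the noise is as loud as the voice, which is exactly the case where learned suppression earns its keep.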
Dynamic Video Codecs
Choosing the right video codec with the right parameters can be a daunting task, but one that can pay off in significantly better video. If the codec and its parameters could be judged and adapted dynamically, based on the quality requirements and what the network allows, call quality could be greatly improved for the end user.
Netflix does something similar for its streamed video. It has created Dynamic Optimizer, an artificial intelligence algorithm that matches the compression to the content within each scene. This reduces the stream size while preserving quality.
In the same vein, our product, Optimize, uses artificial intelligence algorithms to provide media and network settings that deliver optimal audio and video quality for each device, connection, media, and network setup. We leverage data from millions of conferences to bring optimal audio and video quality to every interaction.
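As a rough sketch of what adapting to the network could look like, the example below picks a codec and resolution from a small ladder based on the measured bandwidth and the codecs both endpoints support. The ladder entries, bitrates, headroom factor, and function name are illustrative assumptions, not recommended settings, and a rule table like this is a much simpler stand-in for a learned policy.

```python
from dataclasses import dataclass

# Illustrative sketch only. The profiles and bitrates are hypothetical
# values chosen for the example, not tuned recommendations.

@dataclass(frozen=True)
class Profile:
    codec: str
    height: int      # vertical resolution in pixels
    min_kbps: int    # bandwidth assumed necessary for acceptable quality


# Ordered from highest to lowest quality.
LADDER = [
    Profile("VP9", 1080, 2500),
    Profile("VP9", 720, 1200),
    Profile("H.264", 480, 600),
    Profile("H.264", 240, 200),
]


def choose_profile(available_kbps, supported_codecs, headroom=0.8):
    """Return the best profile that fits within a fraction of the bandwidth."""
    budget = available_kbps * headroom  # leave room for audio and jitter
    candidates = [p for p in LADDER if p.codec in supported_codecs]
    if not candidates:
        raise ValueError("no codec in common with the ladder")
    for profile in candidates:
        if profile.min_kbps <= budget:
            return profile
    # Nothing fits: fall back to the lowest-quality supported profile.
    return candidates[-1]
```

Calling `choose_profile` again whenever the bandwidth estimate changes gives a crude form of dynamic adaptation: the call steps down the ladder when the network degrades and back up when it recovers.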