We introduce the PercepNet algorithm, which combines signal processing, knowledge of human perception, and deep learning to enhance speech in real time. PercepNet ranked second in the real-time track of the Interspeech 2020 Deep Noise Suppression challenge, despite using only 5% of a CPU core.
In this demo, we turn LPCNet into a very low-bitrate neural speech codec that's actually usable on current hardware and even on phones. It’s the first time a neural vocoder has been able to run in real time using just one CPU core on a phone (as opposed to a high-end GPU).
This demo presents the LPCNet architecture that combines signal processing and deep learning to improve the efficiency of neural speech synthesis. It explains the motivations for LPCNet, shows what it can achieve, and explores its possible applications.
This demo presents the RNNoise project, showing how deep learning can be applied to noise suppression. The main idea is to combine classic signal processing with deep learning to create a real-time noise suppression algorithm that's small and fast. The result is much simpler and sounds better than traditional noise suppression systems.
This demonstrates the improvements and new features in Opus 1.3, including better speech/music detection, ambisonics support, and low-bitrate improvements.
This demonstrates the improvements and new features in Opus 1.2 compared to version 1.1. It also includes audio samples comparing it to previous versions of the codec.
This describes the quality improvements that Opus 1.1 brought over version 1.0, including improved VBR, tonality estimation, surround improvements, and speech/music classification.
This demo revisits all previous Daala demos. With pieces of Daala being contributed to the Alliance for Open Media's AV1 video codec, we go back over the demos and see what worked, what didn't, and what changed compared to what we described in them.
This demo describes the new Daala deringing filter, which replaces a previous attempt with a less complex algorithm that performs much better. Those who like to know all the math details can also check out the full paper.
This demonstrates the Perceptual Vector Quantization technique currently used in Daala and inspired by the Opus codec.
This demonstrates a technique that turns images into what look like paintings. Its intended applications are intra prediction and deringing filtering for the Daala video codec. Because of complexity issues, it was never adopted in Daala, but it inspired a new deringing filter that landed in Sep. 2015.
This shows the Spartacus robot participating in the 2005 AAAI challenge. Spartacus includes noise-robust sound localization and separation capabilities.
This is an example of separating 3 simultaneous sound sources using an array of 8 microphones.
In this video, the robot is controlled only by sound localization. It turns towards the sound source that has been present for the longest amount of time and moves towards it.