Welcome to the demo page for the paper “High Fidelity Compression Algorithm with Improved RVQGAN”. Here, we provide samples from our ablation studies and comparisons with competitive baselines.

Abstract

Sample pages

<aside> <img src="/icons/light-bulb_gray.svg" alt="/icons/light-bulb_gray.svg" width="40px" /> Please click on the following links to listen to more samples and view visualizations.

</aside>

Comparison with leading methods

Quantizer dropout effect

Effects of balanced data-sampling

Comparison with EnCodec at 24kHz

Speech Samples

<aside> <img src="/icons/light-bulb_gray.svg" alt="/icons/light-bulb_gray.svg" width="40px" /> Note that while EnCodec simplifies the problem by downsampling the input audio to 24kHz, the proposed method works natively in the 44.1kHz domain, retaining the detail and brightness of full-bandwidth audio.

</aside>
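
To make the note above concrete, here is a minimal sketch of the arithmetic behind it. It computes the highest representable frequency (the Nyquist limit, half the sample rate) for each codec, and the compression factor relative to uncompressed PCM. The function names are hypothetical, and the 16-bit mono PCM reference is an assumption made for illustration, not a figure stated on this page.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2.0


def pcm_bitrate_kbps(sample_rate_hz: float, bit_depth: int = 16, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM audio (assumed 16-bit mono by default)."""
    return sample_rate_hz * bit_depth * channels / 1000.0


def compression_factor(sample_rate_hz: float, codec_kbps: float,
                       bit_depth: int = 16, channels: int = 1) -> float:
    """Ratio of the uncompressed PCM bitrate to the codec bitrate."""
    return pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels) / codec_kbps


if __name__ == "__main__":
    # (label, sample rate in Hz, codec bitrate in kbps), taken from the headings on this page
    settings = [("EnCodec@24kbps", 24_000, 24.0), ("Ours@8kbps", 44_100, 8.0)]
    for label, sample_rate, kbps in settings:
        print(
            f"{label}: bandwidth up to {nyquist_hz(sample_rate):.0f} Hz, "
            f"about {compression_factor(sample_rate, kbps):.1f}x smaller than 16-bit mono PCM"
        )
```

Under these assumptions, the 24kHz setting cannot represent content above 12 kHz, while the 44.1kHz setting extends to 22.05 kHz, which is where the difference in brightness in the samples below comes from.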

Original

sample_9.wav

sample_21.wav

sample_30.wav

EnCodec@24kbps

08_encodec24kbps.wav

08_encodec24kbps.wav

08_encodec24kbps.wav

Ours@8kbps

03_final_nq=9.wav

03_final_nq=9.wav

03_final_nq=9.wav

Music Samples

Original

sample_103.wav

sample_166.wav

sample_202.wav

EnCodec@24kbps

08_encodec24kbps.wav

08_encodec24kbps.wav

sample_202.wav

Ours@8kbps

03_final_nq=9.wav

03_final_nq=9.wav

sample_202 (1).wav