In this section, we provide reconstruction results from each domain to illustrate the impact of quantizer dropout. The animation below shows spectrograms of the reconstructed samples at varying bitrates. Each row corresponds to a different model trained with the labelled quantizer dropout rate.

<aside> <img src="/icons/exclamation-mark_gray.svg" alt="/icons/exclamation-mark_gray.svg" width="40px" /> Because of the large number of samples, this page may load slowly for some users.

</aside>

Animation demonstrating quantizer-dropout effect

Spectrograms of reconstructed audio with an increasing number of quantizers. Each row corresponds to a different model trained with the labelled quantizer dropout rate. The samples shown are taken from the first column of each domain in the audio section below.

<aside> <img src="/icons/exclamation-mark_gray.svg" alt="/icons/exclamation-mark_gray.svg" width="40px" /> The above image is a GIF file. Double-click the image to open it and replay the animation.

</aside>

Samples

We present reconstructed audio samples from models trained with different quantizer dropout rates. The reconstructions are computed at the full bitrate (8 kbps).
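For readers unfamiliar with the term, the following is a minimal sketch of what a quantizer dropout rate could mean during training. It assumes the rate is the per-example probability that the stack of residual quantizers is truncated to a random depth, with all quantizers active otherwise. The function name `sample_active_quantizers` and the choice of 9 quantizers are hypothetical and used only for illustration; this page does not specify the actual training code.

```python
import random


def sample_active_quantizers(n_quantizers, dropout_rate, batch_size, rng):
    """Pick how many quantizers each training example uses.

    Assumption (not stated on this page): with probability `dropout_rate`
    an example keeps only the first k quantizers, where k is drawn
    uniformly from 1..n_quantizers; otherwise it keeps all of them.
    """
    counts = []
    for _ in range(batch_size):
        if rng.random() < dropout_rate:
            # Truncate to a random depth (simulates a lower bitrate).
            counts.append(rng.randint(1, n_quantizers))
        else:
            # No dropout for this example: full depth.
            counts.append(n_quantizers)
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for rate in (0.0, 0.25, 0.5, 1.0):
        # n_quantizers=9 is an illustrative value, not taken from this page.
        print(rate, sample_active_quantizers(9, rate, 8, rng))
```

Under this assumption, a rate of 0.0 means the model always trains with every quantizer, while a rate of 1.0 means every example is trained at a randomly chosen depth.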

Speech


| Original Audio | sample_9.wav | sample_21.wav | sample_30.wav | sample_102.wav | sample_165.wav |
| --- | --- | --- | --- | --- | --- |
| Reconstructed Audio, dropout=0.0 | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav |
| Reconstructed Audio, dropout=0.25 | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav |
| Reconstructed Audio, dropout=0.5 | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav |
| Reconstructed Audio, dropout=1.0 (baseline) | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav |

Music


| Original Audio | sample_10.wav | sample_22.wav | sample_31.wav | sample_103.wav | sample_166.wav |
| --- | --- | --- | --- | --- | --- |
| Reconstructed Audio, dropout=0.0 | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav | 00_ablations_q_dropout-0.0.wav |
| Reconstructed Audio, dropout=0.25 | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav | 01_ablations_q_dropout-0.25.wav |
| Reconstructed Audio, dropout=0.5 | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav | 02_ablations_q_dropout-0.5.wav |
| Reconstructed Audio, dropout=1.0 (baseline) | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav | 03_ablations_baseline.wav |

Environmental Sounds