USING CONTEXT SPECIFIC GENERATIVE ADVERSARIAL NETWORKS FOR AUDIO DATA COMPLETION: MUSICAL INSTRUMENTS CASE STUDY
Abstract
Audio quality plays an essential role in applications ranging from music to voice conversations. Audio data can lose quality through causes such as intermittent network connections or storage corruption. Recent approaches have turned to generative adversarial networks (GANs) for audio reconstruction, motivated by their successful deployment in visual applications. However, audio datasets often include sounds from different contexts, which increases the complexity of the patterns to be learned and leads to sub-optimal reconstruction quality. We propose a novel audio completion pipeline that clusters audio by similarity and trains a dedicated GAN for each context separately. The proposed technique is compared with the traditional method of training one general GAN on the task of completing 200 ms missing segments of 1-second audio samples. Experimental results on a public benchmark dataset show that using specialized GANs led to a clear improvement in completion quality while reducing training convergence times.
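The data-preparation side of the pipeline described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 16 kHz sample rate, the coarse spectral feature, the plain k-means clustering, and all function names are hypothetical, since the abstract does not specify them. GAN training itself is omitted; the sketch only shows how 1-second samples might be grouped into contexts and how a 200 ms gap might be cut out as a completion target.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumption: the abstract does not state a sample rate
GAP_MS = 200          # gap length taken from the abstract


def mask_gap(sample, start, sample_rate=SAMPLE_RATE, gap_ms=GAP_MS):
    """Zero out a gap starting at `start`; return (masked copy, removed segment)."""
    gap_len = sample_rate * gap_ms // 1000
    masked = sample.copy()
    target = sample[start:start + gap_len].copy()
    masked[start:start + gap_len] = 0.0
    return masked, target


def spectral_features(sample, n_bins=32):
    """Hypothetical similarity feature: coarse log-magnitude spectrum."""
    mag = np.abs(np.fft.rfft(sample))
    bands = np.array_split(mag, n_bins)
    return np.log1p(np.array([b.mean() for b in bands]))


def assign(features, centers):
    """Index of the nearest center for every feature vector."""
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(features, k, n_iter=20, seed=0):
    """Plain k-means, standing in for whatever clustering the paper uses."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(features, centers)
        for j in range(k):
            if np.any(labels == j):  # leave a center unchanged if its cluster is empty
                centers[j] = features[labels == j].mean(axis=0)
    return assign(features, centers), centers


# Synthetic stand-ins for two "contexts": low-pitched and high-pitched tones.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE  # one second of audio
low = [np.sin(2 * np.pi * f * t) for f in np.linspace(110.5, 190.5, 5)]
high = [np.sin(2 * np.pi * f * t) for f in np.linspace(3050.5, 3170.5, 5)]
samples = np.stack(low + high)

feats = np.stack([spectral_features(s) for s in samples])
labels, centers = kmeans(feats, k=2)

# Each group would feed its own specialized GAN (training not shown here).
groups = {j: samples[labels == j] for j in range(2)}

# Build one (input, target) completion pair from the first sample.
masked, target = mask_gap(samples[0], start=6400)
```

In this sketch a new sample would be routed to a specialized model by computing its features and calling `assign` against the stored `centers`; the general-GAN baseline corresponds to skipping the clustering step and training on `samples` as a whole.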
DOI/handle
http://hdl.handle.net/10576/45075
Collections
- Computing [100 items]