## SUPERSEDED: THIS DATASET HAS BEEN REPLACED by the one which can be found at https://doi.org/10.7488/ds/2117. ##
Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48 kHz. A more detailed description can be found in the paper associated with the database.
Some of the noises were obtained from the DEMAND database, available here: http://parole.loria.fr/DEMAND/
The speech database was obtained from the Voice Banking Corpus, available here: http://homepages.inf.ed.ac.uk/jyamagis/release/VCTK-Corpus.tar.gz
The files are WAV format audio data sampled at 48 kHz. Each file contains a single sentence; the sentences were recorded by a range of speakers in quiet studio conditions. This clean audio material was added to a range of different noise signals to create the parallel noisy dataset. Each audio file is accompanied by a text file containing the orthographic transcription of what was said in that particular audio sample.
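
As a quick sanity check after downloading, the following minimal sketch reads one audio file together with its transcription and confirms the 48 kHz sampling rate. It assumes the audio is stored as uncompressed PCM WAV (which Python's standard `wave` module can read) and that the transcription is UTF-8 text; the function name `read_utterance` and the file names in the usage note are hypothetical and not part of the database.

```python
import wave
from pathlib import Path

# Sampling rate stated in the database description.
EXPECTED_RATE_HZ = 48_000


def read_utterance(wav_path, txt_path):
    """Return (sample_rate, duration_seconds, transcript) for one utterance.

    Hypothetical helper: assumes a PCM WAV file and a UTF-8 text file.
    """
    # Read only the header information; the samples themselves are not loaded.
    with wave.open(str(wav_path), "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        n_frames = wav_file.getnframes()

    # Fail loudly if the file does not match the documented sampling rate.
    if sample_rate != EXPECTED_RATE_HZ:
        raise ValueError(
            f"{wav_path}: expected {EXPECTED_RATE_HZ} Hz, got {sample_rate} Hz"
        )

    # The transcription file holds the orthographic text of the sentence.
    transcript = Path(txt_path).read_text(encoding="utf-8").strip()
    return sample_rate, n_frames / sample_rate, transcript
```

For example, `read_utterance("clean/example.wav", "txt/example.txt")` would return the sampling rate, the duration in seconds, and the sentence text, where both paths are placeholders for wherever the archive was unpacked.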