Sangam: A Confluence of Knowledge Streams

## SUPERSEDED: THIS DATASET HAS BEEN REPLACED. ## Noisy speech database for training speech enhancement algorithms and TTS models


dc.contributor EPSRC - Engineering and Physical Sciences Research Council
dc.contributor Valentini-Botinhao, Cassia
dc.creator Valentini-Botinhao, Cassia
dc.date 2016-03-22T11:04:35Z
dc.date.accessioned 2023-02-17T20:51:25Z
dc.date.available 2023-02-17T20:51:25Z
dc.identifier Valentini-Botinhao, Cassia. (2016). Noisy speech database for training speech enhancement algorithms and TTS models, [dataset]. University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/1356.
dc.identifier https://hdl.handle.net/10283/1942
dc.identifier https://doi.org/10.7488/ds/1356
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/243881
dc.description ## SUPERSEDED: THIS DATASET HAS BEEN REPLACED by the one available at https://doi.org/10.7488/ds/2117. ## Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48 kHz. A more detailed description can be found in the paper associated with the database. Some of the noises were obtained from the DEMAND database (http://parole.loria.fr/DEMAND/). The speech database was obtained from the Voice Banking Corpus (http://homepages.inf.ed.ac.uk/jyamagis/release/VCTK-Corpus.tar.gz).
dc.description The files are WAV-format audio data sampled at 48 kHz. Each file contains one sentence, recorded by one of a range of speakers in quiet studio conditions. This clean audio was added to a range of different noise signals to form the parallel noisy dataset. Each audio file is accompanied by a text file containing the orthographic transcription of what was said in that audio sample.
dc.format application/zip
dc.language eng
dc.publisher University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR)
dc.relation Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki and Junichi Yamagishi. 2016. "Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks" in Interspeech 2016.
dc.relation https://doi.org/10.7488/ds/2117
dc.rights Creative Commons Attribution 4.0 International Public License
dc.source http://parole.loria.fr/DEMAND/
dc.source http://homepages.inf.ed.ac.uk/jyamagis/release/VCTK-Corpus.tar.gz
dc.subject noisy speech
dc.subject speech enhancement
dc.subject speech synthesis
dc.subject Voice Bank Corpus
dc.subject Demand Corpus
dc.subject Mathematical and Computer Sciences::Speech and Natural Language Processing
dc.title ## SUPERSEDED: THIS DATASET HAS BEEN REPLACED. ## Noisy speech database for training speech enhancement algorithms and TTS models
dc.type dataset
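
The description above says the clean recordings were added to noise signals to form the noisy set. The following is a minimal sketch of that kind of additive mixing, not the procedure used to build this database: the record does not state how noise levels were chosen, so the target signal-to-noise ratio and the function name `mix_at_snr` are assumptions made for illustration.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech, scaled to reach a target SNR in dB.

    Hypothetical helper: both inputs are equal-length sequences of samples.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    # Average power of each signal.
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise signal is silent")
    # Choose gain g so that p_clean / (g**2 * p_noise) == 10 ** (snr_db / 10).
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noise = [0.5, 0.5, -0.5, -0.5]
    print(mix_at_snr(clean, noise, snr_db=0))
```

A lower `snr_db` gives a noisier mixture; the clean input is left unchanged, so each clean and noisy pair stays sample-aligned.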


Files in this item

File                      Size      Format
clean_testset_wav.zip     154.3Mb   application/zip
clean_trainset_wav.zip    861.5Mb   application/zip
noisy_testset_wav.zip     170.5Mb   application/zip
noisy_trainset_wav.zip    957.1Mb   application/zip
testset_txt.zip           370.4Kb   application/zip
trainset_txt.zip          6.223Mb   application/zip
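
Once the archives are unpacked, the clean audio, noisy audio, and transcripts can be lined up for training. The sketch below shows one way to do that with the Python standard library. It assumes that matching files share a file name across the three directories and that the audio is PCM WAV; the record does not state the naming layout, so the directory arguments and the helpers `pair_files` and `sample_rate` are hypothetical.

```python
import wave
from pathlib import Path


def pair_files(clean_dir, noisy_dir, txt_dir):
    """Yield (clean, noisy, transcript) paths that share a file stem.

    Assumes, for illustration, that a clean file `x.wav` has a noisy
    counterpart `x.wav` and a transcript `x.txt`.
    """
    clean_dir, noisy_dir, txt_dir = Path(clean_dir), Path(noisy_dir), Path(txt_dir)
    for clean in sorted(clean_dir.glob("*.wav")):
        noisy = noisy_dir / clean.name
        txt = txt_dir / (clean.stem + ".txt")
        # Skip any file that lacks a counterpart instead of failing.
        if noisy.is_file() and txt.is_file():
            yield clean, noisy, txt


def sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV header."""
    with wave.open(str(path), "rb") as w:
        return w.getframerate()
```

Checking that `sample_rate(path)` returns 48000 for each file is a cheap way to confirm the data matches the 48 kHz rate given in the description before training on it.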
