Wikispeech2 – A speech corpus collector for a more accessible Wikipedia through Wikispeech

The speech corpus collector is an initiative of Wikimedia Sweden, KTH and STTS AB to build a valuable resource for Swedish speech technology and to make Wikipedia more accessible. The purpose is to develop a set of tools that help Wikipedia volunteers collect large quantities of freely licensed recordings in Swedish, together with associated annotations.

The speech corpus collector will also be connected to the various platforms run by Wikimedia, so that the crowdsourced material automatically becomes available to a large audience. The recordings will be made available on Wikimedia Commons, the media database used by Wikipedia. Parts of the audio recordings will be added where they can be used in Wikipedia, for example to illustrate the pronunciation of article titles, or as recordings of entire articles created by volunteers.

The audio recordings will also be linked to relevant articles on Wikipedia, the free encyclopedia, as well as to words on Wiktionary, the free dictionary, to illustrate how each word is pronounced. In addition, the recordings will be linked to Wikidata, a structured database that is widely available to search and contribute to. All this data on Wikidata and Wikimedia Commons will be stored on, and made available through, the Wikimedia Foundation's servers at no cost to the project.

The project will develop the following components:

  • Recorder
    • Show prompt
    • Record speech
    • Play speech
  • Automatic validator
    • Signal quality validation
    • Normalization of sound level
    • Sanity check
  • Script creator
    • Script input
    • Script generator
  • Storage
    • Store recordings
    • Retrieve recording data
    • Upload data to permanent storage platforms
    • Discard recording data
    • Import speech data
  • Examiner
    • Speech recording selector
    • Show prompt
    • Playback recording
    • Rate recording
  • Annotator
    • Segmentation
    • Phonetic transcription
    • Converter for annotation format / standard
  • Publishers
    • Add pronunciation in Wikipedia article
    • Add pronunciation for words in Wiktionary
    • Add data to Wikidata
  • Exporter
    • Download as archive
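
As an illustration of what the "Automatic validator" component in the list above might involve, here is a minimal Python sketch of sound level normalization and a basic sanity check. It is not taken from the project: the function names, the thresholds (minimum duration, minimum RMS level, clipping ratio) and the choice of simple peak normalization are all assumptions made for the example, and samples are assumed to be floats in the range [-1.0, 1.0].

```python
import math
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    """Outcome of the (hypothetical) sanity check."""
    ok: bool
    reasons: list = field(default_factory=list)


def rms(samples):
    """Root-mean-square level of samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_peak(samples, target_peak=0.9):
    """Scale the samples so that the loudest one reaches target_peak.

    Peak normalization is an assumption here; a real system might
    normalize perceived loudness instead.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def validate(samples, sample_rate, min_seconds=0.5, min_rms=0.01,
             clip_level=0.999, max_clipped_ratio=0.01):
    """Reject recordings that are too short, too quiet, or clipped.

    All thresholds are illustrative assumptions, not project values.
    """
    reasons = []
    if len(samples) / sample_rate < min_seconds:
        reasons.append("too short")
    if rms(samples) < min_rms:
        reasons.append("too quiet")
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    if samples and clipped / len(samples) > max_clipped_ratio:
        reasons.append("clipped")
    return ValidationResult(ok=not reasons, reasons=reasons)


def make_tone(seconds=1.0, sample_rate=16000, freq=440.0, amplitude=0.3):
    """Generate a sine tone to stand in for a recording in examples."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]
```

In this sketch a recording would first pass `validate` and then be run through `normalize_peak` before being stored. Keeping the two steps separate means a rejected recording can be reported back to the volunteer with a reason, rather than silently altered.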

Joakim Gustafson (Project leader)

Funding: PTS (Post- och telestyrelsen, the Swedish Post and Telecom Authority)

Duration: 2019 - 2022