Week 8: Generic feature extraction frontend in Wiktionary and Alignment API in pocketsphinx.js

This week brought several additions to the Wiktionary audio feature extraction for pronunciation. They are summarized below:

  1. Dynamic generation of alignment, all_phone and other_phone grammars for the words present in the dictionary (constants.js).
    • The alignment grammar predicts the actual phonemes in a word.
    • The all_phone grammar creates a triphone grammar in which the middle phoneme is replaced with every possible phoneme in the dictionary, including the target phoneme (e.g. phone1 [phoneA | phoneB | ... | phoneN] phone2).
    • The other_phone grammar creates a triphone grammar in which the middle phoneme is replaced with every phoneme except the target phoneme.
  2. Implemented three-stage audio processing to extract the features proposed for authentic pronunciation intelligibility (common.js).
  3. Implementing Pocketsphinx's state (word and phone) alignment search in pocketsphinx.js (WIP).
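
The three grammar types from item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in constants.js: the function names, the phone labels and the output string format (which simply follows the phone1 [phoneA | ... | phoneN] phone2 notation used above) are all assumptions.

```javascript
// Hypothetical helpers; names and output format are illustrative assumptions,
// not the actual contents of constants.js.

// Alignment grammar: the expected phoneme sequence of the word, in order.
function alignmentGrammar(phones) {
  return phones.join(" ");
}

// Builds "left [alt1 | alt2 | ...] right" around the phoneme at `index`.
// The left or right neighbour is omitted at the word boundaries.
function triphoneGrammar(phones, index, alternatives) {
  const parts = [];
  if (index > 0) parts.push(phones[index - 1]);
  parts.push("[" + alternatives.join(" | ") + "]");
  if (index < phones.length - 1) parts.push(phones[index + 1]);
  return parts.join(" ");
}

// All_phone grammar: the middle slot accepts every phoneme, target included.
function allPhoneGrammar(phones, index, phoneSet) {
  return triphoneGrammar(phones, index, phoneSet);
}

// Other_phone grammar: the middle slot accepts every phoneme except the target.
function otherPhoneGrammar(phones, index, phoneSet) {
  const target = phones[index];
  return triphoneGrammar(phones, index, phoneSet.filter((p) => p !== target));
}

// Example with a made-up three-phoneme word and a tiny phoneme set.
const examplePhones = ["K", "AE", "T"];
const examplePhoneSet = ["AE", "AH", "IY"];
console.log(alignmentGrammar(examplePhones));
console.log(allPhoneGrammar(examplePhones, 1, examplePhoneSet));
console.log(otherPhoneGrammar(examplePhones, 1, examplePhoneSet));
```

Comparing how well the audio scores against the all_phone and other_phone variants for the same slot is presumably what makes the two grammars useful together, which is why they differ only in whether the target phoneme is kept.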

Progress:
  • Built pocketsphinx.js using Emscripten
  • Modified the recognizer to accept 'alignment' as a processing job
  • Modifying the recognizer's JavaScript interface to post the audio as a web worker job
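
As a rough idea of what posting audio as a web worker job could look like, here is a small sketch. The message fields (command, job, id, transcript, data) and both function names are hypothetical and are not the actual pocketsphinx.js message protocol; only Worker.postMessage with a transfer list is standard browser API.

```javascript
// Hypothetical message shape for an 'alignment' job; the field names are
// assumptions for illustration, not the real pocketsphinx.js protocol.
function buildAlignmentJob(samples, transcript, jobId) {
  return {
    command: "process",
    job: "alignment",
    id: jobId,
    transcript: transcript,
    data: samples,
  };
}

// Posts the job to a worker. Passing samples.buffer in the transfer list
// hands the audio buffer over to the worker instead of copying it, so the
// caller should not reuse `samples` afterwards.
function postAlignmentJob(worker, samples, transcript, jobId) {
  const message = buildAlignmentJob(samples, transcript, jobId);
  worker.postMessage(message, [samples.buffer]);
  return jobId;
}
```

In a browser, `worker` would be a Worker instance wrapping the recognizer script, and the job id would let the page match the alignment result that comes back in the worker's message event to the request that produced it.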
