Week 8: Generic feature extraction frontend in Wiktionary and Alignment API in pocketsphinx.js
This week we have several additions to the Wiktionary audio feature extraction for pronunciation. They are summarized below:
- Dynamic generation of alignment, all_phone and other_phone grammars for the words present in the dictionary (constants.js):
  - The alignment grammar predicts the actual phonemes in a word.
  - The all_phone grammar creates a triphone grammar in which the middle phoneme is replaced by every possible phoneme in the dictionary, including the target phoneme (e.g. phone1 [phoneA | phoneB | ... | phoneN] phone2).
  - The other_phone grammar creates a triphone grammar in which the middle phoneme is replaced by every phoneme except the target phoneme.
- Implemented three-stage processing of audio to extract the features proposed for authentic pronunciation intelligibility (common.js).
- Integrating Pocketsphinx's state (word and phone) alignment search into pocketsphinx.js (WIP).
Progress:
- Built pocketsphinx.js using Emscripten
- Modified the recognizer to accept 'alignment' as a processing job
- Modifying the recognizer's JavaScript interface to post the audio as a Web Worker job
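
To make the three grammar types described above more concrete, here is a minimal sketch of how they could be generated for one word. It is not the code in constants.js: the phone inventory, the dictionary entry and all function names are hypothetical, the output simply follows the bracket notation used in this post, and the choice to drop the missing neighbour at word boundaries is an assumption.

```javascript
// Hypothetical phone inventory and dictionary (the real data lives in constants.js).
const PHONES = ["AE", "K", "T", "S"];
const DICT = { cat: ["K", "AE", "T"] };

// Alignment grammar: the expected phone sequence of the word.
function alignmentGrammar(word) {
  return DICT[word].join(" ");
}

// Build "left [c1 | c2 | ...] right" around the phone at `index`.
// Assumption: at a word boundary the missing neighbour is simply omitted.
function triphoneGrammar(phones, index, candidates) {
  const parts = [];
  if (index > 0) parts.push(phones[index - 1]);
  parts.push("[" + candidates.join(" | ") + "]");
  if (index < phones.length - 1) parts.push(phones[index + 1]);
  return parts.join(" ");
}

// All_phone grammar: the middle slot accepts every phone, including the target.
function allPhoneGrammar(word, index) {
  return triphoneGrammar(DICT[word], index, PHONES);
}

// Other_phone grammar: the middle slot accepts every phone except the target.
function otherPhoneGrammar(word, index) {
  const target = DICT[word][index];
  const others = PHONES.filter((phone) => phone !== target);
  return triphoneGrammar(DICT[word], index, others);
}
```

With this toy dictionary, allPhoneGrammar("cat", 1) gives "K [AE | K | T | S] T" and otherPhoneGrammar("cat", 1) gives "K [K | T | S] T", while alignmentGrammar("cat") gives "K AE T".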