Week 4: Data from Mechanical Turk and Wiktionary interface
This week James collected a large number of audio recordings of the 82 most common English words through Mechanical Turk. I processed the collected samples to create training data for pronunciation evaluation. The processing steps are:
- Download the files from the 17zuoye server and name them appropriately (word-count, or file-id as per the CSV record).
- Boost the level of each MP3 file and convert it to WAV format.
- Generate prompt files for each sample using previously written code.
- Decode each file using the -align, -neighbor and -words JSGF grammars.
- Generate standards.txt to get acoustic scores for each word (pending)
- Generate sliced data for DNN training (pending)
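The naming and conversion steps above could be sketched roughly as follows. This is only a minimal illustration, not the actual processing script: the helper names `sample_name` and `ffmpeg_command`, the 6 dB gain, and the 16 kHz mono output are all assumptions made for the example, and it presumes ffmpeg is the conversion tool.

```python
from __future__ import annotations

from pathlib import Path


def sample_name(word: str, count: int) -> str:
    """Build a 'word-count' base name (hypothetical naming helper)."""
    return f"{word.lower()}-{count}"


def ffmpeg_command(mp3_path: Path, gain_db: float = 6.0, rate: int = 16000) -> list[str]:
    """Return an ffmpeg command that boosts the level and writes a mono WAV.

    gain_db and rate are illustrative defaults, not values from this project.
    """
    wav_path = mp3_path.with_suffix(".wav")
    return [
        "ffmpeg", "-y",                      # overwrite existing output
        "-i", str(mp3_path),                 # input MP3
        "-filter:a", f"volume={gain_db}dB",  # level boost
        "-ac", "1",                          # mono
        "-ar", str(rate),                    # sample rate in Hz
        str(wav_path),
    ]


if __name__ == "__main__":
    mp3 = Path(sample_name("About", 3) + ".mp3")
    # Print the command; pass the list to subprocess.run to actually convert.
    print(" ".join(ffmpeg_command(mp3)))
```

Building the command as a list keeps the conversion easy to check before running it over the whole batch of samples.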
Apart from this, I also worked on adding the single-line recording interface to a Wiktionary page. Here's a screenshot of the current UI.
Here are the publicly available scripts that modify the existing audio widget on Wiktionary. This will eventually appear as a gadget.
Wiktionary page used as a sandbox (the modifications apply only to my user account, so they are not publicly visible):
https://en.wiktionary.org/wiki/User:Brijsri/pronunce
JavaScript:
https://en.wiktionary.org/wiki/User:Brijsri/common.js
CSS:
https://en.wiktionary.org/wiki/User:Brijsri/common.css
I am still working on getting the interactive recording setup ready using Matt Diamond's Recorder.js.
See you soon with more updates!