Week 4: Data from Mechanical Turk and Wiktionary interface

This week James collected a large number of audio recordings of the 82 most common English words from Mechanical Turk. I processed the collected samples to create training data for pronunciation evaluation. The processing steps are:

  1. Download the files from the 17zuoye server and name them appropriately (word-count, or file-id as per the CSV record).
  2. Adjust the level boost of each mp3 and convert it to wav format.
  3. Generate prompt files for each sample using previously written code.
  4. Decode each file using the -align, -neighbor and -words JSGF grammars.
  5. Generate standards.txt to get acoustic scores for each word (pending).
  6. Generate sliced data for DNN training (pending).
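
To make steps 1 and 2 more concrete, here is a minimal Python sketch of how the naming and conversion could be scripted. It is not the code I actually ran: the use of ffmpeg, the 6 dB gain, the 16 kHz mono output, and the helper names `wav_name`, `ffmpeg_command` and `convert` are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def wav_name(word: str, count: int) -> str:
    """Build a 'word-count' style file name, e.g. 'about-3.wav' (hypothetical scheme)."""
    return f"{word.lower()}-{count}.wav"


def ffmpeg_command(src: Path, dst: Path, gain_db: float = 6.0, rate: int = 16000) -> List[str]:
    """Return an ffmpeg command that boosts the level and writes a mono wav.

    The gain and sample rate defaults are assumed values, not the ones used for the real data.
    """
    return [
        "ffmpeg", "-y",                      # overwrite the output file if it exists
        "-i", str(src),                      # input mp3
        "-filter:a", f"volume={gain_db}dB",  # level boost in decibels
        "-ar", str(rate),                    # output sample rate in Hz
        "-ac", "1",                          # mono output
        str(dst),
    ]


def convert(src: Path, dst: Path) -> None:
    """Run the conversion; requires ffmpeg to be installed and on the PATH."""
    subprocess.run(ffmpeg_command(src, dst), check=True)
```

Calling `convert(Path("about.mp3"), Path(wav_name("about", 3)))` would then write `about-3.wav`. Building the command as a list, separately from running it, keeps the file names safe from shell quoting problems and makes the command easy to inspect before a large batch run.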

Apart from this, I also worked on getting the single-line recording interface onto a Wiktionary page. Here's a screenshot of the current UI.



Here are the publicly available scripts that modify the existing audio widget on Wiktionary. This will eventually be available as a gadget.

Wiktionary page used as a sandbox (the modifications apply only to my user account, so they won't be visible publicly):
https://en.wiktionary.org/wiki/User:Brijsri/pronunce

JavaScript:
https://en.wiktionary.org/wiki/User:Brijsri/common.js

CSS:
https://en.wiktionary.org/wiki/User:Brijsri/common.css


I am still working on getting the interactive recording apparatus ready using Matt Diamond's Recorder.js.

See you soon with more updates!
