Kroos, Christian and Plumbley, Mark D. (2017) Learning the mapping function from voltage amplitudes to sensor positions in 3D-EMA using deep neural networks. In: Interspeech 2017, 20 - 24 August 2017, Stockholm, Sweden.


The first generation of three-dimensional Electromagnetic Articulography (3D-EMA) devices (Carstens AG500) suffered from occasional critical tracking failures. Although it has now been superseded by newer devices, the AG500 is still in use in many speech labs, and many valuable data sets exist. In this study, we investigate whether deep neural networks (DNNs) can learn the mapping function from raw voltage amplitudes to sensor positions based on a comprehensive movement data set. We compare this approach with previous methods, which arrive at individual position values sample by sample via direct optimisation. We found that, with appropriate hyperparameter settings, a DNN was able to approximate the mapping function with good accuracy, leading to a smaller error than the previous methods, but that the DNN-based approach was not able to solve the tracking problem completely.
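
To make the idea of a learned mapping concrete, here is a minimal sketch in plain Python. It is not the network from the paper, and nothing in it is taken from the paper: the layer sizes, learning rate, and the `toy_amplitudes` function are illustrative assumptions. It assumes six voltage amplitudes per sensor (one per transmitter coil) and five output values (three coordinates and two orientation angles), and it replaces the real electromagnetic field model with a hypothetical smooth toy function so that synthetic training pairs can be generated. A small one-hidden-layer network is then trained by stochastic gradient descent to map amplitudes back to poses.

```python
import math
import random

random.seed(0)

# Assumed sizes: 6 amplitudes in, 5 pose values out; hidden width is arbitrary.
N_IN, N_HID, N_OUT = 6, 12, 5


def rand_matrix(rows, cols, scale):
    """Matrix (list of rows) with entries drawn uniformly from [-scale, scale]."""
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical stand-in for the physical forward model (pose -> amplitudes).
# This is NOT the real EMA field equation, only a smooth toy function.
W_TRUE = rand_matrix(N_IN, N_OUT, 1.0)


def toy_amplitudes(pose):
    return [math.tanh(sum(w * p for w, p in zip(row, pose))) for row in W_TRUE]


# Synthetic (amplitudes, pose) training pairs.
poses = [[random.uniform(-1.0, 1.0) for _ in range(N_OUT)] for _ in range(64)]
data = [(toy_amplitudes(p), p) for p in poses]

# Network parameters: tanh hidden layer, linear output layer.
W1 = rand_matrix(N_HID, N_IN, 0.5)
b1 = [0.0] * N_HID
W2 = rand_matrix(N_OUT, N_HID, 0.5)
b2 = [0.0] * N_OUT


def forward(x):
    """Return hidden activations and predicted pose for one amplitude vector."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y


def mean_squared_error():
    total = 0.0
    for x, t in data:
        _, y = forward(x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / (len(data) * N_OUT)


def train(epochs=100, lr=0.02):
    """Per-sample gradient descent on half the squared error."""
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            err = [yi - ti for yi, ti in zip(y, t)]
            # Backpropagate to the hidden layer before W2 is modified.
            dh = [
                sum(err[k] * W2[k][j] for k in range(N_OUT)) * (1.0 - h[j] ** 2)
                for j in range(N_HID)
            ]
            for k in range(N_OUT):
                for j in range(N_HID):
                    W2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(N_HID):
                for i in range(N_IN):
                    W1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]


initial_loss = mean_squared_error()
train()
final_loss = mean_squared_error()
```

Once trained, calling `forward(amplitudes)[1]` yields a pose estimate in a single pass. The previous methods described above instead solve an optimisation problem for every sample.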
