Learning the Hidden Structure of Speech
Abstract: In the work described here, we apply the back-propagation neural network learning procedure to the analysis and recognition of speech. Because this learning procedure requires only examples of input-output pairs, it is not necessary to provide it with any initial description of speech features. Rather, the network develops its own set of representational features during the course of learning. A series of computer simulation studies was carried out to assess the ability of these networks to accurately label sounds; to learn to recognize sounds without labels; and to learn feature representations of continuous speech. These studies demonstrated that the networks can learn to label pre-segmented sound tokens with accuracies of up to 95%. Networks trained on segmented sounds using a strategy that requires no external labels were able to recognize and delineate sounds in continuous speech. These networks developed rich internal representations that included units which corresponded to such traditional distinctions as vowels and consonants, as well as units which were sensitive to novel and non-standard features. Networks trained on a large corpus of unsegmented, continuous speech without labels also developed interesting feature representations that may be useful in both segmentation and label learning.
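The core technique the abstract names, back-propagation learning from input-output pairs alone, can be illustrated with a toy sketch. The network below is not the paper's speech model: `TinyNet`, the layer sizes, the XOR task, and all hyperparameters are illustrative assumptions. It only demonstrates the general idea that a hidden layer develops its own features from example pairs, with no hand-specified feature description.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    """Minimal one-hidden-layer back-propagation network (illustrative only)."""

    def __init__(self, n_in, n_hid, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the incoming weights for one unit, plus a bias weight.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(n_hid + 1)] for _ in range(n_out)]

    def forward(self, x):
        self.h = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in self.w1]
        self.o = [sigmoid(sum(w * v for w, v in zip(ws, self.h + [1.0]))) for ws in self.w2]
        return self.o

    def train_step(self, x, target, lr=0.5):
        o = self.forward(x)
        # Output-layer deltas: squared-error loss, sigmoid derivative o * (1 - o).
        d_out = [(t - oi) * oi * (1 - oi) for t, oi in zip(target, o)]
        # Hidden-layer deltas, back-propagated through the (pre-update) w2 weights.
        d_hid = [hj * (1 - hj) * sum(d * self.w2[k][j] for k, d in enumerate(d_out))
                 for j, hj in enumerate(self.h)]
        for k, d in enumerate(d_out):
            for j, hj in enumerate(self.h + [1.0]):
                self.w2[k][j] += lr * d * hj
        for j, d in enumerate(d_hid):
            for i, xi in enumerate(x + [1.0]):
                self.w1[j][i] += lr * d * xi

def total_loss(net, data):
    return sum(sum((t - o) ** 2 for t, o in zip(tg, net.forward(x)))
               for x, tg in data)

# XOR: a task solvable only if the hidden layer learns useful features,
# standing in for the idea that no features are specified in advance.
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
net = TinyNet(2, 3, 1, seed=1)
initial_loss = total_loss(net, data)
for epoch in range(3000):
    for x, tg in data:
        net.train_step(x, tg)
final_loss = total_loss(net, data)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}")
```

After training, the hidden units' weight vectors are the "features" the network discovered on its own, the toy analogue of the vowel- and consonant-sensitive units the paper reports.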