| The Emu Speech Database System: Version 1.2 | ||
|---|---|---|
| | Chapter 14. Statistics and Classification Experiments | |
Speech data often has high dimensionality; a single spectrum, for instance, might consist of 257 values, each of which is a separate dimension. This presents two kinds of problem when carrying out analyses such as the Gaussian classification analysis discussed above. Firstly, the large amount of data means that processing takes a long time: tasks such as finding the covariance of a set of data become markedly more expensive as the number of dimensions increases. Secondly, and more importantly, data with a large number of dimensions requires a model with a similarly large number of dimensions; the more dimensions a model has, the more free parameters it has, and hence the more data is required to train it properly. Consequently, a useful approach is to reduce the dimensionality of the data with techniques such as principal components analysis (PCA) and canonical discriminant analysis.
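To give a feel for how quickly the number of free parameters grows, the following Python sketch counts the parameters of a single Gaussian model. It assumes a full (symmetric) covariance matrix plus a mean vector per class, which may not match the exact model used elsewhere in this chapter, and the function name `gaussian_free_parameters` is purely illustrative.

```python
def gaussian_free_parameters(dimensions):
    """Count free parameters of one Gaussian with a full covariance matrix.

    Assumption: one mean per dimension plus the distinct entries of a
    symmetric covariance matrix, i.e. d + d * (d + 1) / 2.
    """
    mean_terms = dimensions
    covariance_terms = dimensions * (dimensions + 1) // 2
    return mean_terms + covariance_terms


# Compare a few low-dimensional cases with a 257-point spectrum.
for d in (3, 20, 257):
    print(d, gaussian_free_parameters(d))
# 3 9
# 20 230
# 257 33410
```

Under this assumption, a model of a 257-point spectrum has more than thirty thousand parameters per class, while a 20-dimensional model has a few hundred.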
Splus provides a function, prcomp, which performs PCA on a set of data and can be used to good effect on speech data extracted using Emu. PCA seeks a set of orthogonal dimensions such that most of the variance in the data is concentrated in the first few. As a result, it is usually possible to keep only a smaller number of dimensions and get the same or better classification performance than with the original untransformed data. Smaller models can then be used, and these can be trained properly on smaller amounts of data.
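The idea behind PCA can be sketched in a few lines of plain Python. This is not how prcomp is implemented; it is a minimal illustration that finds the principal directions by power iteration on the covariance matrix, using a tiny made-up data set. All function names and the data are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Return the column means and the sample covariance matrix of rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return means, cov


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: largest eigenvalue and eigenvector of a symmetric matrix.

    Caveat: this fails if the start vector happens to be orthogonal to the
    leading eigenvector; that is acceptable for a sketch.
    """
    d = len(matrix)
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))
    return eigenvalue, v


def principal_components(rows, k):
    """Return the means and the first k (variance, direction) pairs."""
    means, cov = covariance_matrix(rows)
    d = len(cov)
    components = []
    for _ in range(k):
        variance, direction = leading_eigenpair(cov)
        components.append((variance, direction))
        # Deflate: remove the component just found before looking for the next.
        cov = [[cov[i][j] - variance * direction[i] * direction[j]
                for j in range(d)] for i in range(d)]
    return means, components


# Made-up two-dimensional data lying close to the line y = x.
data = [(2.0, 1.9), (4.0, 4.1), (6.0, 5.9), (8.0, 8.1)]
means, components = principal_components(data, 2)

total_variance = sum(variance for variance, _ in components)
explained = [variance / total_variance for variance, _ in components]
first_direction = components[0][1]

# Project each centred point onto the first component: one value per point.
scores = [sum((row[j] - means[j]) * first_direction[j]
              for j in range(len(means))) for row in data]

print("proportion of variance:", explained)
print("first direction:", first_direction)
print("one-dimensional scores:", scores)
```

For this data almost all of the variance falls on the first component, so each two-dimensional point can be replaced by a single score with very little loss. The same reasoning is what allows a 257-point spectrum to be reduced to a handful of dimensions.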
Canonical discriminant analysis (CDA) attempts to find a transformation of the data that maximises the separation between a pre-determined set of classes. The Emu function discrim provides an interface to the cda program supplied with Emu; it carries out a CDA on a matrix of data and a parallel label file. See the help file for discrim and the manual page for cda for more details.
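As an illustration of the principle only, the sketch below computes Fisher's linear discriminant for two classes in two dimensions, which is the simplest special case of a discriminant analysis of this kind. It does not reproduce the cda program or the discrim interface; the function names and the data are made up for the example.

```python
import math


def mean_vector(points):
    """Mean of a list of two-dimensional points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mean):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fisher_direction(class_a, class_b):
    """Unit vector w proportional to inverse(Sw) * (mean_a - mean_b)."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, mean_a), scatter(class_b, mean_b)
    # Pooled within-class scatter matrix.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0.0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (mean_a[0] - mean_b[0], mean_a[1] - mean_b[1])
    # Closed-form 2x2 inverse applied to the difference of the class means.
    w0 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    w1 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    norm = math.sqrt(w0 * w0 + w1 * w1)
    return (w0 / norm, w1 / norm)


def project(points, w):
    """Project each point onto the direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Made-up classes whose ranges overlap on each original axis.
class_a = [(1.0, 2.0), (2.0, 3.2), (3.0, 3.9), (4.0, 5.1)]
class_b = [(1.0, 0.1), (2.0, 0.9), (3.0, 2.2), (4.0, 2.9)]

w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)

print("discriminant direction:", w)
print("class a projections:", proj_a)
print("class b projections:", proj_b)
```

The two classes overlap on both of the original axes, yet their projections onto the single discriminant direction do not overlap at all. Unlike PCA, the transformation is chosen using the class labels, which is why a parallel label file is needed alongside the data matrix.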