Ellipse plotting

Ellipse plotting is useful for examining how different kinds of segments are distributed on two acoustic parameters. For example, to make an ellipse plot of the segments used in the previous examples on the parameters F1 and F2, we do the following:


eplot(vals.f[,1:2], l.segs, formant=T, centroid=T) 

    

Figure 11.2. An ellipse plot.


The result is shown in Figure 11.2, “An ellipse plot”. Note that the first argument to eplot must be a two-column matrix of data. In the example above this is a subset of the data returned by the emu.track function. To plot, say, the F1 values against the durations of the segments, we would need to construct a two-column matrix with the cbind function:


durvals <- mudur(segs)                       # get segment durations
dur.f.vals <- cbind(vals.f[,1], durvals)     # make a 2 column matrix
eplot(dur.f.vals, l.segs)

     

There are a number of optional arguments to eplot which are documented in . Some of these options are discussed here.

The ellipses can be annotated with one label at the centroid of each ellipse using the centroid=T option, as in the first example. Alternatively, all of the points can be plotted using their label as the plot character (dopoints=T). The size of the ellipse plotted for each group is determined by the nsdev argument. With the default value of 2.447747, each ellipse has a radius of 2.447747 standard deviations, which includes roughly 95% of the data points if the data are approximately normally distributed. Smaller values of nsdev give smaller ellipses.
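
The default value of nsdev can be related to the 95% figure with a short calculation. The following is a minimal sketch in base R, assuming bivariate normal data; the simulated matrix x and the variable names are hypothetical and are not part of the Emu library. For such data the squared Mahalanobis distance from the centroid follows a chi-squared distribution with 2 degrees of freedom, so the radius that covers 95% of the points is the square root of the 0.95 quantile.

```r
# Radius (in standard deviations) of the ellipse that covers 95% of a
# bivariate normal distribution.
nsdev95 <- sqrt(qchisq(0.95, df = 2))
print(nsdev95)

# Empirical check on simulated (hypothetical) two-column data.
set.seed(42)
x <- cbind(rnorm(10000), rnorm(10000))

# mahalanobis() returns the SQUARED distance of each row to the centroid.
d2 <- mahalanobis(x, colMeans(x), cov(x))

# Proportion of points falling inside the ellipse of radius nsdev95.
coverage <- mean(d2 <= nsdev95^2)
print(coverage)
```

The first value agrees with the default of 2.447747 to six decimal places, and the coverage on the simulated data should be close to 0.95.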

The axes argument defaults to TRUE, which means that axes will be drawn on the plot. If axes is set to FALSE, the axes are not drawn; this allows ellipses of larger or smaller standard deviations to be superimposed on the same plot. When doing this, make sure that the axis ranges are set to the same values for successive plots (use xlim and ylim), and use par(new=T) after each plot to superimpose the next plot on the same display. For example, the following instructions, whose result is shown below, superimpose ellipses of 1, 1.5, and 2 standard deviations on the same data.

eplot(dur.f.vals, l.segs, nsdev=1, axes=F, xlim=c(200, 500), ylim=c(0, 200)) 
par(new=T)
eplot(dur.f.vals, l.segs, nsdev=1.5, axes=F, xlim=c(200, 500), ylim=c(0, 200))
par(new=T) 
eplot(dur.f.vals, l.segs, nsdev=2,  xlim=c(200, 500), ylim=c(0, 200), 
        dopoints=T, xlab="F1", ylab="duration (ms)")
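
To make the geometry of these superimposed ellipses concrete, here is a sketch in base R that draws outlines at 1, 1.5, and 2 standard deviations for a single group. It is an illustration under stated assumptions, not eplot's own implementation: the data in xy are simulated, and ellipse.outline is a hypothetical helper that scales a unit circle by the Cholesky factor of the covariance matrix and shifts it to the centroid.

```r
# Hypothetical data standing in for one group of F1 (Hz) and duration (ms) values.
set.seed(1)
xy <- cbind(F1 = rnorm(50, mean = 350, sd = 40),
            dur = rnorm(50, mean = 100, sd = 25))

# Outline of the ellipse lying nsdev standard deviations from the centroid.
ellipse.outline <- function(data, nsdev, n = 100) {
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol() returns an upper-triangular R with t(R) %*% R equal to cov(data),
  # so every transformed point has Mahalanobis distance nsdev from the centroid.
  outline <- nsdev * circle %*% chol(cov(data))
  sweep(outline, 2, colMeans(data), "+")
}

# One set of axes, then one outline per value of nsdev.
plot(xy, xlim = c(200, 500), ylim = c(0, 200),
     xlab = "F1", ylab = "duration (ms)")
for (k in c(1, 1.5, 2)) lines(ellipse.outline(xy, k))
```

Because the outlines are added with lines() to an existing plot, this sketch needs neither par(new=T) nor repeated axis limits; with eplot itself, the approach described above still applies.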

The formant argument is used to invert the formant axes to create a plot from F1 and F2 frequencies in the formant plane. For example, compare the following two types of plot:

par(mfrow=c(1,2))                # partition the plot window
vowsegs <- emu.query("demo", "*", "Phonetic=i:|o:|A") 
l.vowsegs <- label(vowsegs)
formdat <- emu.track(vowsegs, "fm", cut=0.5) 
eplot(formdat[,1:2], l.vowsegs, dopoints=T)             
eplot(formdat[,1:2], l.vowsegs, dopoints=T, formant=T) 

Both plots show the same data. The one on the left is of three vowels in the F1/F2 plane. The plot on the right rotates and inverts the axes so that the origin is in the top right-hand corner, with F1 plotted on the y-axis and F2 on the x-axis (the default axis labels are also set to F1 and F2). The position of the vowels now corresponds roughly to their expected position in the vowel quadrilateral, i.e. with [i:] vowels in the top left corner, [A] vowels (an open vowel in Australian English) towards the bottom and [o:] vowels (a close back vowel in Australian English) in the top right corner.
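
The effect of the axis inversion can be imitated in base R by reversing the axis limits. The sketch below is only an approximation of what formant=T does: the centroid values in f1 and f2 are hypothetical round numbers, not measurements from the demo database, and the variable names are invented for the illustration.

```r
# Hypothetical formant centroids (Hz) for three vowels.
f1 <- c("i:" = 300, "o:" = 400, "A" = 750)
f2 <- c("i:" = 2300, "o:" = 800, "A" = 1300)

# Reversed limits place the origin in the top right-hand corner.
f2.lim <- rev(range(f2))
f1.lim <- rev(range(f1))

# F2 on the x-axis, F1 on the y-axis, both running from high to low.
plot(f2, f1, type = "n", xlim = f2.lim, ylim = f1.lim,
     xlab = "F2", ylab = "F1")
text(f2, f1, labels = names(f1))
```

With these values the label for [i:] appears at the top left, [o:] towards the top right and [A] at the bottom, matching the layout described above.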

Finally, the classify argument performs a Gaussian classification on all the tokens: a calculation is made of the Mahalanobis distance of each token to the ellipse centroids; the label of the closest centroid is returned. The results of the classification are returned as a confusion matrix. (Further details on Gaussian classification are given in .)

eplot(formdat[,1:2], l.vowsegs, dopoints=T, scaling="bark", classify=T)

      
   o: i:  A 
o: 14  0  0
i:  0 36  4
 A  0  0 15
      
     

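The rows of the confusion matrix are the original labels and the columns are the labels assigned by the classification. The matrix shows that 36 [i:] tokens are (probabilistically) closest to the centroid of the [i:] ellipse, and 4 are closest to the centroid of the [A] ellipse; all of the [o:] and [A] tokens are closest to the centroid of their own ellipse.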
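
The classification step described above can be sketched in base R as follows. This is a minimal illustration, not the code eplot runs: the data are simulated, the names dat, labs, and confusion are hypothetical, and it is an assumption that each group's own covariance matrix is used when measuring the distance to that group's centroid.

```r
set.seed(1)

# Hypothetical two-parameter data for three groups (not the demo database).
n <- 30
means <- list("i:" = c(300, 2300), "o:" = c(400, 800), "A" = c(750, 1300))
labs <- rep(names(means), each = n)
dat <- do.call(rbind, lapply(names(means), function(l)
  cbind(rnorm(n, means[[l]][1], 40), rnorm(n, means[[l]][2], 120))))

groups <- unique(labs)

# Squared Mahalanobis distance from every token to each group's centroid,
# measured with that group's own covariance matrix (an assumption).
d2 <- sapply(groups, function(g) {
  x <- dat[labs == g, , drop = FALSE]
  mahalanobis(dat, colMeans(x), cov(x))
})

# Each token receives the label of the closest centroid.
predicted <- groups[apply(d2, 1, which.min)]

# Rows: original labels; columns: labels assigned by the classification.
confusion <- table(actual = factor(labs, levels = groups),
                   predicted = factor(predicted, levels = groups))
print(confusion)
```

The counts on the diagonal of the table are the tokens that lie closest to the centroid of their own group; off-diagonal counts are tokens that lie closer to another group's centroid.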