Using the Database

Having created the database template file, we can now begin to put the database to use. Since we have only one level of labels for each utterance, we can only do simple sequence queries, but this simple case will serve to illustrate the main features of the Emu system.

What Utterances?

First, we can find out which utterances are included in our database. This is a good check that the format of the template file is correct and that the directory paths have been configured correctly. To get a list of utterances, use the command line tool utterances as follows (here and elsewhere in this manual we give example commands for both Unix and Windows 95 where they differ):

unix% utterances test lab
windows:> utters test lab

This asks for a list of utterances from the test database (and so will use the template file test.tpl) using the file extension lab to search for files. The result should look like this:

msajc001
msajc002
msajc003
msajc004
msajc005
msajc006
msajc007
msajc008
msajc009

In this simple case the utterance names are the same as the file names without the extension; in other databases they can be more complicated (see the section on Template Path Definitions). If you don't get a list of utterances like this, check the format of the template file, in particular the path definitions, which tell Emu where the database files are located.
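
If the list is empty or incomplete, an independent check is to list the label files yourself. The following sketch is not part of Emu. It assumes only what holds for this example database: each utterance name is a file name with the lab extension removed. The function name utterance_names and its arguments are illustrative.

```python
from pathlib import Path


def utterance_names(label_dir, extension="lab"):
    """Return the sorted names of files in label_dir with the extension removed."""
    directory = Path(label_dir)
    # Path.stem drops the final suffix, so "msajc001.lab" becomes "msajc001".
    return sorted(p.stem for p in directory.glob(f"*.{extension}") if p.is_file())
```

Calling utterance_names on the directory named in the template file's path statement should give the same names as the utterances tool. If it does not, the path definition is the first thing to check.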

A Simple Query

We can now query the database to find, for example, all instances of the vowel A. This is a simple query which can be carried out from the command line as follows[1]:

unix% emuquery test '*' 'Phonetic=A' 
windows:> emuquery test * "Phonetic=A"

In the above example, we are searching for all Phonetic level segments labelled A in all the utterances (hence the * which is a wildcard matching all utterance names) of the test database. The result should look like this:

database:this
query:Phonetic=A
type:segment
#
A       1401.15 1495.15 msajc005
A       1835.15 1964.15 msajc005
A       1647.88 1732.63 msajc008
A       3028.99 3098.2  msajc008

This is called a segment list and is a very important structure in Emu. It contains a short header giving information on the database and query that were used to construct it, followed by a list of the label, start time, end time and utterance name of each segment which satisfied the query. Start and end times are in milliseconds.
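
Because the segment list is plain text, it is easy to read into other tools. The sketch below parses the layout described above: header lines of the form key:value, a line containing only #, then one segment per line. It is an illustration, not Emu code. The names Segment and parse_segment_list are hypothetical, and the parser assumes that labels and utterance names contain no whitespace.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    label: str
    start_ms: float
    end_ms: float
    utterance: str


def parse_segment_list(text):
    """Split segment-list text into (header dict, list of Segment)."""
    header = {}
    segments = []
    in_body = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not in_body:
            if line == "#":
                # The "#" line separates the header from the segments.
                in_body = True
            else:
                key, _, value = line.partition(":")
                header[key] = value
            continue
        # Assumption: four whitespace-separated fields per segment line.
        label, start, end, utterance = line.split()
        segments.append(Segment(label, float(start), float(end), utterance))
    return header, segments


example = """\
database:this
query:Phonetic=A
type:segment
#
A       1401.15 1495.15 msajc005
A       1835.15 1964.15 msajc005
"""
header, segments = parse_segment_list(example)
# Segment durations in milliseconds, rounded to hide floating-point noise.
durations = [round(s.end_ms - s.start_ms, 2) for s in segments]
print(header["query"], durations)
```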

If you don't see a result like this, it means that Emu couldn't read the Phonetic label files for some reason: check that the path statement in the template file is appropriate for the directory where these files are located.

More complex queries are possible with Emu; later chapters will discuss using the graphical interface to the query tool, which allows you to build up a custom utterance list and to view the utterance from which each segment came with a click of the mouse. With a simple database like this one, only single-level queries are possible; for instance, we could search for A segments followed by nasals with the query [Phonetic=A -> Phonetic=n|m|N]. To take advantage of the full capabilities of Emu, we need to superimpose a hierarchical description of the utterance on the Phonetic level labels, including, for example, syllable, word and phrase level structure. These topics are covered in later chapters.

Extracting Data

To complete our simple example we will now extract speech data for each of the segments which matched the query above. To do this we must first save the segment list in a file: the easiest way is to use the -o option to emuquery:

unix% emuquery -o test.seg test '*' 'Phonetic=A' 
windows:> emuquery -o test.seg test * "Phonetic=A"

Now the segment list will be written to the file test.seg instead of being printed to the screen.

Using this segment list, we will now extract formant data at the midpoint of each vowel. The command is:

unix% get_track -c 0.5 test.seg fm test
windows:> gettrack -c 0.5 test.seg fm test

This command requests data on the track fm (formants) for the segments listed in test.seg, cut at the midpoint of each segment (the -c 0.5 part), and stores the results in a file with the basename test. It should produce a file test.dat which contains the following:

567.954 1470.69 2222.27 2988.12
543.007 1599.88 2362.44 3485.31
491.082 1580.24 2320.8  3390.59
536.617 1604.64 2308.16 3355.14

These are the four formant values at the midpoint of each segment.
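
A file in this layout can be read with a few lines of code. The sketch below parses the rows shown above and averages the first column. It is an illustration only: the function name parse_track_rows is hypothetical, and treating the columns as the first to fourth formants in that order is an assumption, since the file itself carries no column labels.

```python
def parse_track_rows(text):
    """Parse whitespace-separated numeric rows into one list of floats per line."""
    return [[float(v) for v in line.split()]
            for line in text.splitlines() if line.strip()]


dat_text = """\
567.954 1470.69 2222.27 2988.12
543.007 1599.88 2362.44 3485.31
491.082 1580.24 2320.8  3390.59
536.617 1604.64 2308.16 3355.14
"""
rows = parse_track_rows(dat_text)
# Assumption: one row per segment, columns ordered from first to fourth formant.
first_column = [row[0] for row in rows]
mean_first = sum(first_column) / len(first_column)
print(f"{len(rows)} segments, mean of first column: {mean_first:.1f}")
```

Since there is one row per segment, the rows can be paired with the entries of the segment list, provided the two files are in the same order.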

If you want to extract all the data for each segment, omit the -c 0.5 option to get_track. In this case the result is two files: one containing the data, with the extension dat, and a second, with the extension tim, containing three columns: the number of samples for the segment, the time of the first sample and the time of the last sample. For example, if we rerun the command above without -c 0.5, the file test.tim contains:

27      1404.5  1539.5
33      1839.5  2004.5
23      1649.5  1764.5
26      3029.5  3159.5

This tells us that the first segment contains 27 samples which start at time 1404.5 ms and end at time 1539.5 ms. Note that these times are not the same as those in the segment list (1401.15 and 1541.15) since they correspond to the time of the first and last samples for this track. If we extracted the sampled speech data (which has a much higher sample rate) we would get different times yet again. Emu extracts data from the first sample after the requested start time to the last sample before the requested end time.
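
The sample counts in the tim file are what allow the data file to be divided back into segments. The sketch below shows one way to do this. It assumes that the dat file holds one row per sample, with the segments stored one after another in segment-list order; this layout is an assumption and should be checked against your own files. The names parse_tim and split_by_counts are hypothetical, and the data rows here are placeholders rather than real formant values.

```python
def parse_tim(text):
    """Parse tim rows into (sample_count, first_time_ms, last_time_ms) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            count, first, last = line.split()
            entries.append((int(count), float(first), float(last)))
    return entries


def split_by_counts(rows, counts):
    """Cut a flat list of rows into consecutive chunks of the given sizes."""
    if sum(counts) != len(rows):
        raise ValueError("sample counts do not add up to the number of data rows")
    chunks, position = [], 0
    for count in counts:
        chunks.append(rows[position:position + count])
        position += count
    return chunks


tim_text = """\
27      1404.5  1539.5
33      1839.5  2004.5
"""
entries = parse_tim(tim_text)
counts = [count for count, _, _ in entries]
# Placeholder rows standing in for the contents of the dat file.
rows = [[float(i)] for i in range(sum(counts))]
chunks = split_by_counts(rows, counts)
print([len(chunk) for chunk in chunks])
```

The length check in split_by_counts guards against pairing a tim file with a dat file from a different run.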

Once extracted, this data can be passed to your favourite analysis tool for further work. For example, you could load it into a spreadsheet or data plotting package. Emu provides a set of extensions for two statistical packages (Xlisp-Stat and Splus) which are tailored to speech research. Details of these are provided in a later section.

Notes

[1]

Note that the * needs to be quoted in Unix to prevent it being expanded by the shell. It is good practice to also quote the query since it may contain embedded spaces -- note though that on Windows 95 double quotes must be used.