dp Perform dynamic programming on label sequences

Table of Contents


dp Usage: [-vocab string] [-place_holder string] [-show_cost ] [-o string] [-p int] [-i float] [-d float] [-s float] [-cost_matrix string]

dp provides simple dynamic programming to find the lowest cost alignment of two symbol sequences. Possible uses include:

  • Label alignment (e.g. speech recogniser output scoring)

The costs of inserting/deleting/substituting symbols can be given either with command line options, or as a file containing a full matrix of costs. In the former case, all insertions will have the same cost. In the latter case, each row of the file contains the cost of replacing a symbol in sequence 1 with each possible symbol in the vocabulary to align with sequence 2, including a special "place holder" symbol used for insertions/deletions. See the examples below. The place holder can be redefined. The output is an EST utterance with three Relations: the first two are the input sequences, and the third shows the alignment found by dynamic programming.



string file containing vocabulary


string which vocab item is the place holder (default is )


show cost of matching path


string output file


int 'beam' width Either:


float insertion cost


float deletion cost


float substitution cost Or:


string file containing cost matrix


Align two symbol sequences:

$ dp -vocab vocab.file "a b c" "a b d c" -i 1 -d 2 -s 3

where vocab.file contains "a b c d"

Or, using a full cost matrix:

$ dp -vocab vocab2.file -cost_matrix foo "a b c" "a b d c"

where vocab2.file contains "a b c d " and the file foo contains:

0 3 3 3 2

3 0 3 3 2

3 3 0 3 2

3 3 3 0 2

1 1 1 1 0

Each row of foo shows the cost of replacing an input symbol with each symbol in the vocabulary to match an output symbol. Each row corresponds to an item in the vocabulary (in the order they appear in the vocabulary file). In the example, replacing 'a' with 'a' costs 0, replacing 'a' with any of 'b' 'c' or 'd' costs 3 (a substitution), and replacing 'a' with the place holder symbol 'null' costs 2 (a deletion). The cost of replacing 'null' with anything other than 'null' costs 1 (an insertion). The costs of 1,2 and 3 used here are only for illustration. The cost matrix meed not have the form above - for example, replacing 'a' with 'a' need not cost 0. The entries in foo are read as floats.