代做(520|600).666 Information Extraction Homework # 6调试R语言程序
- 首页 >> OS编程(520j600).666 Information Extraction
Homework # 6
Due Friday, April 26, 2024.
Connectionist Temporal Classiication
Consider the task of recognizing an M length sequence of tokens, y1(M), from a T length input
x1(T). The CTC objective function is one objective for such sequence transduction tasks which
works by converting y1(M) into an alignment sequence s1(T) of the same length as x1(T) by using
an additional symbol to ill the extra space. Note that it assumes T ≥ M. β -1 (y1(M))
represents the set of all possible T-length alignments, s1(T), for the sequence y1(M) .
We will consider that p (st jx1(T))is obtained by normalizing the outputs of a neural network,
φ . Element, φs(t)t / p (st jx1(T)), of the matrix, φ, can be used to compute the CTC objective
where
and jSj is the number of output units in the neural network.
1. WFST Representation of the CTC Objective
Consider the task of modeling the word “all” using characters as the tokens. In this case M = 3. For this question, we will model words using both the Hybrid DNN-HMM, and CTC models and examine the diferent unweighted (i.e., the weights on each arc are 1.0) WFST topologies used for each. Remember that we can represent an HMM as a WFST. Remember that inal states are indicated by using a double circle instead of a single circle around the state label. Mark all the input and output labels.
(a) Draw the WFST for the word “all” corresponding to using a 1-state HMM for each letter and uniform state transitions. Assume the output labels could be fed into a pronunciation lexicon to recover the word, i.e., a - l - l ! all. In this case each state roughly represents a letter. The arcs accept symbols that the neural network could predict and produce the letters such that for any alignment of symbols a, l, l, i.e., a a a l l l l l, the letters, a-l-l, corresponding to the word “all” will be produced.
(b) Draw the WFST corresponding to the CTC topology. Recall that it should be able to accept strings starting or ending with .
(c) Enumerate all possible length-5 alignments accepted by the “all” HMM topol- ogy, e.g., a a l l would be a length-4 sequence.
(d) Enumerate all possible length-5 alignments of the word “all” accepted by the CTC topology
(e) Draw the ”HMM” trellis, often called a lattice, corresponding to all possible length 5 sequences, but as a WFST. (It should still look very similar to a normal trellis). Ignore the weights on the arcs. Do not draw any unnecessary (unused) nodes. Please include input and output labels.
(f) Repeat (e) but for the CTC trellis, corresponding to all possible length 5 se- quences as a WFST.
2. WFST Composition
The outputs of a neural network, φ, can themselves be represented asa WFST. Imagine the neural network has 4 outputs. In other words, it outputs vectors φt 2 R1 x 4 . The irst output corresponds to the 含 symbol, the second symbol corresponds to a, the third symbol to b and the fourth to l.
For this problem consider the matrix of scores, φ, where each column corresponds to a time index, and each row corresponds to one of the network outputs. These are like to observation probabilities in HMMs. We can draw this matrix as a WFST with 6 states corresponding to the length of the neural network output (5+1, where +1 is for a start state). Between each node there are as many arcs as there are output symbols (i.e., rows in the matrix). The input and output labels for this WFST will be the same (i.e., it is a WFSA), and the weights on the arcs correspond to the scores of those symbols at that position in time. Use the following matrix.
(a) Draw this WFST. Feel free to use some shorthand notation provided it is logical if this process seems tedious. We will call it Φ .
(b) Use the WFST, Φ, from part (a) and compose it, with the CTC WFST, which we will call T, from problem (1.) using the log-semiring, i.e., Φ 。T. Assume unweighted arcs all had a weight of 1. Recall that in the log-semiring, a b = - loge-a + e-b , a b = a + b, 1 = 0, and 0 = 1. Show each step of the composition by denoting states with the pairs of labels on states used in Φ and T, like (Φi , Tj ). Feel free to use any computational aid, i.e. a program or calculator, to help. Remember to remove any branches that don’t end in inal states. A inal state occurs when the both states in the pair of WFSTs are inal.
(c) Compare the resulting WFST with the CTC trellis computed in problem (1.f) and comment.
(d) Bonus: Use the forward algorithm on the result of part (b) to compute the CTC objective for this utterance. Again feel free to use any computational aid.