Suppose you are modeling text with a HMM, What is the complexity of finding most the probable sequence of tags or states from a sequence of text using brute force algorithm?

  • Assume there are total K states and let T be the length of the largest sequence.
  • Think how we generate text using an hMM. We first have a state sequence and from each state we emit an output. From each state, any word out of T possible outcomes can be generated. Since there are K states, at each possible state in the sequence, there are T choices of generating a word. Hence, there are T^{K} possible sequences
  • Therefore, complexity using any brute force algorithm is O(T^{K}) as we would explore probability of each sequence to find the maximum for most probable one. This is clearly not scalable. 
  • In practice, this problem is solved using the viterbi algorithm using dynamic programming.

Leave a Reply

Your email address will not be published. Required fields are marked *