python - Error in my implementation of the Baum-Welch algorithm
I am having trouble implementing a discrete HMM in Python using NumPy. I have been following a guide, and in addition I implemented the log-sum-exp trick (it is not used in training, because I wasn't sure how the trick applies to that process; I only implemented it for the forwards and backwards probabilities).
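(By the log-sum-exp trick I mean the usual stabilization of log(sum(exp(v))); a minimal sketch of the idea, not my exact code:)

import numpy as np

def log_sum_exp(log_values):
    # Shift by the maximum so the largest term exponentiates to 1,
    # which avoids overflow and limits underflow.
    m = np.max(log_values)
    if np.isneginf(m):
        return -np.inf  # every term is zero in linear space
    return m + np.log(np.sum(np.exp(log_values - m)))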
I believe my forwards and backwards probability calculations are correct, because I verified the results against the "toy" data used in this explanation of the forwards-backwards algorithm.
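(For reference, the forward recursion I verified is the standard one; a minimal NumPy sketch, where pi, a, and b are hypothetical names for the initial distribution, transition matrix, and emission matrix — my actual calc_forwards is structured differently:)

def forward(x, pi, a, b):
    # alphas[t, i] = P(x_0, ..., x_t, s_t = i)
    T, N = len(x), a.shape[0]
    alphas = np.zeros((T, N), dtype=np.float64)
    alphas[0] = pi * b[:, x[0]]
    for t in range(1, T):
        # propagate through the transition matrix, then weight by the
        # probability of emitting the observed symbol
        alphas[t] = (alphas[t - 1] @ a) * b[:, x[t]]
    return alphas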
When I use Baum-Welch to train the transition and emission matrices (a and b in the code, respectively), I run into trouble. Specifically, I have to set the Laplace smoothing parameter to a ridiculously low value. If I set it to 0, the emission and transition matrices fill with NaNs after a few iterations (normally preceded by a "divide by zero encountered" warning from numpy.log). If I set the Laplace smoothing parameter high, the transition and emission matrices do not change at all between iterations (the quick numeric check after the code listing below illustrates why). When an appropriate Laplace smoothing value is chosen that avoids both of these issues, the HMM still does not converge: the probability of the training set decreases at each iteration, hops up to a higher probability (still a small one), and then continues to decrease again. Also, the transition matrix a ends up filled with one and the same value in every entry. If I do not update the emission matrix b, I do not run into any of these issues: the probability of the training set increases at each training iteration, the transition matrix does not fill with NaNs, and it does not collapse to a single repeated value. As such, I assume the issue is somewhere in the function step_b(). Here is the relevant code; I have to choose a Laplace smoothing value of 10^-300 to avoid the problems mentioned above with a training sequence of length 100:
def train(self, max_iter):
    for iter in range(0, max_iter):
        self.train_step()

def train_step(self):
    # E-step: forwards/backwards passes, then the pairwise state marginals
    alphas = self.calc_forwards(self.x, as_log=False)
    betas = self.calc_backwards(self.x, as_log=False)
    gammas = self.calc_gammas(self.x, alphas, betas)
    # M-step: re-estimate the transition and emission matrices
    self.a = self.step_a(gammas)
    self.b = self.step_b(gammas, self.x)
    prob_of_x = self.probability_of_sequence(self.x)
    print("prob of x: ", prob_of_x)

def step_a(self, gammas):
    a_new = np.zeros(self.a.shape, dtype=np.float64)
    for i in range(0, a_new.shape[0]):
        for j in range(0, a_new.shape[1]):
            a_new_numerator = np.sum(gammas[:, i, j])
            a_new_denominator = np.sum(np.sum(gammas[:, i, :], axis=0))
            a_new[i, j] = (a_new_numerator + self.laplace_smooth_param) / \
                (a_new_denominator + self.laplace_smooth_param * self.num_states)
    return a_new

def step_b(self, gammas, x):
    b_new = np.zeros(self.b.shape, dtype=np.float64)
    for j in range(0, b_new.shape[0]):
        for k in range(0, b_new.shape[1]):
            # 1 where the observation equals symbol k, 0 elsewhere
            numerator_sum_coefficients = (x == k).astype(np.int64)
            b_new_numerator = np.sum(np.sum(
                gammas[:, :, j] * numerator_sum_coefficients[:, np.newaxis], axis=0))
            b_new_denominator = np.sum(np.sum(gammas[:, :, j], axis=0))
            b_new[j, k] = (b_new_numerator + self.laplace_smooth_param) / \
                (b_new_denominator + self.laplace_smooth_param * self.num_observables)
    return b_new

def calc_gammas(self, x, alphas, betas):
    # gammas[t, i, j] = alpha_t(i) * a[i, j] * b[j, x_t] * beta_{t+1}(j)
    gammas = np.zeros((alphas.shape[0],) + self.a.shape, dtype=np.float64)
    for t in range(0, gammas.shape[0]):
        for i in range(0, gammas.shape[1]):
            for j in range(0, gammas.shape[2]):
                gammas[t, i, j] = alphas[t, i] * self.a[i, j] * \
                    self.b[j, x[t]] * betas[t + 1, j]
    return gammas
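The quick numeric check I mentioned above: every smoothed update in step_a() and step_b() has the form (n + lam) / (d + lam * K), which tends to the uniform value 1/K as lam grows, so a large smoothing parameter drowns out the data and the matrices stop moving. A standalone illustration (the count n, total d, and number of categories K here are made up):

n, d, K = 3.0, 10.0, 4
for lam in [0.0, 1e-3, 1.0, 1e6]:
    print(lam, (n + lam) / (d + lam * K))
# lam = 0.0 -> 0.30          (pure maximum-likelihood estimate)
# lam = 1e6 -> ~0.25 = 1/K   (uniform; the data no longer matters)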