Hidden Markov Model Example (PPT Slides)

Dr. Benjamin Clark, United States, Teacher
Published Date: 21-07-2017
Markov Processes and Hidden Markov Models (HMMs)
www.ThesisScientist.com

Motivation
• The Bayes nets we considered so far were static: they referred to a single point in time
  – E.g., medical diagnosis
• An agent needs to model how the world evolves
  – Speech recognition software needs to process speech over time
  – An artificially intelligent software assistant needs to keep track of the user's intentions over time

Markov processes
• We have time periods t = 0, 1, 2, …
• In each period t, the world is in a certain state S_t
• The Markov assumption: given S_t, S_{t+1} is independent of all S_i with i < t
  – P(S_{t+1} | S_1, S_2, …, S_t) = P(S_{t+1} | S_t)
  – Given the current state, history tells us nothing more about the future

  S_1 → S_2 → S_3 → … → S_t

• Typically, all the CPTs are the same:
  for all t, P(S_{t+1} = j | S_t = i) = a_ij (stationarity assumption)

Weather example
• S_t is one of s, c, r (sun, cloudy, rain)
• Transition probabilities a_ij (shown on the slide as a state-transition diagram — note that this is not a Bayes net):

           to s   to c   to r
  from s    .6     .3     .1
  from c    .4     .3     .3
  from r    .2     .5     .3

• We also need to specify an initial distribution P(S_0)
• Throughout, assume P(S_0 = s) = 1

Weather example, continued
• What is the probability that it rains two days from now, P(S_2 = r)?
• P(S_2 = r) = P(S_2 = r, S_1 = r) + P(S_2 = r, S_1 = s) + P(S_2 = r, S_1 = c)
             = .1·.3 + .6·.1 + .3·.3 = .18

Weather example, continued
• What is the probability that it rains three days from now?
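As an aside, the two-day computation above can be checked numerically, and the same brute-force enumeration extends to three days. A minimal Python sketch (the transition table and initial state are from the slides; the function name is just illustrative):

```python
from itertools import product

# Transition probabilities a[i][j] = P(S_{t+1} = j | S_t = i), from the slides
a = {
    's': {'s': .6, 'c': .3, 'r': .1},
    'c': {'s': .4, 'c': .3, 'r': .3},
    'r': {'s': .2, 'c': .5, 'r': .3},
}
states = ['s', 'c', 'r']

def prob_brute_force(n, target='r', start='s'):
    """P(S_n = target | S_0 = start), summing over all 3^(n-1) intermediate paths."""
    total = 0.0
    # Enumerate every assignment of states to periods 1..n-1
    for path in product(states, repeat=n - 1):
        full_path = (start,) + path + (target,)
        p = 1.0
        for prev, nxt in zip(full_path, full_path[1:]):
            p *= a[prev][nxt]
        total += p
    return total

print(prob_brute_force(2))  # P(S_2 = r) ≈ .18, matching the slide
print(prob_brute_force(3))  # P(S_3 = r), by summing over 9 paths
```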
• Computationally inefficient way:
  P(S_3 = r) = P(S_3 = r, S_2 = r, S_1 = r) + P(S_3 = r, S_2 = r, S_1 = s) + …
• For n periods into the future, we would need to sum over 3^(n-1) paths

Weather example, continued
• More efficient:
  P(S_3 = r) = P(S_3 = r, S_2 = r) + P(S_3 = r, S_2 = s) + P(S_3 = r, S_2 = c)
             = P(S_3 = r | S_2 = r)P(S_2 = r) + P(S_3 = r | S_2 = s)P(S_2 = s) + P(S_3 = r | S_2 = c)P(S_2 = c)
• The only hard part: figure out P(S_2)
• Main idea: compute the distribution P(S_1), then P(S_2), then P(S_3)
• Linear in the number of periods (example on board)

Stationary distributions
• As t goes to infinity, "generally," the distribution P(S_t) will converge to a stationary distribution
• A distribution given by probabilities π_i (where i is a state) is stationary if:
  P(S_t = i) = π_i implies P(S_{t+1} = i) = π_i
• Of course, P(S_{t+1} = i) = Σ_j P(S_{t+1} = i, S_t = j) = Σ_j P(S_t = j) a_ji
• So, the stationary distribution is defined by
  π_i = Σ_j π_j a_ji

Computing the stationary distribution
• π_s = .6 π_s + .4 π_c + .2 π_r
• π_c = .3 π_s + .3 π_c + .5 π_r
• π_r = .1 π_s + .3 π_c + .3 π_r
• (solved together with the normalization constraint π_s + π_c + π_r = 1)

Restrictiveness of Markov models
• Are past and future really independent given the current state?
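The linear-time idea above — propagate the distribution one period at a time — also gives a way to approximate the stationary distribution: just keep iterating. A minimal Python sketch (transition table from the slides; everything else is illustrative):

```python
# a[i][j] = P(S_{t+1} = j | S_t = i), transition table from the slides
a = {
    's': {'s': .6, 'c': .3, 'r': .1},
    'c': {'s': .4, 'c': .3, 'r': .3},
    'r': {'s': .2, 'c': .5, 'r': .3},
}
states = ['s', 'c', 'r']

def step(dist):
    """One period of the update P(S_{t+1} = j) = sum_i P(S_t = i) a_ij."""
    return {j: sum(dist[i] * a[i][j] for i in states) for j in states}

# Start from P(S_0 = s) = 1 and propagate forward: linear in the number of periods
dist = {'s': 1.0, 'c': 0.0, 'r': 0.0}
for t in range(1, 4):
    dist = step(dist)
    print(t, dist)  # t = 2 gives P(S_2 = r) ≈ .18

# Iterating many more steps approximates the stationary distribution pi,
# which then (approximately) satisfies pi_i = sum_j pi_j a_ji
pi = dist
for _ in range(1000):
    pi = step(pi)
print(pi)
```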
• E.g., suppose that when it rains, it rains for at most 2 days

  S_1 → S_2 → S_3 → S_4 → …

• That is a second-order Markov process
• Workaround: change the meaning of "state" to the events of the last 2 days

  (S_1, S_2) → (S_2, S_3) → (S_3, S_4) → (S_4, S_5) → …

• Another approach: add more information to the state
  – E.g., the full state of the world would include whether the sky is full of water
  – Additional information may not be observable
  – Blowup of the number of states

Hidden Markov models (HMMs)
• Same as a Markov model, except we cannot see the state
• Instead, we only see an observation each period, which depends on the current state

  S_1 → S_2 → S_3 → … → S_t
   ↓     ↓     ↓         ↓
  O_1   O_2   O_3   …   O_t

• Still need a transition model: P(S_{t+1} = j | S_t = i) = a_ij
• Also need an observation model: P(O_t = k | S_t = i) = b_ik

Weather example extended to an HMM
• Transition probabilities: as before
• Observation: labmate wet or dry
• b_sw = .1, b_cw = .3, b_rw = .8

HMM weather example: a question
• You have been stuck in the lab for three days
• On those days, your labmate was dry, wet, wet, respectively
• What is the probability that it is now raining outside?
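Before answering, the HMM just described can be written out as data and sampled from, which makes the hidden-state/visible-observation split concrete: the agent sees only the second returned list. A minimal sketch (transition and wet probabilities from the slides; dry probabilities are taken as the complements, and the sampler is purely illustrative):

```python
import random

# Transition model a_ij and observation model b_ik from the slides.
# Observations: 'w' (labmate wet) or 'd' (dry); b_id = 1 - b_iw.
a = {
    's': {'s': .6, 'c': .3, 'r': .1},
    'c': {'s': .4, 'c': .3, 'r': .3},
    'r': {'s': .2, 'c': .5, 'r': .3},
}
b = {'s': {'w': .1, 'd': .9},
     'c': {'w': .3, 'd': .7},
     'r': {'w': .8, 'd': .2}}

def sample(dist):
    """Draw one outcome from a {value: probability} dict."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value  # guard against floating-point round-off

def simulate(n, start='s'):
    """Sample n periods; return the hidden states and the observations we would see."""
    state, states_seq, obs_seq = start, [], []
    for _ in range(n):
        states_seq.append(state)
        obs_seq.append(sample(b[state]))
        state = sample(a[state])
    return states_seq, obs_seq

states_seq, obs_seq = simulate(5)
print(states_seq)  # hidden from the agent
print(obs_seq)     # all the agent gets to see
```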
• P(S_2 = r | O_0 = d, O_1 = w, O_2 = w)
• By Bayes' rule, we really want to know P(S_2, O_0 = d, O_1 = w, O_2 = w)

Solving the question
• Computationally efficient approach: first compute P(S_1 = i, O_0 = d, O_1 = w) for all states i
• General case: solve for P(S_t, O_0 = o_0, O_1 = o_1, …, O_t = o_t) for t = 1, then t = 2, …
  This is called monitoring
• P(S_t, O_0 = o_0, …, O_t = o_t)
  = Σ_{s_{t-1}} P(S_{t-1} = s_{t-1}, O_0 = o_0, …, O_{t-1} = o_{t-1}) P(S_t | S_{t-1} = s_{t-1}) P(O_t = o_t | S_t)

Predicting further out
• You have been stuck in the lab for three days
• On those days, your labmate was dry, wet, wet, respectively
• What is the probability that two days from now it will be raining outside?
• P(S_4 = r | O_0 = d, O_1 = w, O_2 = w)

Predicting further out, continued
• Want to know: P(S_4 = r | O_0 = d, O_1 = w, O_2 = w)
• We already know how to get: P(S_2 | O_0 = d, O_1 = w, O_2 = w)
• P(S_3 = r | O_0 = d, O_1 = w, O_2 = w)
  = Σ_{s_2} P(S_3 = r, S_2 = s_2 | O_0 = d, O_1 = w, O_2 = w)
  = Σ_{s_2} P(S_3 = r | S_2 = s_2) P(S_2 = s_2 | O_0 = d, O_1 = w, O_2 = w)
• Etc. for S_4
• So: monitoring first, then straightforward Markov process updates

Integrating newer information
• You have been stuck in the lab for four days
• On those days, your labmate was dry, wet, wet, dry, respectively
• What is the probability that two days ago it was raining outside?
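Stepping back, the monitoring recursion above is exactly the forward algorithm, and the "monitoring first, then Markov updates" recipe can be run numerically on the dry/wet/wet evidence. A Python sketch (model parameters from the slides, dry probabilities taken as complements of the wet ones):

```python
# Model from the slides: transitions a_ij, observations b_ik (b_id = 1 - b_iw)
a = {
    's': {'s': .6, 'c': .3, 'r': .1},
    'c': {'s': .4, 'c': .3, 'r': .3},
    'r': {'s': .2, 'c': .5, 'r': .3},
}
b = {'s': {'w': .1, 'd': .9},
     'c': {'w': .3, 'd': .7},
     'r': {'w': .8, 'd': .2}}
states = ['s', 'c', 'r']

def monitor(observations):
    """Return alpha_t(i) = P(S_t = i, O_0 = o_0, ..., O_t = o_t), via the recursion above.

    Initial distribution P(S_0 = s) = 1, as assumed throughout the slides.
    """
    alpha = {i: (1.0 if i == 's' else 0.0) * b[i][observations[0]] for i in states}
    for o in observations[1:]:
        alpha = {j: sum(alpha[i] * a[i][j] for i in states) * b[j][o] for j in states}
    return alpha

alpha = monitor(['d', 'w', 'w'])
# Condition on the evidence by normalizing: P(S_2 = i | O_0 = d, O_1 = w, O_2 = w)
z = sum(alpha.values())
posterior = {i: alpha[i] / z for i in states}
print(posterior['r'])  # probability it is raining now, roughly 0.57

# Predicting further out: plain Markov updates on the normalized posterior
pred = posterior
for _ in range(2):  # from t = 2 to t = 4
    pred = {j: sum(pred[i] * a[i][j] for i in states) for j in states}
print(pred['r'])  # P(S_4 = r | O_0 = d, O_1 = w, O_2 = w)
```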
• P(S_1 = r | O_0 = d, O_1 = w, O_2 = w, O_3 = d)
  – This is the smoothing or hindsight problem

Hindsight problem, continued
• Want: P(S_1 = r | O_0 = d, O_1 = w, O_2 = w, O_3 = d)
• "Partial" application of Bayes' rule:
  P(S_1 = r | O_0 = d, O_1 = w, O_2 = w, O_3 = d)
  = P(S_1 = r, O_2 = w, O_3 = d | O_0 = d, O_1 = w) / P(O_2 = w, O_3 = d | O_0 = d, O_1 = w)
• So we really want to know P(S_1, O_2 = w, O_3 = d | O_0 = d, O_1 = w)

Hindsight problem, continued
• Want to know P(S_1 = r, O_2 = w, O_3 = d | O_0 = d, O_1 = w)
• P(S_1 = r, O_2 = w, O_3 = d | O_0 = d, O_1 = w)
  = P(S_1 = r | O_0 = d, O_1 = w) P(O_2 = w, O_3 = d | S_1 = r)
• We already know how to compute P(S_1 = r | O_0 = d, O_1 = w)
• We just need to compute P(O_2 = w, O_3 = d | S_1 = r)

Hindsight problem, continued
• We just need to compute P(O_2 = w, O_3 = d | S_1 = r)
• P(O_2 = w, O_3 = d | S_1 = r)
  = Σ_{s_2} P(S_2 = s_2, O_2 = w, O_3 = d | S_1 = r)
  = Σ_{s_2} P(S_2 = s_2 | S_1 = r) P(O_2 = w | S_2 = s_2) P(O_3 = d | S_2 = s_2)
• The first two factors are directly in the model; the last factor is a "smaller" problem of the same kind
• Use dynamic programming, working backwards from the future
  – Similar to the forwards approach from the past
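The backwards dynamic program on these last slides can be sketched the same way: a `backward` pass computes P(future observations | S_t = i), and combining it with the forward pass answers the hindsight question. A minimal Python sketch (model parameters from the slides, dry probabilities taken as complements of the wet ones; function names are illustrative):

```python
# Model from the slides: transitions a_ij, observations b_ik (b_id = 1 - b_iw)
a = {
    's': {'s': .6, 'c': .3, 'r': .1},
    'c': {'s': .4, 'c': .3, 'r': .3},
    'r': {'s': .2, 'c': .5, 'r': .3},
}
b = {'s': {'w': .1, 'd': .9},
     'c': {'w': .3, 'd': .7},
     'r': {'w': .8, 'd': .2}}
states = ['s', 'c', 'r']

def forward(observations):
    """alpha_t(i) = P(S_t = i, o_0, ..., o_t), with P(S_0 = s) = 1 as in the slides."""
    alpha = {i: (1.0 if i == 's' else 0.0) * b[i][observations[0]] for i in states}
    for o in observations[1:]:
        alpha = {j: sum(alpha[i] * a[i][j] for i in states) * b[j][o] for j in states}
    return alpha

def backward(future_obs):
    """beta_t(i) = P(future observations | S_t = i), computed backwards from the future."""
    beta = {i: 1.0 for i in states}
    for o in reversed(future_obs):
        beta = {i: sum(a[i][j] * b[j][o] * beta[j] for j in states) for i in states}
    return beta

# P(S_1 = r | O_0=d, O_1=w, O_2=w, O_3=d): forward up to t = 1, backward over t = 2, 3
alpha1 = forward(['d', 'w'])   # P(S_1 = i, O_0 = d, O_1 = w)
beta1 = backward(['w', 'd'])   # P(O_2 = w, O_3 = d | S_1 = i)
joint = {i: alpha1[i] * beta1[i] for i in states}
posterior = {i: joint[i] / sum(joint.values()) for i in states}
print(posterior['r'])  # probability it was raining on day 1, roughly 0.40
```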