Markov Random Processes

In the examples of random processes that we have considered so far, particularly those examples of discrete random sequences, each value of the sequence was independent of the other sequence values that preceded it. This means that each value of the sequence is mutually independent of all other values. This is not always the case since sometimes successive values of a process or sequence are dependent. One discrete random sequence in this class that has particular significance is known as a Markov Process.

Definition

A Markov Process is a random process where the occurrence of an event is dependent only upon the immediately preceding event. Alternatively, a random process is a Markov Process if the future values of the process, given the present, are independent of the past. Markov processes can be used as models for both discrete-time and continuous-time signals. A Markov Chain Process is a random process that has an initial state and a finite or countably infinite set of possible states.

Some terminology

Markov processes or Markov chains are said to be in one of a set of mutually exclusive, collectively exhaustive states typically denoted as s1, s2, s3, etc... (it is important not to get this confused with the elements of a sample space).

For discrete processes, the transition from one state to another occurs at discrete time instants and is described by a set of transition probabilities.

Transition probabilities are conditional and the transition probabilities from the current state to each possible next state depend only on the current state.

An illustrative example

Without further definitions we can think about some familiar processes within this new framework. Suppose we have a signal that takes on only two possible values. Does this sound familiar yet? Let's say these values are +1 or -1 and the signal or process switches between these values at a regular sample time based on some event. We'll assume the process is a discrete-time process and we can define some probabilities that the signal or process will change from +1 to -1 or visa-versa.

State

Signal Value

P(transition to +1)

P(transition to -1)

Sum of Transition Probabilities

1

+1

a=0.3

1-a=0.7

1

2

-1

1-b=0.6

b=0.4

1

 

We can represent this process diagramatically using a state transition diagram or trellis diagram shown on the class handout. Basically, a and b are the probabilities that the process will stay in its current state. Similarly, (1-a) and (1-b) are the probabilities that the process will change state. To get a better handle on the ideas here we can simulate this Markov process in MATLAB. For those of you who are familiar with object-oriented programming ideas, this type of programming might be a better option for this type of process simulation. Using your simulation, briefly explore the effect of changing the values of the state transition probabilities a and b.

Conditional probabilities

Before we get too detailed here, it is appropriate for us to review what we mean by conditional probability since the transitions from state to state in a Markov process are conditioned by the current state. This takes us all the way back to chapter 2 of the text (Pgs. 28 - 30). Basically, we define conditional probability in terms of related events that somehow depend on one another. Given two events A and B, there are two conditional probabilities. (i) What is the probability that event B occurs if event A has already occured, and (ii) What is the probability that event A occurs given that event B has already occurred? These are expressed mathematically as,

 

Where P(AB) is the probability that events A and B occur simultaneously. If P(A|B) = P(A) and P(B|A) = P(B), then P(AB)=P(A)P(B) and the events are termed as mutually or statistically independent.

This idea of conditional probability is developed further in terms of probability density and distribution functions. In some applications there is more than one random variable and the probability density and distribution functions may be functions of each variable. The definition of the conditional cumulative probability distribution and the related probability density functions are defined in terms similar to that above.

 

In a similar manner, statistical independence means that the conditional distribution and density functions are equal to the marginal density functions. These can be calculated from the joint pdf (Pg. 64,65) and lead to the following important result.

 

Finally, the expected value of functions of more than one random variable are determined in the familiar way.

 

Several important expectations provide commonly used results.

 

Some familiar examples (I think?)

Example 1: A Sum or Counting Process

Consider a sum or counting process that sequentially adds values of a random variable. This type of process sequentially sums a sequence of independent identically distributed (iid) random variables. These could be of any type - exponential, Gaussian, uniform, etc..... The sum process is represented mathematically as,

 

This process is MARKOV since Sn depends on the "past", S1, S2, ....Sn-1, only through the previous value Sn-1. That is, Sn is independent of the past if Sn-1 is known. Probabalistically, this means that,

 

This is similar in some respects to the Poisson event counting process that we developed in the previous assignment although there we simply added 1 every time an event occured.

Example 2: A Moving Average Process

As an example of a process that is NOT Markov, consider the moving average of a Bernoulli Process or Bernoulli Sequence. The generation of this type of process was illustrated earlier so we'll just revisit it for a few moments.

The Bernoulli Random Process

The Bernoulli Random Process (a.k.a. Binary White Noise) falls into the class of discrete random sequences. The member functions of the random process, X(nT) assume only one of two values, typically +1 or -1 (but sometimes 1 and 0). The probability of +1 is denoted as p, and the probability of -1 is denoted as q=1-p. (i.e. P(1)=p, P(-1)=1-p) When p=q=0.5, the process is known as binary white noise.

From a computational perspective it is easy to model. We simply generate a uniformly distributed random variable, and then "switch" a random signal between +1 or -1 according to the magnitude of the uniform RV. If the RV is less than q, we make the signal values -1, if it is greater than q, we make the signal value +1. (i.e. The Transformation Method) The MATLAB code that generates this type of process can be found in our previous notes and is available on the ftp site as bernproc.m.

In the figure below, the first trace shows a binary white noise signal, a Bernoulli random process with p=q=0.5. The second trace shows the average formed from consecutive values of the Bernoulli RP. It is clear that the moving average process has ONLY three possible states {-1, 0, 1}, making a mathematical treatment relatively easy. To test whether the process satisfies the Markov requirement, we need to detemine whether the probability of any value of the moving average signal can be determined solely from the previous value.

 

Let's think about the moving average process mathematically for a few moments.

 

 

These types of processes are often used to numerically integrate or "smooth" measured data where the moving average would be calculated over more values of a signal. A similar process that would approximate differentiation would be to form a first difference rather than a sum.

If we first generate a Bernoulli random process with p=q=0.5, this means that the values of +1 and -1 occur with equal probability, namely 0.5 or 1/2. Suppose the sequence has the following pattern {.....-1, -1, 1, 1. ....}. To test for the Markov property, we must determine the probabilities associated with different values of Yn and show that knowledge of the previous value is either sufficient or insufficient to determine the probability of the next value. This mathematical argument follows.

Basically, there are three possible values for Y, namely -1, 0 and 1, that occur with the following probabilities.

 

Let's now consider the conditional probability that Yn takes on the particular value 1 after it is 0, that is, Yn=1 and Yn-1=0. The question we must answer is whether this conditional probability changes depending on the value of Yn-2? If so, simply knowing the previous value of Y is insufficient to specify the probability of the current value.

 

 

Example 3: A Random Telegraph Process

A random telegraph signal is a signal that switches between the values of +1 and -1. The switching times occur randomly and are exponentially distributed and the initial value is either +1 or -1 with equally probability. The number of switching events in any time interval then is Poisson.

 

We know something about the statistics of the process. For example, we can determine the probabilities P(+1) and P(-1), that is the likelihood of the signal being +1 and -1 at any specific time, by recognizing that +1 corresponds to an even number of events if X(0)=1 and -1 corresponds to an odd number of events. It is possible to show that,

 

To test whether the process is Markov is a simpler task. We need to determine whether the probability of the signal after an event is determined solely by the previous value of the signal. Intuititvely, this must be the case and the random telegraph signal is then a continuous-time Markov process. The state transition probabilities must be those shown below.

State

Signal Value

P(transition to +1)

P(transition to -1)

Sum of Transition Probabilities

1

+1

a=0

1-a=1

1

2

-1

1-b=1

b=0

1

 

State transition diagrams (trellis diagrams)

Markov processes are represented graphically using state transition diagrams like the one shown in the class handout. We need to establish some vocabulary so that we all know what we are talking about.

Some terminology

A state si is referred to as a transient state if there exists any other state to which the Markov process can move to but from which it cannot return.

A state si is referred to as a recurrent state if once the process exits the state it is always possible to return to that state.

A Markov process is referred to as ergodic if it is possible to go from every state to every other state. (For this to be the case, all elements of the state transition matrix must be non-zero.)

A Markov process or chain is said to be in equilibrium or steady-state when the probability of being in any particular state becomes a constant independent of the step or initial condition.

 

The mathematics

Transition probabilities

The state transitions in a Markov process are modeled as conditional probabilities. In general, these are written as

Where si = the current or initial state that occurs at n-1, sj = the next state that occurs at n, and Pij is the conditional probability of moving from state si to state sj. These probabilities are all independent of n.

The state transition probabilities are typically organized into matrix form as the transition probability matrix. Thus, for a three-state Markov process, the state transition probabilities are written as,

 

Here, P12 is the conditional probability that the process will move from state 1 to state 2. Likewise, P11 is the conditional probability that the system will stay in state 1 if it is already there. Thus all elements of T must be positive and less than or equal to 1.

One property of state transition matrices is that the sum of each row must be equal to one. This is because the probability of either staying put or moving on encompasses all possible transitions for any state.

Some calculations

Suppose the Markov process starts in state si, we can calculate the probability that after m-state transitions, the process will be in state sj using the expression,

 

This expression becomes complicated to use when the number of possible states k, is large. When this is not the case, the expression is manageable and somewhat intuitive. Let's look at example 5.28 for a few moments. Although we will not pursue the mathematical analysis of Markov processes or chains much further for now, it is important to recognize that this is possible.

 

Another example: A Wiener Random Process [Pg. 167-169]

A discrete Weiner random process is a random walk process with p=q=0.5, and is known as binary white noise. A Brownian motion process, aka a continuous Weiner process or a Weiner-Levy process, is a random walk where the interval between consecutive values of the random sequence approaches zero. The process is a Markov process since (a) the current value of the process depends on the previous value, and (b) the magnitude of the change in the process is Gaussian with the change being +ve or -ve with equal probability. The m-file for this simulation is available as weiner.m in the course ftp directory.

 

 

As we begin our next course topic we may use this type of process to synthesize random thermal noise in an electronic system.