Estimating Markov transition probabilities from sequence data

I have a full set of sequences (432 observations, to be precise) of 4 states $A-D$, e.g.:

$Y=\left(\begin{array}{c c c c c c c} A& C& D&D & B & A &C\\ B& A& A&C & A&- &-\\ \vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots\\ B& C& A&D & A & B & A\\ \end{array}\right)$

EDIT: The observation sequences are of unequal lengths! Does this change anything?

Is there a way of calculating the transition matrix $P_{ij}=P(Y_{t}=j \mid Y_{t-1}=i)$ in Matlab or R or similar? I think the HMM package might help. Any thoughts?

eg: Estimating Markov chain probabilities

r matlab markov-process

— HCAI

You have $4$ states: $S=\{1:=A,2:=B,3:=C,4:=D\}$. Let $n_{ij}$ be the number of times the chain made a transition from state $i$ to state $j$, for $i,j=1,2,3,4$. Compute the $n_{ij}$'s from your sample and estimate the transition matrix $(p_{ij})$ by maximum likelihood using the estimates $\hat{p}_{ij}=n_{ij}/\sum_{k=1}^4 n_{ik}$.

— Zen
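
A brief sketch of why this ratio is the maximum likelihood estimate, conditioning on the first observed state. The row total $n_{i\cdot}=\sum_{k=1}^{4} n_{ik}$ and the multipliers $\lambda_i$ are notation introduced here for illustration. The log-likelihood of the observed transitions is

$$\ell(P)=\sum_{i=1}^{4}\sum_{j=1}^{4} n_{ij}\log p_{ij}, \qquad \text{subject to } \sum_{j=1}^{4} p_{ij}=1 \text{ for each } i.$$

Adding a Lagrange multiplier $\lambda_i$ for each row constraint and setting the derivative with respect to $p_{ij}$ to zero gives

$$\frac{n_{ij}}{p_{ij}}-\lambda_i=0 \;\Rightarrow\; p_{ij}=\frac{n_{ij}}{\lambda_i}, \qquad \sum_{j=1}^{4} p_{ij}=1 \;\Rightarrow\; \lambda_i=n_{i\cdot}, \qquad \hat{p}_{ij}=\frac{n_{ij}}{n_{i\cdot}}.$$

This assumes state $i$ is left at least once in the data ($n_{i\cdot}>0$); otherwise row $i$ cannot be estimated from the sample.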

These notes derive the MLE estimates: stat.cmu.edu/~cshalizi/462/lectures/06/markov-mle.pdf

— Zen

Similar question: stats.stackexchange.com/questions/26722/…

— B_Miner

@B_Miner could you write your code in pseudo-code form for me? Or explain it in lay terms... However I see it works in my R console.

— HCAI

I have a question: I understand your implementation and it looks fine to me, but I was wondering why I can't simply use the Matlab hmmestimate function to compute the T matrix. Something like: states=[1,2,3,4]; [T,E] = hmmestimate(x, states); where T is the transition matrix I'm interested in. I'm new to Markov chains and HMM, so I'd like to understand the difference between the two implementations (if there is any).

— Any

Answers:

Please, check the comments above. Here is a quick implementation in R.

x <- c(1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3)
p <- matrix(nrow = 4, ncol = 4, 0)
for (t in 1:(length(x) - 1)) p[x[t], x[t + 1]] <- p[x[t], x[t + 1]] + 1
for (i in 1:4) p[i, ] <- p[i, ] / sum(p[i, ])

Results:

> p
          [,1]      [,2]      [,3]      [,4]
[1,] 0.1666667 0.3333333 0.3333333 0.1666667
[2,] 0.2000000 0.2000000 0.4000000 0.2000000
[3,] 0.1428571 0.1428571 0.2857143 0.4285714
[4,] 0.2500000 0.1250000 0.2500000 0.3750000

A (probably dumb) implementation in MATLAB (which I have never used, so I don't know if this is going to work. I've just googled "declare vector matrix MATLAB" to get the syntax):

x = [ 1, 2, 1, 1, 3, 4, 4, 1, 2, 4, 1, 4, 3, 4, 4, 4, 3, 1, 3, 2, 3, 3, 3, 4, 2, 2, 3 ];
n = length(x) - 1;
p = zeros(4,4);
% count the transitions from x(t) to x(t + 1)
for t = 1:n
  p(x(t), x(t + 1)) = p(x(t), x(t + 1)) + 1;
end
% normalize each row so that it sums to 1
for i = 1:4
  p(i, :) = p(i, :) / sum(p(i, :));
end
p

— Zen
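
For readers working in Python, the same count-and-normalize estimate can be sketched as follows. This is only an illustration: the function name `estimate_transition_matrix` is made up here, states are assumed to be coded 1 to 4 as in the R example, and exact fractions are used so the rows are easy to compare with the printed R output.

```python
from fractions import Fraction

def estimate_transition_matrix(x, n_states=4):
    """MLE of the transition matrix from one sequence of states coded 1..n_states."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Count each observed transition x[t] -> x[t + 1].
    for a, b in zip(x, x[1:]):
        counts[a - 1][b - 1] += 1
    # Divide each row by its total; a state that is never left keeps a row of zeros.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([Fraction(c, total) if total else Fraction(0) for c in row])
    return matrix

x = [1, 2, 1, 1, 3, 4, 4, 1, 2, 4, 1, 4, 3, 4, 4, 4, 3, 1, 3, 2, 3, 3, 3, 4, 2, 2, 3]
p = estimate_transition_matrix(x)
for row in p:
    print([f"{float(v):.7f}" for v in row])
```

The first row comes out as 1/6, 1/3, 1/3, 1/6, which agrees with the first row of the R result.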

Looks great! I'm not sure what the 3rd line does in your code though (mainly because I'm familiar with Matlab). Any chance you could write it in matlab or pseudo-code? I'd be much obliged.

— HCAI

The third line does this: the chain values are $x_1,\dots,x_n$. For $t=1,\dots,n-1$, increment $p_{x_t,x_{t+1}}$.

— Zen

The fourth line normalizes each row of the matrix $(p_{ij})$, so that every row sums to 1.

— Zen

Bear with my slowness here. I do appreciate the MATLAB code translation, although I still can't see what it's attempting to do in your first for loop. Is the 3rd line from the original code counting the number of times $x$ goes from state $x_i$ to state $x_j$? If you could say it in words I'd appreciate that a lot. Cheers

— HCAI

No, $x$ is just one row. Don't concatenate, because you will introduce "false" transitions: last state of one row $\to$ first state of the next row. You have to change the code to loop through the rows of your matrix and count the transitions. At the end, normalize each row of the transition matrix.

— Zen
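
A minimal Python sketch of that advice, pooling the counts over sequences of unequal length without joining them. The function name `pooled_transition_matrix` is hypothetical, the three example sequences are the rows visible in the question's matrix $Y$, and the sketch assumes every sequence comes from the same chain.

```python
def pooled_transition_matrix(sequences, states):
    """Pool transition counts over several sequences, then normalize each row."""
    index = {s: k for k, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for seq in sequences:
        # Pair each state with its successor inside this sequence only,
        # so no transition is counted across sequence boundaries.
        for a, b in zip(seq, seq[1:]):
            counts[index[a]][index[b]] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # A state that is never left in the data keeps a row of zeros.
        probs.append([c / total if total else 0.0 for c in row])
    return counts, probs

sequences = [
    "ACDDBAC",  # 7 observations
    "BAACA",    # 5 observations (a shorter sequence)
    "BCADABA",  # 7 observations
]
counts, probs = pooled_transition_matrix(sequences, states="ABCD")
for state, row in zip("ABCD", probs):
    print(state, [round(v, 3) for v in row])
```

Concatenating the first two sequences would add a spurious C to B transition (end of the first sequence, start of the second); here that count stays at zero.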

Here is my implementation in R

x <- c(1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3)
xChar<-as.character(x)
library(markovchain)
mcX<-markovchainFit(xChar)$estimate
mcX

— Giorgio Spedicato

user32041's request (posted as an edit instead of a comment since he/she lacks reputation): How can I coerce the transitionMatrix of the markovchainFit result to a data.frame?

— chl

You can convert to a data.frame using as(mcX, "data.frame")

— Giorgio Spedicato

@GiorgioSpedicato could you please comment on how to deal with sequences of unequal lengths in your package? (I cannot concatenate them.)

— HCAI

@HCAI, please see pages 35-36 of the current vignette

— Giorgio Spedicato

@GiorgioSpedicato thank you for the reference cran.r-project.org/web/packages/markovchain/vignettes/…. I still have n transition matrices, one for each sequence. What I’m after is one general one that takes into account all the sequence observations. Is there something I’m missing?

— HCAI
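
One way to relate the per-sequence matrices to a single overall estimate is through pooled counts. As a sketch, assuming all sequences follow the same chain, write $n^{(s)}_{ij}$ for the number of $i \to j$ transitions in sequence $s$ and $n^{(s)}_{i\cdot}=\sum_k n^{(s)}_{ik}$ for its row totals (notation introduced here for illustration). Then

$$\hat{p}_{ij}=\frac{\sum_s n^{(s)}_{ij}}{\sum_s n^{(s)}_{i\cdot}}=\sum_s w^{(s)}_i\,\hat{p}^{(s)}_{ij}, \qquad w^{(s)}_i=\frac{n^{(s)}_{i\cdot}}{\sum_r n^{(r)}_{i\cdot}}, \qquad \hat{p}^{(s)}_{ij}=\frac{n^{(s)}_{ij}}{n^{(s)}_{i\cdot}},$$

where sequences that never leave state $i$ get weight zero for row $i$. The pooled estimate is therefore a weighted average of the per-sequence rows, with weights proportional to how often each sequence leaves state $i$; a plain unweighted average of the $n$ matrices generally gives a different answer.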

Here is a way to do it in Matlab:

x = [1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3];
counts_mat = full(sparse(x(1:end-1),x(2:end),1));
trans_mat = bsxfun(@rdivide,counts_mat,sum(counts_mat,2))

Acknowledgement owed to SomptingGuy: http://www.eng-tips.com/viewthread.cfm?qid=236532

— John