Estimating Markov transition probabilities from sequence data


16

I have a full set of sequences (432 observations, to be precise) of 4 states, A to D, e.g.:

Y=(ACDDBACBAACABCADABA)

EDIT: The observation sequences are of unequal lengths! Does this change anything?

Is there a way of calculating the transition matrix

$P_{ij}(Y_t = j \mid Y_{t-1} = i)$
in Matlab or R or similar? I think the HMM package might help. Any thoughts?

eg: Estimating Markov chain probabilities


3
You have 4 states: $S=\{1:=A, 2:=B, 3:=C, 4:=D\}$. Let $n_{ij}$ be the number of times the chain made a transition from state $i$ to state $j$, for $i,j=1,2,3,4$. Compute the $n_{ij}$'s from your sample and estimate the transition matrix $(p_{ij})$ by maximum likelihood using the estimates $\hat{p}_{ij} = n_{ij} / \sum_{j=1}^{4} n_{ij}$.
Zen

These notes derive the MLE estimates: stat.cmu.edu/~cshalizi/462/lectures/06/markov-mle.pdf
Zen
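
For readers who do not want to open the notes, the estimator in the comment above follows from a short constrained maximization. This is only a sketch, assuming a single observed chain whose first state is treated as fixed; the Lagrange multipliers $\lambda_i$ and the dummy index $k$ are introduced here for illustration.

```latex
\begin{aligned}
\ell(p) &= \sum_{i=1}^{4}\sum_{k=1}^{4} n_{ik}\,\log p_{ik},
  \qquad \text{subject to } \sum_{k=1}^{4} p_{ik} = 1 \text{ for each } i, \\
0 &= \frac{\partial}{\partial p_{ij}}
  \Big[\ell(p) - \sum_{i=1}^{4}\lambda_i\Big(\sum_{k=1}^{4} p_{ik} - 1\Big)\Big]
  = \frac{n_{ij}}{p_{ij}} - \lambda_i
  \;\Longrightarrow\; p_{ij} = \frac{n_{ij}}{\lambda_i}, \\
1 &= \sum_{k=1}^{4} p_{ik} = \frac{1}{\lambda_i}\sum_{k=1}^{4} n_{ik}
  \;\Longrightarrow\; \lambda_i = \sum_{k=1}^{4} n_{ik}
  \;\Longrightarrow\; \hat{p}_{ij} = \frac{n_{ij}}{\sum_{k=1}^{4} n_{ik}}.
\end{aligned}
```

The denominator is the total number of transitions out of state $i$, i.e. the row total of the count matrix, which is the same quantity that appears in the comment's formula.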


@B_Miner could you write your code in pseudo-code form for me? Or explain it in lay terms... However I see it works in my R console.
HCAI

I have a question: I understand your implementation and it looks fine to me, but I was wondering why I can't simply use the Matlab hmmestimate function to compute the T matrix? Something like: states=[1,2,3,4]; [T,E] = hmmestimate(x, states); where T is the transition matrix I'm interested in. I'm new to Markov chains and HMM so I'd like to understand the difference between the two implementations (if there is any).
Any

Answers:


18

Please, check the comments above. Here is a quick implementation in R.

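x <- c(1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3)
p <- matrix(nrow = 4, ncol = 4, 0)  # p[i, j] first holds the count of transitions i -> j
for (t in 1:(length(x) - 1)) p[x[t], x[t + 1]] <- p[x[t], x[t + 1]] + 1  # count each transition
for (i in 1:4) p[i, ] <- p[i, ] / sum(p[i, ])  # normalize each row to sum to 1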

Results:

> p
          [,1]      [,2]      [,3]      [,4]
[1,] 0.1666667 0.3333333 0.3333333 0.1666667
[2,] 0.2000000 0.2000000 0.4000000 0.2000000
[3,] 0.1428571 0.1428571 0.2857143 0.4285714
[4,] 0.2500000 0.1250000 0.2500000 0.3750000
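
The same matrix can also be obtained without an explicit loop. The following is a minimal sketch in base R, assuming the states are coded 1 to 4 as above; the names `from`, `to`, `counts` and `p_hat` are chosen here for illustration and are not part of the original answer.

```r
x <- c(1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3)
states <- 1:4

# Pair each state with its successor. Fixing the factor levels keeps the
# table 4 x 4 even if some state never shows up in one of the two positions.
from <- factor(head(x, -1), levels = states)
to   <- factor(tail(x, -1), levels = states)

counts <- table(from, to)                 # n_ij: number of transitions i -> j
p_hat  <- prop.table(counts, margin = 1)  # divide each row by its row total
round(p_hat, 4)
```

A state that is never left has a row total of zero, so its row would come out as NaN; how to handle that case is a modelling decision and is left open here.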

A (probably dumb) implementation in MATLAB (which I have never used, so I don't know if this is going to work. I've just googled "declare vector matrix MATLAB" to get the syntax):

x = [ 1, 2, 1, 1, 3, 4, 4, 1, 2, 4, 1, 4, 3, 4, 4, 4, 3, 1, 3, 2, 3, 3, 3, 4, 2, 2, 3 ];
n = length(x) - 1;
p = zeros(4,4);
for t = 1:n
  p(x(t), x(t + 1)) = p(x(t), x(t + 1)) + 1;  % count the transition x(t) -> x(t+1)
end
for i = 1:4
  p(i, :) = p(i, :) / sum(p(i, :));  % normalize row i to sum to 1
end
p

Looks great! I'm not sure what the 3rd line does in your code though (mainly because I'm only familiar with MATLAB). Any chance you could write it in MATLAB or pseudo-code? I'd be much obliged.
HCAI

2
The third line does this: the chain values are $x_1, \dots, x_n$. For $t = 1, \dots, n-1$, increment $p_{x_t, x_{t+1}}$.
Zen

The fourth line normalizes each line of the matrix $(p_{ij})$.
Zen
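
To see what the two loops do, it may help to trace them on a tiny chain. This is an illustrative sketch with a made-up two-state sequence (not the data from the question); `x_small`, `counts` and `p_small` are names introduced here.

```r
x_small <- c(1, 2, 1, 1)          # transitions: 1 -> 2, 2 -> 1, 1 -> 1
counts  <- matrix(0, nrow = 2, ncol = 2)

# First loop: look at each consecutive pair (x[t], x[t + 1]) and add 1 to the
# cell in the row of the current state and the column of the next state.
for (t in 1:(length(x_small) - 1)) {
  counts[x_small[t], x_small[t + 1]] <- counts[x_small[t], x_small[t + 1]] + 1
}

# Second loop: turn the counts in each row into proportions.
p_small <- counts
for (i in 1:2) p_small[i, ] <- counts[i, ] / sum(counts[i, ])

counts   # row 1: 1 1 ; row 2: 1 0
p_small  # row 1: 0.5 0.5 ; row 2: 1 0
```

In words: state 1 was left twice, once to state 1 and once to state 2, so its row is (0.5, 0.5); state 2 was left once, always to state 1, so its row is (1, 0).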

Bear with my slowness here. I do appreciate the MATLAB code translation, although I still can't see what it's attempting to do in your first for loop. Is the 3rd line from the original code counting the number of times x goes from state xi to state xj? If you could say it in words I'd appreciate that a lot. Cheers
HCAI

1
No, x is just one row. Don't concatenate, because you will introduce "false" transitions: from the last state of one line to the first state of the next line. You have to change the code to loop through the lines of your matrix and count the transitions. At the end, normalize each line of the transition matrix.
Zen
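
The pooling described in this comment can be sketched as follows in base R, which also covers sequences of unequal length. The three short sequences are made up for illustration, and `seqs`, `count_transitions`, `pooled` and `p_pooled` are hypothetical names, not part of any package.

```r
# Hypothetical example data: sequences of unequal length, states coded 1..4.
seqs <- list(
  c(1, 2, 1, 1, 3, 4),
  c(4, 1, 2),
  c(3, 3, 4, 2, 2, 3, 1)
)
n_states <- 4

# Count transitions within a single sequence only.
count_transitions <- function(x, n_states) {
  counts <- matrix(0, nrow = n_states, ncol = n_states)
  if (length(x) < 2) return(counts)   # a single observation has no transition
  for (t in 1:(length(x) - 1)) {
    counts[x[t], x[t + 1]] <- counts[x[t], x[t + 1]] + 1
  }
  counts
}

# Sum the per-sequence count matrices, then normalize each row once.
pooled   <- Reduce(`+`, lapply(seqs, count_transitions, n_states = n_states))
p_pooled <- pooled / rowSums(pooled)
p_pooled
```

Because the counts are added before normalizing, longer sequences contribute more transitions and no transition is counted across a sequence boundary. Concatenating these three sequences instead would add two spurious transitions (4 to 4 and 2 to 3).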

9

Here is my implementation in R

x <- c(1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3)
xChar<-as.character(x)
library(markovchain)
mcX<-markovchainFit(xChar)$estimate
mcX

1
user32041's request (posted as an edit instead of a comment since he/she lacks reputation): How can I coerce the transitionMatrix of the markovchainFit result to a data.frame?
chl

You can convert to data.frame using as(mcX,"data.frame")
Giorgio Spedicato

@GiorgioSpedicato can you please comment on how to deal with sequences of unequal lengths (I cannot concatenate them) in your package?
HCAI

@HCAI, please see pages 35-36 of the current vignette
Giorgio Spedicato

@GiorgioSpedicato thank you for the reference cran.r-project.org/web/packages/markovchain/vignettes/…. I still have n transition matrices, one for each sequence. What I’m after is one general one that takes into account all the sequence observations. Is there something I’m missing?
HCAI

2

Here is a way to do it in Matlab:

x = [1,2,1,1,3,4,4,1,2,4,1,4,3,4,4,4,3,1,3,2,3,3,3,4,2,2,3];
counts_mat = full(sparse(x(1:end-1),x(2:end),1));
trans_mat = bsxfun(@rdivide,counts_mat,sum(counts_mat,2))

Acknowledgement owed to SomptingGuy: http://www.eng-tips.com/viewthread.cfm?qid=236532

Licensed under cc by-sa 3.0 with attribution required.