Talk:Markov chain/Archive 2
This is an archive of past discussions about Markov chain. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Ergodic Markov chains
The definition of an "ergodic Markov chain" is somewhat contentious. Depending on the source, aperiodicity may or may not be required (that is, you can either define an ergodic Markov chain as an irreducible positive recurrent Markov chain, or as an irreducible, aperiodic, positive recurrent Markov chain -- the current article adopts the second option).
If you want to make the link with ergodic theory, then the right definition is the first one (see also my answer here: https://math.stackexchange.com/questions/152491/is-ergodic-markov-chain-both-irreducible-and-aperiodic-or-just-irreducible), maybe including null recurrence. Aperiodicity is simply not needed to apply ergodic theorems. Also, if most sources use the second definition, I can try to back up the first one with higher-quality sources, e.g. Corollary 5.12 in course notes by Martin Hairer (http://www.hairer.org/notes/Markov.pdf). — Preceding unsigned comment added by Datho (talk • contribs) 09:23, 9 December 2018 (UTC)
Closest reversible Markov chain
Does this material really belong here? It's not standard and refers to a recent research result. — Preceding unsigned comment added by 68.34.29.140 (talk) 13:30, 29 October 2017 (UTC)
- Agreed, I removed it. Cheers, Malparti (talk) 16:03, 22 January 2020 (UTC)
Applications
The Applications section is growing out of control and lacks coherence and hierarchy. For example, highly specific applications (such as the study of solar irradiance) are put on the same level as physics or biology. Some applications are also mentioned several times (again, solar irradiance: in its own section and in the section Probabilistic forecasting).
Given the high number of references to very recent and specific research papers, there is a possibility that some researchers are using this section to promote their work. It would be best to replace these with more general textbooks or links to internal articles.
Malparti (talk) 16:26, 22 January 2020 (UTC)
- It would be a good opportunity to split this very long article into two: Markov chain, and Applications of Markov chains. Gaba (talk) 22:22, 13 May 2020 (UTC)
- I agree with both, but I lean on the side of deleting irrelevant or unnoteworthy applications. I believe the following sections are unnoteworthy and worthy of deletion: Testing; Solar irradiance variability; Information and computer science (the content - perhaps this title deserves being reused); the first paragraph (but not the second) of Queueing Theory; Statistics; and every subsequent section. I would move the Speech recognition section's contents into the 'See also' section below. Instead of deleting any of those sections, I would also be happy if there were sources or links to other Wikipedia pages discussing the models in question. What do you all think? QueensanditsCrazy (talk) 02:40, 18 May 2020 (UTC)
Possible incorrect definition of ergodicity
The article has the following text in the ergodicity section:
- A state i is said to be ergodic if it is aperiodic and positive recurrent. In other words, a state i is ergodic if it is recurrent, has a period of 1, and has finite mean recurrence time. If all states in an irreducible Markov chain are ergodic, then the chain is said to be ergodic.
This text seems wrong to me, or at least, if it's not wrong then it's using a definition of ergodicity that flatly contradicts the one on the linked ergodicity page.
Consider the following two examples. First, take a system with two states, and the transition matrix
0 1
0 1
i.e. the system always goes deterministically into state 1, regardless of the initial state. This system is ergodic in the sense that its transition matrix is aperiodic and irreducible, but it's not ergodic in the sense of this page, because state 0 is not recurrent.
On the other hand, consider a system with two states and the following transition matrix:
1 0
0 1
i.e. the system never changes state. This system is not ergodic in the usual sense because its transition matrix is reducible, but it is ergodic in the sense of this page, because each state is recurrent with period 1 and has a mean recurrence time of 1 time step.
My suspicion is that the text on this page is flat out wrong, but if it's not it requires a note about differing definitions of "ergodic" and a good reference to the sense in which it's being used here.
I note also that the second paragraph in that section has a mistake as well, unless it's using a different definition from the standard one: "More generally, a Markov chain is ergodic if there is a number N such that any state can be reached from any other state in any number of steps less or equal to a number N." I'm not sure but I guess it should probably say "greater than or equal to N". This pushes me a bit further towards thinking the other statements are mistakes rather than an alternative definition, and if I have time I'll re-write it. (No promises though.)
Nathaniel Virgo (talk) 16:18, 30 May 2020 (UTC)
I don't know how to respond. Notice that the fixed-system example is wrong, since we require the chain to be "aperiodic". — Preceding unsigned comment added by 141.226.14.9 (talk) 16:34, 3 June 2020 (UTC)
I don't know what you mean. Both of the examples I posted are aperiodic.
Just in case this is the issue: note that the first example *is* ergodic (but the definition currently given on the page says it isn't), while the second example is *not* ergodic (but the definition currently given on the page says it is.)
Nathaniel Virgo (talk) 02:53, 4 June 2020 (UTC)
Another thing to note: in other resources, I have never come across a definition of a single state being ergodic. This makes sense because in the usual definition ergodicity is really a global property. One can say that a matrix is aperiodic if every state is aperiodic, but one cannot make a similar statement about irreducibility.
Nathaniel Virgo (talk) 03:00, 4 June 2020 (UTC)
You are misreading the text. It says "If all states in an irreducible Markov chain are ergodic, then the chain is said to be ergodic." (emphasis mine). Your first example is not irreducible, because that requires all states to be reachable from all other states. Because you can't reach state 0 from state 1, it is not irreducible. Put another way, the communicating class {1} has zero probability of exit.
The second example is reducible, as you already identified. As such, it is not ergodic according to the definition on the page.
The definition on the page matches the one in the ergodicity page.
There is no mistake in the second paragraph. It means that you can connect any two states with a path of at most N steps, meaning that there are no pairs of states lacking such a connection, or whose connection would require infinitely many steps. It implies that a) the chain is irreducible (obviously) and b) all states are positive recurrent (a little harder to prove, but if one state is recurrent then all of them are, because it is a class property).
You can define a state as ergodic, because all states in a class must be recurrent and aperiodic if one of them is, so it is equivalent to a global property on the class. If the chain is irreducible, the class is the whole chain and the chain is ergodic.
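For concreteness, here is a minimal Python/numpy sketch that checks these properties for the two example matrices above (my own illustration, not from the article; the helper names are made up for this sketch):

```python
import numpy as np
from math import gcd

def is_irreducible(P):
    # Every state reaches every other iff (I + P)^(n-1) is entrywise positive.
    n = P.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + P, n - 1)
    return bool(np.all(M > 0))

def period(P, i, max_steps=50):
    # Period of state i: gcd of all k <= max_steps with P^k[i, i] > 0.
    g = 0
    Pk = np.eye(P.shape[0])
    for k in range(1, max_steps + 1):
        Pk = Pk @ P
        if Pk[i, i] > 0:
            g = gcd(g, k)
    return g  # 0 means the state was never observed to return

def is_primitive(P, max_steps=50):
    # Primitivity: some power P^N is entrywise positive, which for a finite
    # chain forces irreducibility and aperiodicity (hence ergodicity).
    Pk = np.eye(P.shape[0])
    for _ in range(max_steps):
        Pk = Pk @ P
        if np.all(Pk > 0):
            return True
    return False

P1 = np.array([[0.0, 1.0],
               [0.0, 1.0]])   # first example: always jump to state 1
P2 = np.eye(2)                # second example: never change state

print(is_irreducible(P1), is_irreducible(P2))  # False False -- neither is irreducible
print(period(P1, 1), period(P2, 0))            # 1 1 -- aperiodic where defined
print(is_primitive(P1), is_primitive(P2))      # False False
```

So neither example is ergodic under the page's definition, consistent with the reply above.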
Jump chain
The "Jump chain" section within Markov_chain#Continuous-time_Markov_chain is incomplete, as it does not define the transition matrix of the chain Yn. (Presumably this is obtained from Q by zeroing out the diagonal and then renormalizing each row so that it sums to 1. That gives the transition probabilities conditioned on a nontrivial transition occurring.) Eclecticos (talk) 17:45, 27 June 2020 (UTC)
Potential split
It's remarkable to me that we don't have separate articles for discrete-time Markov chains and continuous-time Markov chains, instead just having this article for both—a long article where one has to get a fair way into the body to just get a DTMC/CTMC definition. I believe it would be clearer to have three articles: an article introducing DTMCs and giving basic theorems, properties and applications (lead understandable to high school students, most of the body to early undergraduates); an article on CTMCs of a similar scope; and this article, focused on history and applications with only a brief focus on the mathematical groundwork and maybe the similarities and differences between DTMCs and CTMCs (e.g. periodicity only applies to DTMCs, which this article currently does not make clear).
Is there opposition to this view? Are there pages somewhere that I've missed that cover the basics of DTMCs or CTMCs? — Bilorv (talk) 15:05, 14 July 2020 (UTC)
- This sounds like a good idea to me. (I do not volunteer to help with the labor, though, just to give a vote of confidence.) The article Stochastic matrix is something like an article on the basics of DTMCs. Examples of Markov chains includes mostly discrete examples but isn't a proper intro article. --JBL (talk) 00:04, 21 July 2020 (UTC)
- I've no objection in principle to a split, and your plan for what should go where sounds reasonable. XOR'easter (talk) 05:28, 21 July 2020 (UTC)
- Okay, thanks for the comments. Since there's been no objection, I've gone ahead and done this. Still a lot of cleanup needed at all three articles, but the length of this article has now been somewhat addressed. — Bilorv (talk) 21:07, 1 August 2020 (UTC)
Plagiarism in Harris Chains Section
The third edition of Rick Durrett's "Probability Theory and Examples", published in 2004, has the following to say about Harris chains (see page 322):
The developments here are motivated by three ideas. First, the proofs in the last two sections work if there is one point in the state space that the chain hits with probability one. (Think, for example, about the construction of the stationary measure in (4.3).) Second, a recurrent Harris chain can be modified to contain such a point. Third, the collection of Harris chains is a comfortable level of generality; broad enough to contain a large number of interesting examples, yet restrictive enough to allow for a rich theory.
The section of this article on Harris Chains was added in 2007: https://wiki.riteme.site/w/index.php?title=Markov_chain&diff=119170716&oldid=119072272. At the time, it read:
Many results for Markov chains with finite state space can be generated into uncountable state space through Harris chains. The main idea is to see if there is a point in the state space that the chain hits with probability one. Generally, it is not true for continuous state space, however, we can define sets A and B along with a positive number ε and a probability measure ρ, such that
- If τ_A = inf{n ≥ 0 : X_n ∈ A}, then P_z(τ_A < ∞) > 0 for all z.
- If x ∈ A and C ⊂ B, then P(x, C) ≥ ε ρ(C).
Then we could collapse the sets into a auxiliary point α, and a recurrent Harris chain can be modified to contain α. Lastly, the collection of Harris chains is a comfortable level of generality, which is broad enough to contain a large number of interesting examples, yet restrictive enough to allow for a rich theory.
The phrasing hasn't changed much since then. Personally, this looks to me like a clear case of plagiarism, especially that last sentence. Furthermore, the section doesn't cite any sources. I intend to Be Bold and remove the section entirely (I lack enough understanding of Harris chains to write a suitable replacement). Hopefully someone knowledgeable can rewrite the section appropriately and add it back in. WallAdhesion (talk) 23:00, 3 April 2021 (UTC)
- No knowledge about Harris chains from me, but can this be solved just by rephrasing? For instance, "The main idea is to see if there is a point in the state space that the chain hits with probability one" could be changed to "The principle applies to chains with some fixed state that is hit almost surely". As for the lack of sources... well, Probability Theory and Examples would be the source to cite, I guess. If this isn't possible, removing the section would be right because this is plagiarism as it stands. — Bilorv (talk) 19:00, 4 April 2021 (UTC)
Where is the reversible Markov chain section?
There used to be a reversible Markov chain section here. Where is it? If anyone thinks this is out of scope for the main Markov article, then there should be at least a subarticle about this subject.
https://wiki.riteme.site/w/index.php?title=Markov_chain&oldid=689230061#Reversible_Markov_chain
Elenktik (talk) 14:26, 4 September 2021 (UTC)
- @Elenktik: after a split, because the article was getting long and unstructured, the material was moved to Discrete-time Markov chain#Reversible Markov chain. There's also relevant content at Detailed balance#Reversible Markov chains. — Bilorv (talk) 16:42, 4 September 2021 (UTC)
- @Bilorv: thank you for mentioning it. Strange that it was just not carried over to the new discrete Markov chain article, but removed from the original one. Anyway, I added it to the discrete Markov chain article. Elenktik (talk) 08:34, 8 February 2022 (UTC)
- Looks like it wasn't there when I did the split (Special:Permalink/968914732), but thanks for investigating and re-adding it. — Bilorv (talk) 17:57, 8 February 2022 (UTC)
Markov Model - a generalisation or a special case?
The subsection Markov model presents Markov models as a generalisation of Markov chains. This is contrary to the subsection itself being placed in the section Special types of Markov chains. Which type of model is a subset of the other? Which is the more general framework? AVM2019 (talk) 19:37, 14 June 2022 (UTC)
- To me, the term Markov model is a bit vague and could refer to any model having some sort of Markov property. A Markov process would refer to a family of random variables indexed by time, with the Markov property being the usual memorylessness. And a Markov chain would be even more specific, referring to a Markov process on a discrete state space. By default, that would refer to a discrete-time process -- otherwise I would indicate it explicitly by saying continuous-time Markov chain.
- These conventions might vary a bit -- although I'm pretty confident everyone I know working in the field would agree with this. In the end it doesn't matter too much because I don't think this can be a real source of confusion. But, to come back to your original question, it's definitely {Markov chains} ⊂ {Markov models}, whatever the latter means.
- Malparti (talk) 12:30, 16 September 2022 (UTC)
- Yep, Markov chains are a type of Markov model. We could change the section header "Special types of Markov chains". Perhaps "Related models" would work. I don't think it makes sense to reorganise the content, even if we do have to find a section name that covers both subtypes and supertypes of Markov chains. — Bilorv (talk) 10:04, 19 September 2022 (UTC)
Convergence speed to the stationary distribution
I am studying the Markov chain now and I found some possible typos.
> π(k) approaches to π as k → ∞
I believe it should be "π(k) approaches a1*π as k → ∞".
> In other words, π = ui ← xPP...P = xP^k as k → ∞.
It should be π = u1 instead of ui. Here, u1 represents the eigenvector of P corresponding to lambda1=1.
— Preceding unsigned comment added by Jgdyjb (talk • contribs) 05:02, 15 August 2023 (UTC)
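For anyone else studying this, a small numpy sketch illustrating the corrected statement (my own illustration; u1 is the left eigenvector of P for lambda1 = 1, which normalized gives the stationary distribution, and π(k) = xP^k):

```python
import numpy as np

# A small irreducible, aperiodic chain (example values chosen for illustration)
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])
x = np.array([1.0, 0.0])   # initial distribution

# u1 = left eigenvector of P for eigenvalue lambda1 = 1; normalized to sum
# to 1, it is the stationary distribution pi.
vals, vecs = np.linalg.eig(P.T)
u1 = np.real(vecs[:, np.argmax(np.real(vals))])
pi = u1 / u1.sum()

# pi(k) = x P^k converges to pi as k grows
for k in (1, 5, 20):
    print(k, x @ np.linalg.matrix_power(P, k))
print("pi =", pi)   # approx [0.2857, 0.7143]
```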
- Thanks for this. I believe you are correct and that I've fixed it here. In future, when you find a problem you need to fix it yourself as generally no-one else will. — Bilorv (talk) 09:33, 21 August 2023 (UTC)
This can be far more comprehensible for students.
Students may need to understand how to apply Markov chain computations to the following example. A shepherd has a flock of sheep, three pastures and two sheep dogs, to graze seven days a week. He grazes the pastures on a rotating basis, working one dog at a time on alternating days. Say one is black and one is white. My student needs to compute when the shepherd will work the black sheep dog in pasture 1 on a Thursday, or all Thursday occurrences. I can make the chain more interesting by adding variables, and apply it to retail sales, cell phone tracking and re-stocking needs, for a mathematical application.
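Under one literal reading of that setup (the pastures rotate 1, 2, 3, 1, ... daily and the dogs alternate daily; both are assumptions on my part, since the comment leaves the schedule open), the schedule is a deterministic chain whose state (pasture, dog, weekday) cycles with period lcm(3, 2, 7) = 42 days, so the question can be answered by enumeration:

```python
from math import lcm  # Python 3.9+

pastures = [1, 2, 3]          # assumed daily rotation
dogs = ["black", "white"]     # assumed daily alternation
weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

period = lcm(len(pastures), len(dogs), len(weekdays))  # 42 days
for day in range(period):     # day 0 assumed to be a Monday in pasture 1
    if (pastures[day % 3] == 1 and dogs[day % 2] == "black"
            and weekdays[day % 7] == "Thu"):
        print(f"day {day}: black dog, pasture 1, Thursday")
# Prints day 24; the combination then recurs every 42 days.
```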
My interest in this discussion derives from an appreciation of exactly how good the IDEA encryption algorithm is, understanding that it derives from substitution on a password-dependent Markov chain with an astronomical period/periodicity.
While it may be theoretically vulnerable, it is exceedingly difficult to cryptanalyze, and it should be understood by ENTRY LEVEL USERS to be very adequate, even for life and death uses. Formfollows (talk) 04:45, 3 October 2023 (UTC)