Bounds on Lifting Continuous-State Markov Chains to Speed Up Mixing

Kavita Ramanan¹ & Aaron Smith²


Abstract

It is often possible to speed up the mixing of a Markov chain $\{ X_{t} \}_{t \in \mathbb {N}}$ on a state space $\Omega $ by lifting, that is, running a more efficient Markov chain $\{ \widehat{X}_{t} \}_{t \in \mathbb {N}}$ on a larger state space $\hat{\Omega } \supset \Omega $ that projects to $\{ X_{t} \}_{t \in \mathbb {N}}$ in a certain sense. Chen et al. (Proceedings of the 31st annual ACM symposium on theory of computing. ACM, 1999) prove that for Markov chains on finite state spaces, the mixing time of any lift of a Markov chain is at least the square root of the mixing time of the original chain, up to a factor that depends on the stationary measure of $\{X_t\}_{t \in \mathbb {N}}$. Unfortunately, this extra factor makes the bound in Chen et al. (1999) very loose for Markov chains on large state spaces and useless for Markov chains on continuous state spaces. In this paper, we develop an extension of the evolving set method that allows us to refine this extra factor and find bounds for Markov chains on continuous state spaces that are analogous to the bounds in Chen et al. (1999). These bounds also allow us to improve on the bounds in Chen et al. (1999) for some chains on finite state spaces.

Notes

The constant $\frac{1}{2}$ here is arbitrary; replacing the constant $\frac{1}{2}$ with any constant $0< c < 1$ would result in similar bounds, though perhaps with slightly different constants.
This proof is given for Markov chains on discrete state spaces, but the same bound holds for continuous-state Markov chains with minor changes in notation.
Again, the result is stated for chains on discrete state spaces but applies more generally to continuous-state Markov chains.
We point out that such a function $x^{*}$ always exists. Note that the containment condition actually determines the value of $x^{*}$ except on the set $A = \{x \in \mathbb {R}^{d} \, : \, \exists \, 1 \le i \le d \, \text { s.t. } x_{i} \in \frac{1}{N} \mathbb {Z} \}$. Furthermore, $x^{*}$ is continuous on $A^{c}$ and $A$ has measure 0. It is clear that the domain of $x^{*}$ can be extended from $A^{c}$ to $\mathbb {R}^{d}$ in a measurable way, e.g., by choosing the smallest allowed value in the usual lexicographic order on $\mathbb {R}^{d}$.

References

1. Chen, F., Lovász, L., Pak, I.: Lifting Markov chains to speed up mixing. In: Proceedings of the 31st Annual ACM Symposium on Theory of Computing, pp. 275–281. ACM (1999)
2. Diaconis, P.: Group Representations in Probability and Statistics, volume 11 of IMS Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Hayward (1988)
3. Diaconis, P., Holmes, S., Neal, R.M.: Analysis of a nonreversible Markov chain sampler. Ann. Appl. Probab. 10(3), 726–752 (2000)
4. Diaconis, P.: The Markov chain Monte Carlo revolution. Bull. Am. Math. Soc. 46(1), 179–205 (2008)
5. Diaconis, P., Miclo, L.: On the spectral analysis of second-order Markov chains. Ann. Fac. Sci. Toulouse Math. 22, 573–621 (2012)
6. Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002)
7. Gomez, F., Schmidhuber, J., Sun, Y.: Improving the asymptotic performance of Markov chain Monte-Carlo by inserting vortices. Advances in Neural Information Processing Systems (2010)
8. Levin, D., Peres, Y., Wilmer, E.: Markov Chains and Mixing Times. American Mathematical Society, Providence (2009)
9. Morris, B., Peres, Y.: Evolving sets, mixing and heat kernel bounds. Probab. Theory Relat. Fields 133, 245–266 (2006)
10. Neal, R.: Improving asymptotic variance of MCMC estimators: Non-reversible chains are better. Technical Report No. 0406, Department of Statistics, University of Toronto (2004)
11. Peres, Y., Sousi, P.: Mixing times are hitting times of large sets. J. Theor. Probab. 28(2), 488–519 (2015)
12. Rosenthal, J.: Random rotations: characters and random walks on $\text{ SO }(n)$. Ann. Probab. 22, 398–423 (1994)


Acknowledgements

The first author, KR, is partially supported by NSF Grant DMS-1407504 and the second author, AMS, is partially supported by an NSERC Grant. This project was started while AMS was visiting ICERM, and he thanks ICERM for its generous support.

Author information

Authors and Affiliations

Division of Applied Mathematics, Brown University, Providence, RI, USA
Kavita Ramanan
Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, Canada
Aaron Smith


Corresponding author

Correspondence to Aaron Smith.

Appendices

Appendix 1: Comparison to Previous Results on Discrete Spaces

We point out that Theorem 2 can give better bounds than Theorem 3.1 of [1] in some cases. Let $\{ X_{t} \}_{t \in \mathbb {N}}$ be a Markov chain on a finite state space that satisfies Assumptions 2.7, with associated conductance $\varPhi $, mixing time $\tau $, and constants $\beta , \gamma > 0$ satisfying Eq. (2.13). Then let $\{ \widehat{X}_{t} \}_{t \in \mathbb {N}}$ be a lift of $\{ X_{t} \}_{t \in \mathbb {N}}$ with mixing time $\widehat{\tau }$ and conductance $\widehat{\varPhi }$. By the same calculation as contained in inequality (4.11) (but omitting the lines involving superscript N’s), we have:

$$\begin{aligned} \widehat{\tau } \ge \frac{\sqrt{\gamma }}{32 \sqrt{ \log \left( \frac{4}{\pi _{*}} \right) \log \left( \frac{\sqrt{\beta }}{2} \right) }} \sqrt{\tau } \end{aligned} \tag{6.1}$$

We give an example for which this bound is better than Theorem 3.1 of [1]. We fix $k, n \in \mathbb {N}$ and consider a random walk $\{ X_{t} \}_{t \in \mathbb {N}}$ on the cycle $\mathbb {Z}_{kn} = \{1,2,\ldots , nk \}$. Recall that the usual graph distance on $\mathbb {Z}_{kn}$ is given by

$$\begin{aligned} d(x,y) = \min (|x-y|, nk - |x-y|) \end{aligned}$$

We then define the transition kernel K by:

$$\begin{aligned} K(i,i)&= \frac{1}{2}, \\ K(i,j)&= \frac{1}{4k}, \qquad 0 < d(i,j) \le k, \\ K(i,j)&= 0, \qquad d(i,j) > k. \end{aligned} \tag{6.2}$$

This walk has stationary distribution $\pi (x) = \frac{1}{kn}$ for all $x \in \mathbb {Z}_{kn}$. It can be represented according to the form of Eq. (6.2) using the weights

$$\begin{aligned} K_{x,\{x\}}&= \frac{1}{2}, \\ K_{x, \{x-k,\ldots ,x-1,x+1,\ldots ,x+k\}}&= \frac{1}{2}, \\ K_{x,U}&= 0 \qquad \text {otherwise.} \end{aligned} \tag{6.3}$$
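As a sanity check on this walk, the following minimal sketch builds the transition matrix on a small cycle and verifies that it is stochastic with uniform stationary distribution. It assumes that the mass $\frac{1}{2}$ in Eq. (6.3) is spread uniformly over the $2k$ neighbours within distance $k$, so each receives $\frac{1}{4k}$, and that $n \ge 3$ so those neighbours are distinct. The function name `build_kernel`, the 0-based vertex labels, and the values $k = 3$, $n = 5$ are illustrative choices, not part of the paper.

```python
import numpy as np

def build_kernel(k, n):
    """Transition matrix of the lazy walk on the cycle with k*n vertices.

    Assumption: holding probability 1/2, and the remaining mass 1/2 split
    evenly over the 2k vertices at graph distance 1..k (requires n >= 3).
    """
    m = k * n
    idx = np.arange(m)
    diff = np.abs(idx[:, None] - idx[None, :])
    dist = np.minimum(diff, m - diff)          # graph distance on the cycle
    K = np.zeros((m, m))
    K[dist == 0] = 0.5
    K[(dist > 0) & (dist <= k)] = 1.0 / (4 * k)
    return K

K = build_kernel(k=3, n=5)
pi = np.full(K.shape[0], 1.0 / K.shape[0])      # candidate stationary law

row_sums_ok = bool(np.allclose(K.sum(axis=1), 1.0))
stationary_ok = bool(np.allclose(pi @ K, pi))
print(row_sums_ok, stationary_ok)
```

Because the matrix is symmetric, the uniform distribution is stationary for any admissible choice of $k$ and $n$; the numerical check only confirms this on one small instance.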

Using arguments identical to those in Theorem 2 of Chapter 3 of [2], the mixing time $\tau $ of this walk can be shown to be $\tau = \varTheta (n^{2})$. Theorem 3.1 of [1] (combined with Remark 2.1) implies that any lift of this chain must have mixing time $\widehat{\tau } = \varOmega \left( \frac{n}{\sqrt{\log (kn)}} \right) $.

In the other direction, set $\beta = \frac{1}{12n}$, $\gamma = \frac{3}{4}$. For any set $S \subset \mathbb {Z}_{kn}$ with $\pi (S) \le \frac{1}{12n}$,

$$\begin{aligned} \frac{Q(S,y)}{\pi (y)} \le \frac{1}{2} + \frac{1}{2k} | \{ x \in S \, : \, 1 \le d(x,y) \le k\} | \le \frac{3}{4}. \end{aligned}$$

Thus, part (1) of Assumptions 2.7 is satisfied with $\beta = \frac{1}{12n}$ and $\gamma = \frac{3}{4}$. Using the representation of K given by Eq. (6.3), we have $\pi _{*} = \frac{2}{n}$. By inequality (6.1), then, we find $\widehat{\tau } = \varOmega \left( \frac{n}{\log (n)} \right) $.
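The bound on $\frac{Q(S,y)}{\pi (y)}$ can be checked directly on a small instance. The sketch below assumes the usual definition $Q(S,y) = \sum _{x \in S} \pi (x) K(x,y)$, which for uniform $\pi $ gives $\frac{Q(S,y)}{\pi (y)} = \sum _{x \in S} K(x,y)$, and it assumes that each neighbour within distance $k$ receives mass $\frac{1}{4k}$, as in the weights of Eq. (6.3). The helper names and the values $k = 24$, $n = 4$ are hypothetical.

```python
def kernel(i, j, k, n):
    """K(i, j) for the lazy cycle walk (assumed weights: 1/2 hold, 1/(4k) per neighbour)."""
    m = k * n
    d = abs(i - j) % m
    d = min(d, m - d)
    if d == 0:
        return 0.5
    return 1.0 / (4 * k) if d <= k else 0.0

def max_ratio(S, k, n):
    """Maximum over y of sum_{x in S} K(x, y), i.e. Q(S, y) / pi(y) for uniform pi."""
    m = k * n
    return max(sum(kernel(x, y, k, n) for x in S) for y in range(m))

k, n = 24, 4
S = list(range(k // 12))      # |S| = k / 12, so pi(S) = 1 / (12 n)
ratio = max_ratio(S, k, n)
print(ratio)
```

A contiguous block is used for $S$ because it maximises the number of elements of $S$ near a single point $y$; other sets of the same size can only give a smaller ratio.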

Let $\{ k(\lambda ) \}_{ \lambda \in \mathbb {N}}$ and $\{ n(\lambda ) \}_{ \lambda \in \mathbb {N}}$ be sequences with the property $k(\lambda ) \gg n(\lambda )^{\log (n(\lambda ))}$. Then let $\{X_{t}^{(\lambda )}\}_{t \in \mathbb {N}}$ be the Markov chain with kernel given by Eq. (6.2) with $k = k(\lambda )$ and $n = n(\lambda )$. In this regime, the bound from inequality (6.1) is tighter than that of Theorem 3.1 of [1]. More generally, we expect inequality (6.1) to be tighter than Theorem 3.1 of [1] when the discrete state space $\varOmega $ of the Markov chain $\{X_{t}\}_{t \in \mathbb {N}}$ is extremely large compared to the support of $K(x,\cdot )$ for all x.
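To illustrate the comparison of the two rates, the following sketch evaluates $\frac{n}{\sqrt{\log (kn)}}$ and $\frac{n}{\log (n)}$ with all constants dropped. It works with $\log k$ rather than $k$ to avoid overflow when $k$ is of order $n^{\log n}$. The function names and the specific values of $n$ and $\log k$ are illustrative assumptions.

```python
import math

def rate_chen(n, log_k):
    """n / sqrt(log(k n)): the rate quoted from Theorem 3.1 of [1], constants dropped."""
    return n / math.sqrt(log_k + math.log(n))

def rate_new(n):
    """n / log(n): the rate obtained from inequality (6.1), constants dropped."""
    return n / math.log(n)

n = 100
log_k_small = math.log(n)              # k = n
log_k_large = 2 * math.log(n) ** 2     # k = n^(2 log n), well above n^(log n)

new_beats_chen_for_large_k = rate_new(n) > rate_chen(n, log_k_large)
chen_beats_new_for_small_k = rate_chen(n, log_k_small) > rate_new(n)
print(new_beats_chen_for_large_k, chen_beats_new_for_small_k)
```

Since both expressions are lower bounds on $\widehat{\tau }$, the larger of the two is the tighter one; the crossover occurs roughly when $\log (kn)$ exceeds $\log (n)^{2}$.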

Appendix 2: Mixing Bound for the Continuous Cycle Walk

In this section, we show that the kernel K from Example 1 has mixing time $\tau = \varTheta (c^{-2})$. Let $\mu $ be the uniform measure on $[-c,c]$ as a subset of the torus [0, 1]. It is well known that the characters of the torus [0, 1], viewed as a compact abelian Lie group, are given by $\chi _{n}(x) = e^{2 \pi inx}$ for $n \in \mathbb {Z}$. We calculate

$$\begin{aligned} \int _{[0,1]} \chi _{n}(x) \mathrm{d}\mu (x)&= \frac{\sin (2 \pi n c)}{2 \pi n c}. \end{aligned}$$
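The closed form above can be confirmed numerically. The sketch below approximates $\frac{1}{2c} \int _{-c}^{c} e^{2 \pi i n x} \mathrm{d}x$ with a midpoint rule and compares it with $\frac{\sin (2 \pi n c)}{2 \pi n c}$. The function names, the grid size, and the test values $n = 3$, $c = 0.05$ are arbitrary illustrative choices.

```python
import numpy as np

def fourier_coefficient(n, c, num_points=100_000):
    """Midpoint-rule approximation of the integral of chi_n against uniform[-c, c]."""
    h = 2 * c / num_points
    x = -c + h * (np.arange(num_points) + 0.5)   # midpoints of the subintervals
    # Averaging over equally spaced midpoints approximates (1 / 2c) * integral.
    return np.mean(np.exp(2j * np.pi * n * x))

def closed_form(n, c):
    """sin(2 pi n c) / (2 pi n c), the value stated in the text."""
    return np.sin(2 * np.pi * n * c) / (2 * np.pi * n * c)

numeric = fourier_coefficient(3, 0.05)
exact = closed_form(3, 0.05)
error = abs(numeric - exact)
print(error)
```

The imaginary part of the numerical value vanishes up to rounding because the measure is symmetric about 0.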

By Lemma 4.3 of [12], this means that for $A >1$, $T > A c^{-2}$ and $S \subseteq [0,1]$ measurable, we have for all $c < C_{0}$ sufficiently small that

$$\begin{aligned} \vert K^{T}(x,S) - \pi (S) \vert&\le \frac{1}{4} \sum _{n>1} \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} \\&\le \frac{1}{4} \sum _{n=2}^{\lfloor \frac{1}{20 \pi \, c} \rfloor } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} + \frac{1}{4} \sum _{n=\lfloor \frac{1}{20 \pi \, c} \rfloor }^{\lceil \frac{1}{4 c} \rceil } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T}\\&\quad + \frac{1}{4}\sum _{n> \lceil \frac{1}{4c} \rceil } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} \\&\le \frac{1}{2} \sum _{n=2}^{\lfloor \frac{1}{20 \pi \, c} \rfloor } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} + \frac{1}{2} \sum _{n=\lfloor \frac{1}{20 \pi \, c} \rfloor }^{\lceil \frac{1}{4 c} \rceil } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} \\&\quad + \frac{1}{4}\sum _{n> \lceil \frac{1}{2c} \rceil } \left( \frac{\sin (2 \pi n c)}{2 \pi n c}\right) ^{T} \\&\le \frac{1}{2} \sum _{n=2}^{\lfloor \frac{1}{20 \pi \, c} \rfloor } \left( 1 - \frac{9}{10} \frac{(2 \pi n c)^{2}}{6} \right) ^{T} \\&\quad + \frac{1}{2} \sum _{n=\lfloor \frac{1}{20 \pi \, c} \rfloor }^{\lceil \frac{1}{4 c} \rceil } \left( 0.999 \right) ^{T} + \frac{1}{4}\sum _{n > \lceil \frac{1}{2c} \rceil } \left( \frac{1}{2 \pi n c}\right) ^{T} \\&\le \frac{1}{2} \sum _{n=2}^{\infty } e^{- \frac{36 \pi ^{2}}{60} A \, n^{2}} + c^{-1} \left( 0.999 \right) ^{A c^{-2}} + \frac{1}{4} \int _{\frac{1}{2c} -2}^{\infty } \left( \frac{1}{2 \pi x c}\right) ^{T} \mathrm{d}x \\&\le e^{- \frac{36 \pi ^{2}}{60} A } + c^{-1} \left( 0.999 \right) ^{A c^{-2}} + \frac{1}{4} \frac{1}{\frac{1}{2c} - 2} \frac{1}{A c^{-2} - 1} \pi ^{-A c^{-2}}. \end{aligned}$$

Since each of the three terms in the final line can be made arbitrarily small (uniformly in $c < C_{0}$) by choosing A large, this implies $\tau = O(c^{-2})$.
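The behaviour of the Fourier sum can be explored numerically. The sketch below evaluates $\sum _{n \ge 1} \left| \frac{\sin (2 \pi n c)}{2 \pi n c} \right| ^{T}$ with $T = \lceil A c^{-2} \rceil $, truncated at $n \le \lceil 10/c \rceil $. Summing absolute values from $n = 1$ and truncating the tail are simplifying assumptions made here (the neglected terms are at most $(20 \pi )^{-T}$ each), and the function name `fourier_sum` and the values of $c$ and $A$ are illustrative.

```python
import math
import numpy as np

def fourier_sum(c, A):
    """Truncated sum over n >= 1 of |sin(2 pi n c) / (2 pi n c)|^T, T = ceil(A / c^2)."""
    T = math.ceil(A / c ** 2)
    n_max = math.ceil(10 / c)            # truncation point (an assumption of this sketch)
    n = np.arange(1, n_max + 1)
    x = 2 * np.pi * n * c
    return float(np.sum(np.abs(np.sin(x) / x) ** T))

small_A = fourier_sum(0.01, 0.1)         # T well below c^{-2}: sum is not small
large_A = fourier_sum(0.01, 1.0)         # T of order c^{-2}: sum is already small
halved_c = fourier_sum(0.005, 1.0)       # same A, smaller c
print(small_A, large_A, halved_c)
```

For fixed $A$ the value is nearly independent of $c$, which is the uniformity in $c$ used in the argument, and it decays rapidly as $A$ grows.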

To show the reverse inequality, let $T(c) = A_{c} c^{-2}$, for some sequence $A_{c} \rightarrow 0$ as $c \rightarrow 0$. For a copy of the chain started at $X_{0}=0$, we have by Bernstein’s inequality $P[ \vert X_{T(c)} \vert > \frac{1}{10}] \le 2 e^{- \frac{3}{2 A_{c}} \frac{1}{30 + c} } \rightarrow 0$. Thus, $\limsup _{c \rightarrow 0} \vert \vert \mathcal {L}(X_{T(c)}) - U \vert \vert _{TV} \ge \frac{1}{2}$, where $U$ denotes the uniform distribution on the torus, and so $\tau = \varOmega (c^{-2})$.
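The lower-bound argument can be illustrated by simulation. The sketch below assumes that the kernel of Example 1 moves from $x$ to $x + U$ modulo 1 with $U$ uniform on $[-c,c]$, which matches the measure $\mu $ used above but is not spelled out in this section. The function name, the seed, and the values $c = 0.01$, $A = 0.003$ are hypothetical choices.

```python
import math
import numpy as np

def simulate_endpoint(c, A, num_chains, rng):
    """Positions after T = ceil(A / c^2) steps of the assumed walk, started at 0."""
    T = math.ceil(A / c ** 2)
    steps = rng.uniform(-c, c, size=(num_chains, T))
    total = steps.sum(axis=1)
    return (total + 0.5) % 1.0 - 0.5     # representative of the torus point in [-1/2, 1/2)

rng = np.random.default_rng(0)
x = simulate_endpoint(c=0.01, A=0.003, num_chains=20_000, rng=rng)

# Fraction of chains farther than 1/10 from the starting point.
frac_far = float(np.mean(np.abs(x) > 0.1))
# The uniform law puts mass 4/5 on {|x| > 1/10}, so this lower-bounds the TV distance.
tv_lower_bound = 0.8 - frac_far
print(frac_far, tv_lower_bound)
```

When $A$ is small the empirical law stays concentrated near 0, so the event $\{ \vert x \vert > \frac{1}{10} \}$ separates it from the uniform distribution, in line with the Bernstein estimate.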


About this article

Cite this article

Ramanan, K., Smith, A. Bounds on Lifting Continuous-State Markov Chains to Speed Up Mixing. J Theor Probab 31, 1647–1678 (2018). https://doi.org/10.1007/s10959-017-0745-5


Received: 09 June 2016
Revised: 28 January 2017
Published: 09 March 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10959-017-0745-5


Mathematics Subject Classification (2010)

60J05
