Random Walk
2025-8-15

I came across the fact that a random walk on an infinite 2d grid almost surely returns to the origin, and thought the proof was pretty cool.


Precise Statement: Suppose you start at (0,0)(0,0). Every minute, you take one step in the +x+x, x-x, +y+y, or y-y directions, each with probability 14\frac14. You'll almost surely return back to (0,0)(0,0) at some point in the future.


First, let

F(x)=t=1ftxt F(x) = \sum_{t=1}^\infty f_tx^t

be the infinite series where ftf_t gives the probability of returning to the origin for the FIRST time on the ttth step of the walk. (The first few terms of F(x)F(x) are 14x2+564x4+11256x6+\frac14x^2 + \frac{5}{64}x^4 + \frac{11}{256}x^6 + \cdots).


Also, let

R(x)=t=1rtxt R(x) = \sum_{t=1}^\infty r_tx^t

be a similar series, only that rtr_t includes REPEATED visits to the origin as well. (Its first few terms are 14x2+964x4+25256x6+\frac14x^2 + \frac{9}{64}x^4 + \frac{25}{256}x^6 + \cdots).


Now FF and RR are related, in that any walk that visits the origin multiple times can be decomposed into smaller segments for every time it visits the origin. For example, the product F(x)F(x)F(x) \cdot F(x) represents a generating function for the probability of being at the origin for the 22nd time. It sums terms fxfyf_xf_y over all x+y=tx+y=t, representing the walks that first return to the origin on step xx and then again on step x+yx+y.


Similarly, F(x)3F(x)^3 is a generating function for the probability of being at the origin for the 33rd time, and in general, F(x)kF(x)^k represents the generating function for being at the origin for the kkth time. Thus, the probability of being at the origin at any time satisfies

R(x)=k=1F(x)k=F(x)+F(x)2+F(x)3+=F(x)1F(x). R(x) = \sum_{k=1}^\infty F(x)^k = F(x) + F(x)^2 + F(x)^3 + \cdots = \frac{F(x)}{1-F(x)}.

Rearranging, this gives

F(x)=R(x)R(x)+1. F(x) = \frac{R(x)}{R(x)+1}.

The ultimate goal is to show that the walk almost surely returns to the origin, or in more precise terms,

f1+f2+f3+=1. f_1 + f_2 + f_3 + \cdots = 1.

This is equivalent to evaluating F(1)F(1), since 1t=11^t = 1 for all tt. But from the above relation, we have F(1)=11R(1)+1F(1) = 1 - \frac{1}{R(1) + 1}, so it is enough to show that R(1)=R(1) = \infty diverges.


Claim: R(1)R(1) diverges

The idea is that rtr_t is much easier to compute than ftf_t, since there is no longer a restriction on the prior steps of the walk.


Let t=2nt = 2n (since any return to the origin must be at an even time); then r2nr_{2n} is the number of walks returning to the origin, divided by 42n4^{2n} total possible walks. Surprisingly, it's possible to compute this quantity exactly!


Instead of having four possible equally likely movements, look at each movement as two independent choices on whether to increase/decrease x+yx+y, and whether to increase/decrease xyx-y:
Translation


After 2n2n movements, we've made 2n2n red choices and 2n2n green choices. Since the origin has x+y=0x+y = 0, exactly half of the red choices are up-right, and the other half down-left. These can occur in any order, giving (2nn)\binom{2n}{n} possibilities. Similarly, since the origin has xy=0x-y=0, there are (2nn)\binom{2n}{n} ways to arrange the green choices.


Therefore, the total number of walks that return to the origin after 2n2n steps is (2nn)2\binom{2n}{n}^2, so

r2n=(2nn)242n. r_{2n} = \frac{\binom{2n}{n}^2}{4^{2n}}.

By Stirling's approximation, (2nn)4nπn\binom{2n}{n} \approx \frac{4^n}{\sqrt{\pi n}}, which gives

r2n=(2nn)242n42n42nπn=O(1/n). r_{2n} = \frac{\binom{2n}{n}^2}{4^{2n}} \approx \frac{4^{2n}}{4^{2n}\cdot \pi n} = O(1/n).

Since R(1)=r2+r4+r6+R(1) = r_2 + r_4 + r_6 + \cdots is now bounded by a harmonic series, it diverges, finishing the proof.


What is interesting to note is that in 3D, while the probability of r2nr_{2n} is not easy to determine, it's asymptotic to O(1/n1.5)O(1/n^{1.5}) instead of O(1/n)O(1/n). So in 3D, the sum R(1)R(1) actually converges, meaning F(1)<1F(1) < 1. Empirically, the probability of returning to the origin in 3D is close to 34%34\%.