The Collatz Conjecture—A New Perspective?

The Collatz Conjecture sets up a simple procedure: given a positive integer \(n\), if it is even divide it by two, \(n/2\) and if it is odd multiply it by three and add one, \(3n + 1\). Repeat this indefinitely. The conjecture states that all sequences of numbers generated by this procedure will end up at \(1\).

It is a very simple mathematical problem to formulate and can be understood by anybody with very basic knowledge of mathematics. However, since it was presented by Lothar Collatz in 1937 it has evaded all efforts to answer it.

A View Through the Lens of a Matrix

To begin with it is useful to think about the Collatz Conjecture "backwards". Assume we start not from some arbitrary positive integer but instead start from \(1\) and then work upwards. Then we will multiply by two for all positive integers but when we arrive at a number from which we can subtract one and then divide by three and still have a positive integer then we get a new ”branch”. The first number for which this happens is \(4\). If we subtract \(1\) and then divide by \(3\) we are back at \(1\). So this doesn’t actually give a new branch but instead forms a loop from \(1\) to \(4\) and back again to \(1\). For the Collatz conjecture to be true this must be the only loop, and so far it is the only loop that has been found. If we continue from \(4\) we arrive at \(8\) and then \(16\). At \(16\) it is again possible to subtract one and divide by three and we get \(5\). Now \(5\) does form a new branch and we continue by multiplying by two. We get \(10, 20, 40\) and so on. Here we have a new branch already at \(10\).

Everytime we can subtract by \(1\) and divide by \(3\) we will get an odd number. This is often presented in the form of a tree structure. But if we take the step to assume that all odd numbers will be the start of a new branch we can instead represent this in a matrix where the the first row is precisely all odd numbers and all the columns are formed by simply multiplying the corresponding odd number by \(2\) indefinitely. It will look like this:

We can see that if we let \(i\) and \(j\) go to infinity this matrix contains all the positive integers. The first row contains all odd numbers and the rest of the rows will contain the even numbers. The first column will contain all powers of \(2^{n}\). This way of constructing the matrix gives a mathematical expression for every entry \(a_{i,j}\): \(2^{i-1}(2j-1)\). So every entry in this matrix is directly given by which row and in which column it is in. So for example the entry \(a_{7,9}\) is given by \(2^{7-1}(2\cdot 9 - 1)= 1088\) .

Now, looking at this matrix we can see that the regular "forward" Collatz algorithm gives us that for every even number we will simply walk up the column until we arrive at an odd number. When we arrive at the odd number in the column we add one and multiply by three which will send us to an even number in a new column. (Except if we have arrived at \(1\) and we are sent in the loop back to \(4\)). We can now formulate the Collatz Conjecture precisely in terms of these jumps. The Collatz Conjecture states that no matter where we start among all positive integers these jumps will eventually bring us to the first column where we will walk up and end up at \(1\). (That is, we will end up at some \(2^{n}\) and repeated division by \(2\) will give \(1\).)

Deriving the Mechanics of the Jump

And here the fact that we have an expression for every entry in our matrix becomes very useful.

If we look at the first row with the odd numbers we have \(i = 1\), so we have that entry \(a_{1,j}\) is given by \(2j-1\). If we apply the Collatz procedure to this we get

\(3(2j - 1) + 1 = 6j - 2\).

Now we can form the expression \((2(j + d) - 1)2^{i-1}\) where we have introduced a variable \(d\) to represent a difference or change of \(j\) for an entry \(a_{i,j+d}\). We write

\((2(j + d) - 1)2^{i-1} = 6j - 2\)

to see how much \(j\) changes when we apply the Collatz procedure to an entry in the first row of our matrix. In other words, how big will the jump \(d\) be for a given entry \(a_{1,j}\) in the first row to a new entry \(a_{i,j+d}\).

If we fix \(i\) to \(2\), i.e we look at the change of \(d\) when the Collatz procedure sends us from entry \(a_{1,j}\) to a new entry \(a_{2,j+d}\) in row \(2\), we get

\((2(j+d)-1)2^{2-1} = 6j - 2\)

and we get \(d = \frac{j}{2}\)

So, we can see that every jump that goes from an entry in the first row \(a_{1,j}\) to an entry in the second row \(a_{2,j}\) will be exactly \(j/2\), so we will jump to \(a_{2,\frac{3j}{2}}\).

This is interesting information. Given that we are working only in the domain of the positive integers, if multiplying by \(3/2\) should give us another positive integer, the number \(j\) must contain \(2\) as a factor. And it is quite clear from looking at the matrix that it will jump to the second row if and only if it contains \(2\) as a factor. We saw that for the row \(2\), \(d\) is positive. This means that the jump is to the right in our matrix, away from the first column where the Collatz Conjecture says that we will finally end up. However, for all other rows \(i\), \(d\) will be negative, and the jump will be to the left – towards the first column.

So, what do we get for the third row? Setting \(i = 3\) gives us \(d = \frac{1-j}{4}\) or

\(\large-d = \frac{j-1}{4}\)

So the jump will be circa \(j/4\) to the left. For \(i = 4\) we get

\(\large-d = \frac{5j - 3}{8}\).

For large \(j\) this will be close to \(5j/8\), so this will be a relatively bigger jump to the left than the jump to the right to the second row for a given \(j\). For \(i = 5\) we get

\(\large-d = \frac{13j - 7}{16}\)

So, yet a bigger jump (for large \(j\)).

In general we get

\[-d = \frac{j(2^{i-1}-3) +1}{2^{i-1}} - \frac{1}{2}\]

and for large \(j\) we have

\[-d ≈\frac{j(2^{i-1} - 3)}{2^{i-1}}\]

As we have seen for \(i = 2\) we have \(d = j/2\), for \(i = 3\) we have \(d = -j/4\).

A slightly more zoomed out matrix:

Statistical Gravity: The 25% Expected Descent

Here we can form a probability argument. Assume we pick a random number in our matrix. Half of the time this number will be in a column \(j\) that have \(2\) as factor, i.e it will be in a column that is represented by an even number. So one out of two times this number will jump \(j/2\) steps to the right, to the second row.

We can also see from our matrix that every fourth number will actually be sent to the third row. So if we pick a random number, \(1\) out of \(4\) times it will be sent \(≈ j/4\) steps to the left (for large \(j\)). It continues like this. \(1\) out of \(8\) times it will be sent to the fourth row and so on. The probability to land in row \(i\) will be \[P(i) = \frac{1}{2^{i-1}}\]

From this we can now estimate whether the average jump is to the left or to the right for a random number.

We define the ratio \(R\) between a column \(j\) and the next column \(j'\) given by the Collatz algorithm as \[R= \frac{j'}{j} =\frac{j+d}{j}=1 + \frac{d}{j}\]

This gives us \[ R = 1 +\frac{3 - (2^{i-1})}{2^{i-1}} =\frac{3}{2^{i-1}} \]

And the average multiplier \(M\) will be \[M = \prod_{i =2}^{\infty }\left(\frac{3}{2^{i-1}}\right)^{\frac{1}{2^{i-1}}}=\frac{3}{4}\]

This means that the average change from one odd number to the next in the Collatz algorithm will be a decrease of 25%. This is a strong heuristic argument for the truth of the Collatz Conjecture. However for this to be considered a proof the jump from one number to the next would have to be completely random which of course it is not.

This ratio between two odd numbers in the Collatz algorithm is an old result, but I think the matrix presented above gives a very nice visual representation of it. The matrix also seems to present us with patterns and structure that is not easily detected without it. When we introduce the arrows from the odd numbers to their corresponding numbers given by the Collatz algorithm we get a strong feeling that the Collatz Conjecture is true. It seems almost inevitable that all numbers will sort of "fall" to the left and end up at \(1\). This intuition is also strenghtened by the probabilistic reasoning above.

What this matrix perspective also shows is the source of "energy" that will drive any number up against the gradient. This "energy" comes from factors of \(2\) in \(j\). If \(j = m\cdot2^{k}\) we will jump precisely k steps to the right. And after those \(k\) steps to the right our new \(j'\) will have had all its factors of \(2\) replaced by factors of \(3\): \(j'=m\cdot3^{k}\). With no factors of \(2\) this \(j'\) will inevitably jump to the left.

Now the question is what can we say about the behaviour of \(j'=m\cdot3^{k}\)? It seems clear that we cannot say anything as precise as with a \(j\) that contains factors of \(2\). In fact, one strategy here could be to show that the entropy in the next step is so high as to basically be equivalent to a random event (or a pseudo-random event); and in this way make the probabilistic argument closer to a legitimate proof.

The Peak Paradox: Vertical Growth vs. Horizontal Collapse

Another strategy would be to show some form of upper bound for this \(j=m\cdot3^{k}\). A number that shows that this could be difficult is \(27 = 2\cdot2\cdot7-1\). Where \(j = 2^{1}\) and \(m = 7\). When we feed the number \(27\) to the Collatz algorithm we will eventually get to \(9232\).

On its way there it will raise and fall back and raise and fall back and end up at for example \(319 = 2\cdot5\cdot2^{5} -1\). So we have \(j=2^{5}\) and \(m = 5\). As the sequence continues we end up at \(911 = 2\cdot57\cdot2^{3}-1\) and this is what will set off the last ascent up to \(9232\) by way of \(3\) steps to the right to \(2\cdot57\cdot3^{3}-1 = 3077\) (column \(1539\)) from where it will jump to the left to \(9232\) in column \(289\). Here we again see why the perspective of the matrix is so useful. The last jump from \(3077\) to \(9232\) looks like a massive jump up, but in the perspective of the matrix this is a massive jump to the left and towards the first column. What looks like the largest jump away from \(1\) is in the matrix the largest jump towards (column) \(1\). The ratio is \(289/1539 ≈ \bm{0,1878}\) which is close to \(\bm{0,1875}\) which we expect for jumps to row \(5\) for large \(j\). It is a drop of \(≈81,22\%\).

Here is a visualisation of this, with just a regular numerical representation of the sequence compared to how it looks in our matrix:

The example of \(27\) shows a number that starts with only one factor of \(2\) in its column number (\(14 = 2\cdot7\)), which we have identified as a form of potential energy. But it still climbs to \(9232\) and on its way there it passes numbers as \(319\) with \(5\) factors of \(2\) in its column number (\(160 = 5\cdot2^{5}\)). It somehow accumulates a lot of energy to go against the gradient that isn't clearly visible from the start.

But if someone could show some limit to how much of this energy in the form of factors of \(2\) that can be accumulated from some given starting point then that seems like a possible strategy for a proof.

Just a last note of some interest is the fact that numbers with only factors of \(2\) in its column number, and which therefore have a lot of potential energy, will differ by only \(1\) from numbers in the first column (that have zero "potential energy") that will just walk straight up and hit \(1\). So any number \(2^{k}\) will be in the first column and will just be divided \(k\) times by \(2\) until it hits \(1\); while any number \(2^{k}-1\) will be in the first row and will begin by moving \(k - 1\) steps to the right. They differ by \(1\) but their behaviours are opposite.