Sunday, June 7, 2020

Computing arithmetic gymnastics

Today was a day well spent on programming a bruteforce solution to a simple arithmetic puzzle. The puzzle is something I came up with during a daydream a long time ago. It’s simple arithmetic gymnastics for when you have nothing better to do.

You start off with a 4 digit number, let’s take 1234 as an example. The goal is to create a sum out of the separate digits with any of the four basic mathematical operations: add, subtract, multiply or divide, where the answer to this sum must be exactly zero. You may switch the order of any digit, and add parentheses everywhere.

Correct examples in this case would be:

1 + 2 + 3 + 4 = 10
1 - 2 + 3 - 4 = -2
(1 + 3 - 4) * 2 = 0

Where the last example would satisfy the condition and create a solution in this case.

I found that every number I tried had a solution, and I could not find a number that was not solvable. However the question remained: is every 4 digit number solvable? So today I set out to find the answer.

Programming out the problem

When we calculate a solution to this problem, we often use heuristics to quickly find obvious solutions. The most obvious strategy is that when a number contains a zero, you can just multiply all numbers and end with a zero. Another simple strategy is found when you have two digits that are the same, subtracting the two will create your zero and hence the solution again.

Before arriving at heuristics to speed up computation I wanted to solve this problem the hard way: brute forcing a solution. Since computers can do way more instructions per second than I do solving simple arithmetics, this should be easy to solve for a problem this size. I did not care for performance at first, the simplest solution would do. The language of choice was C# for its ease of use.

The problem at hand is easy to imagine, since we are very used to calculating these numbers daily. It is however harder to program out all possible solutions. The method I used to arrive at a solution was to use some combinatorics.

Part one: calculating all possible number combinations

The first observation I made was that some of the operations were not commutative: subtraction and division produce different results when you swap the digits around. For this reason we would need all permutations of the numbers instead of combinations, since the order in which numbers appear does matter!

Creating a recursive function to generate all possible digit permutations looks as follows:

private static IEnumerable<int[]> CalculatePermutations(IEnumerable<int> list, int length)
{
    if (length == 1)
        return list.Select(t => new [] { t });

    return CalculatePermutations(list, length - 1)
        .SelectMany(x => list.Where(element => !x.Contains(element)), (t1, t2) => t1.Concat(new [] { t2 })
        .ToArray());
}

I create all permutations up to a given length, which is useful if we want to solve the problem with more or less than 4 digits later on. Note that this function is used to create permutations of unique indices and not the numbers themselves. This has two advantages: we can move the computation out of the for loop for every number, and we don’t have to deal with duplicate number permutations.

Part two: all possible operator combinations

Similar to the digits we will have to fill N-1 operators between the digits with any of the arithmetics. However in this case we need permutations with repetitions, as the multiply operator could be used multiple times for example.

We can realize this by creating the same function as previously mentioned with a small change: we don’t filter results that have duplicates:

private static IEnumerable<T[]> CalculatePermutationsWithRepetition<T>(IEnumerable<T> list, int length)
{
    if (length == 1)
        return list.Select(t => new [] { t });

    return CalculatePermutationsWithRepetition(list, length - 1)
        .SelectMany(x => list, (t1, t2) => t1.Concat(new [] { t2 })
        .ToArray());
}

Stringing it all together

Now that we have all possible number and operator permutations we can start adding them together. The sum will be solved from left to right, simply accumulating the result during the process. Note that all numbers will have to be floating point numbers since we involve a division operator that can lead to fractions.

But what about the parentheses? Since we have all possible permutations for every number and for every operator, we generate all possible orders. However this does not include all possible parentheses, it's easy to see that the combination:

1 / (2 + 3 + 4) 

is never included with the possible orders since we only calculate the sum from left to right. I have decided to leave out non-commutative solutions as this would introduce another set of permutations, and thus increasing the already factorial runtime even more.

The findings

I ran the simulation for multiple solution spaces, 1 digit numbers up to 7 digit numbers. For each of these ranges I calculated the amount of numbers that had a solution and converted those to percentages as seen in the graph below.


The 0-10 digit range is easy to explain, one zero equals 10 percent. The two digit realm doubles these numbers, as you have numbers containing zeroes (10, 20, etc) and one other combination of similar digits per each ten numbers (11, 22, etc) adds up to 20 percent. I expected the three digit numbers to be very hard to solve, as it’s quite easy to come up with numbers that can’t be solved. I never thought that it would practically be little more than a coinflip whether or not a random number could be solved!

The four digit range will finally answer my burning question of the ages: 97.3 percent of all numbers in this range had an answer! That explains why I never found a number that could not be solved. Basically one in every 40 four digit numbers you come up with should be impossible to solve, yet I never encountered one thus far.

As I initially expected, if you have more numbers to work with the puzzle should get more probable to have a solution. It probably isn’t easier to solve, but the solution is out there!

While the graph above has a neat explanation for each range, the numbers are a bit incorrect, here’s why: in the range of 10-100 we consider 90 numbers, some of which are double and some of which are not. For example 35 and the other permutation 53 both appear, but this is not the case for 60, since “06” was a single digit number. The graph below shows the percentages if we take this minor difference into account:


Digging deeper

Of course I didn’t stop here. While I was implementing this solution I started out with numbers that were only integers to test. Using only integers, I could not use divisions. I wondered if there were any numbers that only had a solution with one or more divisions.

This resulted in arguably the most interesting result from the whole experiment. The exact count of numbers that only have a solution using one or more divisions in the 4 digit range: 24. As I soon realized this is equal to the amount of permutations of a 4 digit number!

That means there is exactly one number (in a variety of different orders), and that number is 3468. I took the moment to appreciate the beauty here and just calculated numerous ways of solving it.

What is it about that 2.4% of numbers that have no solution? Strikingly similar to the above, 2.4% of 10000 equals 240 numbers. Once more we can see that these are permutations of 10 unique numbers. These are the 10 numbers:

1468
1479
1579
3678
3789
4568
4678
4689
5679
5789

Unfortunately I can’t find some black magic that ties all these numbers together yet, but maybe you can! Be sure to let me know! Below are two data pieces I gathered for all of these numbers: their sum, and the permutation of digits and operators that came the closest to zero.



The final thing I calculated is what would change if the solution was a different number than zero:
Which looks like something close to a skewed normal distribution. Interesting solutions were -24 with just over 50% and 72 with just under 50% of all numbers generating a solution.

Conclusion

Thanks for taking the time to read about this overanalyzed strange number game. I hope you still learned a thing or two from this futile effort to promote my nonsense arithmetics. If you enjoy me getting enthusiastic about absolutely nothing, feel free to bookmark this page.