Skip to main content

Problem 14 Project Euler Solution with python

Longest Collatz sequence

The following iterative sequence is defined for the set of positive integers:
nn/2 (n is even)
n → 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.
Which starting number, under one million, produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million?

Seeing the question one might think that the problem is simple. But the real problem here is that the number is 1 million which is very big and the solution may take long time if you are not optimizing the code.

I wrote two programs for this problem, first one took around 17 seconds while the second one took around 1.5 seconds. Yes, I took some time optimizing the code and you can see the result. Staggering 90% optimization.

Okay let me first share with you the code and I will tell you, where I needed improvement. The code is as follows:

#http://radiusofcircle.blogspot.com

#time module for execution time
import time

#Time at the start of execution
start = time.time()

#collatz function straight forward
def collatz(n):
 """
 This function will take
 a number and it will return
 the size of the collatz
 sequence for input n"""
 counter = 1
 while n>1:
  if n%2 == 0:
   n = n/2
   counter += 1
  else:
   n = 3*n+1
   counter += 1
  if n == 1:
   return counter

#let the largest number be 0
largest_number = 0

#let the largest sequence size be 0
large_seq = 0

#for loop for 1 million
for i in xrange(1000000,1,-1):
 #size of collatz for the iteration
 n = collatz(i)
 #if size if greater than previous
 #replace it with the new one
 if n > large_seq:
  largest_number = i
  large_seq = n

#printing the largest number
print largest_number

#time at the end of execution
end = time.time()

#Printing the total time taken
print end-start
As you can see this code is the direct implementation of the question. It took 16 seconds(approximately) to run this code. I was not satisfied with the run time so I searched on internet for the mistake I made and found some of the errors I made. Let me explain the mistakes I made in this code.

If we were to consider first five numbers 1,2,3,4,5 then the collatz sequence would be as follows:
1
2 → 1
3 → 10 → 5 → 16 → 8 → 4 → 2 →1
4 → 2 → 1 
5 → 16 → 8 → 4 → 2 → 1 

As you can see, if we consider the collatz sequence of 3 then at some stage we have arrived at a point where we will have to repeat the sequence of 2.
Consider 4 we have again reached the sequence of 2.
Consider 5 we have reached the sequence of 4.

So if we were to stop repeating these sequences again and again then we will have a better and faster code. This is called as Dynamic programming and was suggested by Jesse Bender at Stack Overflow.

Some people also call it memorization or also partial caching

Program

The program is as follows:
See Python dictionary Comprehension used at line 10
One thing I want to explain in this code is the twist in the for loop from line 17. Lets take for example the iteration of 3.
When it starts the value of counter is 0.
Value of original_number is 3.
As the condition is satisfied we can enter into the while loop.
Condition for first if is not satisfied and the value of n changes to 10.
So the progress till now is:
3 → 10
Value of counter = 1
In a similar way we will progress as follows:
3 → 10 → 5 → 16 → 8 → 4 → 2
At this stage the first if statement will get executed and the size of two is added to the chain. But if you were to convert the sizes to the chain then your chain will be as follows:
 3 → 10 → 5 → 16 → 8 → 4 → 2 → 2 →1
But you might have a question on why we are getting the correct solution even if we are finding the length of the above solution.
Here is where you are assuming in a wrong way. if we were to go back to the first iteration then you can see that at the start the counter was 0 but not one even if the value of 3 was in the chain. And so eliminating 3 from the counter at first we have indirectly eliminated the extra 2 from our count.
And the rest of the code is well commented and I don't think it is necessary to explain.

You can download the code from Github Gist

Output

Summary

I have learned a new concept of dynamic programming by solving this problem. To be frank it took me around 1 day to find the solution to this problem. But I am really satisfied because the program is really optimized in time point of view and also in iterations point of view. In terms of performance wise I am really satisfied.

One more way in which we can make optimization is, if we consider the sequence of 3 the next number is 16 for which we can add the value to the dictionary and eliminate the iteration at 16 and so on. But I don't know how fast it will work because we are making new variables. Please try this for yourself if you want to.

If you didn't understand anything in this code then please do comment in the comment box below and I will glad to help you.

Please do comment if I have made any typo or if you want me to add any extra part to this post. Please do share your code with me, if you have a different solution or a better solution.

You can also contact me if you want to.

Thank you. Have a nice day😃.

Popular posts from this blog

Making a quiz web app with python and flask

Edit : When you are creating a web app with h tml templates, then y ou will have to sa ve the html file in templates folder in the Current Wor ki ng Directory( CWD). If you save the file in the C W D directl y you will get a TemplateNotFound error. Thank you Udhay for pointing it out.   In this post we will create a quiz website using python . I will be using the flask framework . After reading this tutorial you will learn form submission , flask templates , python code in flask templates , shuffling the questions and options with the random module and few others.  Please note that this tutorial is not big as it seems to be. Some of the code has been rewritten to maintain consistency and also font size is somewhat big so that your eyes won't get stressed reading this tutorial. Also the content has not occupied the full width of the page. In this tutorial I am assuming that you are having a very basic understanding of the flask framework . Please refer the documentation

Problem 11 Project Euler Solution with python

Largest product in a grid In the 20×20 grid below, four numbers along a diagonal line have been marked in red. 08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08 49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00 81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65 52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91 22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80 24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50 32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70 67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21 24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72 21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95 78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92 16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57 86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58 19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40 04 52 08 83 97 35 99 16 07

Problem 60 Project Euler Solution with python

Prime pair sets The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property. Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime. This problem is j u st a brute force problem. If you have come here because you don't know the limit upto which you will h ave to gener ate the prime numbers t hen go ahe ad and t r y with 10,000 . When I first start ed solving the problem I chose 1 million(beca use most of the problem s on project E uler have this limit ), but it took very long for the computer to fin d the solution. After searching on the internet then I found many people choosing 10, 000 so I have changed my in put f rom 1 million to 10000 and the output was f ast. He