XOR decryption
Each character on a computer is assigned a unique code and the
preferred standard is ASCII (American Standard Code for Information
Interchange). For example, uppercase A = 65, asterisk (*) = 42, and
lowercase k = 107.
A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.
For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both "halves", it is impossible to decrypt the message.
Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.
Your task has been made easy, as the encryption key consists of three lower case characters. Using cipher.txt (right click and 'Save Link/Target As...'), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.
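To make the cyclic key idea concrete, here is a minimal Python sketch of repeating-key XOR. The function name `xor_with_key`, the sample message and the sample key are made up for illustration; they are not part of the problem statement.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key cyclically."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# The single-byte example from the problem statement.
assert 65 ^ 42 == 107
assert 107 ^ 42 == 65

# Applying the same key a second time restores the original text.
message = b"Attack at dawn"  # made-up sample text
key = b"abc"                 # made-up three-letter key
cipher = xor_with_key(message, key)
restored = xor_with_key(cipher, key)
print(restored == message)
```

Because XOR is its own inverse, the same function serves for both encryption and decryption.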
This question has a lot of ambiguity! I have finally solved it, and the algorithm is as follows:
Instead of testing every combination of a-z against the given cipher text, we will find each character of the key separately.
We know that the first character of the key is applied to the 1st element of the cipher text, then the 4th, then the 7th, and so on. So we take each character from a-z and check whether XOR-ing it with the 1st, 4th, 7th, ... elements always gives a commonly used English character. In the same way, we find the second and third characters of the key separately.
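The grouping described above can be written with Python's extended slicing. This is only a sketch: the short `cipher` list is made-up data standing in for the numbers in cipher.txt.

```python
# Made-up stand-in for the integers read from cipher.txt.
cipher = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13]

# Key character i is applied to elements i, i + 3, i + 6, ... of the cipher text.
streams = [cipher[i::3] for i in range(3)]

for position, stream in enumerate(streams):
    print(f"key character {position + 1} is applied to {stream}")
```

Each of the three streams was encrypted with a single character, so each one can be attacked on its own with only 26 candidates.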
Written out in steps, the algorithm is as follows:
1) Create a function to check whether the XOR of two ASCII values is a commonly used English character. Remember that the ASCII values of commonly used English characters are in the ranges 32-90 and 97-122.
Screenshot of ASCII characters in English. Source: Wikipedia
2) Create a function that takes three elements of the cipher text and the three-character key, XORs each element with the matching key character, and returns the decrypted characters as a string. (This is the function used in Step 13.)
3) Start by opening the given file (cipher.txt).
4) Split the file contents into individual numbers to make a list.
5) As the list holds the numbers as strings, convert every entry to an integer.
6) Create a variable to store all the lowercase English letters as ASCII values. The ASCII range for lowercase letters is 97-122. You can also use xrange directly in the for loop if you want.
7) Now we start by finding the first letter. Create a Python set to store the letters that could be the first letter of the password.
8) Create a for loop over the lowercase English letters. Inside it, create a nested for loop over the 1st (index 0), 4th (index 3), 7th (index 6), ... elements of the cipher text. The condition to satisfy is that the candidate letter, when XOR-ed with each of these elements right up to the end of the cipher text, gives a value in the range of commonly used English characters (use the function created in Step 1). If the XOR falls outside the range at any point, the candidate cannot be the first letter. Keep iterating until the end and store all the possible letters. Only one character will satisfy the condition.
9) As in Step 8, find the second character by XOR-ing with the 2nd (index 1), 5th (index 4), 8th (index 7), ... elements of the cipher text.
10) Again, using the 3rd (index 2), 6th (index 5), 9th (index 8), ... elements of the cipher text, find the third character of the password.
11) Now we have three Python sets holding the first, second and third letters of the password. At this point each set will contain only one element. Turn each set into a string by taking its only element and assigning it to the corresponding variable.
12) Concatenate the strings found in Step 11; this gives us the password.
13) Create a variable to store the plain text that we are going to decrypt using the key we have found. Slice the cipher text three elements at a time and send each slice to the function we created in Step 2. Add the returned value to the plain text. Do this until the cipher text is exhausted. At the end the variable will hold the decrypted message.
14) Convert every character of the plain text to its ASCII value and find the sum. This gives us the result.
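The steps above can be sketched in Python 3 roughly as follows. This is not the original 59.py: every function name here is made up for illustration, the input is assumed to be comma-separated numbers, and the allowed ranges are the ones quoted in Step 1. The sample text at the bottom is artificial data, built so that every key position sees the whole alphabet.

```python
from itertools import cycle


def is_common_char(code: int) -> bool:
    """Step 1: the ranges quoted in the post (32-90 and 97-122)."""
    return 32 <= code <= 90 or 97 <= code <= 122


def candidate_key_chars(stream: list[int]) -> set[int]:
    """Steps 7-8: lowercase letters that decode the whole stream into the ranges."""
    return {
        k for k in range(ord("a"), ord("z") + 1)
        if all(is_common_char(c ^ k) for c in stream)
    }


def recover_key(cipher: list[int], key_length: int = 3) -> str:
    """Steps 7-12: find one letter per key position and join them."""
    key = []
    for position in range(key_length):
        candidates = candidate_key_chars(cipher[position::key_length])
        if len(candidates) != 1:
            # Zero candidates usually means the ranges are too strict for the text.
            raise ValueError(f"position {position}: {len(candidates)} candidates")
        key.append(chr(candidates.pop()))
    return "".join(key)


def decrypt(cipher: list[int], key: str) -> str:
    """Step 13: XOR the cipher text with the cyclically repeated key."""
    return "".join(chr(c ^ ord(k)) for c, k in zip(cipher, cycle(key)))


def solve(raw: str) -> tuple[str, str, int]:
    """Steps 4-14, assuming raw holds comma-separated numbers."""
    cipher = [int(token) for token in raw.strip().split(",")]
    key = recover_key(cipher)
    plain = decrypt(cipher, key)
    return key, plain, sum(ord(ch) for ch in plain)


# Artificial round trip: encrypt a made-up text with a made-up key, then recover it.
sample = "abcdefghijklmnopqrstuvwxyz" * 3
raw = ",".join(str(ord(c) ^ ord(k)) for c, k in zip(sample, cycle("key")))
key, plain, total = solve(raw)
print(key, total)
```

For the real file you would pass the contents of cipher.txt to `solve` instead of `raw`. If `recover_key` raises an error, the message probably contains characters outside the two ranges and the check in `is_common_char` needs to be widened.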
Program
I actually wrote the algorithm above while looking at the program, so that it would be easy for you to understand. Have a look at the algorithm if you haven't understood the program. You can download the source code from GitHub Gist: 59.py
Output
Summary
To be frank, it took me 2 days to understand this problem. When I read the question for the first time, I thought that the password had three characters and that we had to decrypt every character of the cipher text by XOR-ing it with the whole three-character password. I even started searching for whether one character can be XOR-ed with three characters. LOL! Finally I understood that the question is based on a stream cipher, and then I found the solution.
There is still some scope for improvement in this solution. The first place we could improve is where we find the letters. But as I am satisfied with the algorithm I have written and its execution time, I have not tried optimizing the code. Also, the optimized code might be difficult for some beginners. Anyway, this problem is again related to the real world (the second such problem after poker). I personally liked this problem. Please excuse me and correct me if my grammar is wrong or ambiguous.
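On the letter-finding step, one common alternative (a different technique, not the one used in my program) is to assume that the most frequent character in English text is the space. Each key character is then the most frequent value in its stream XOR-ed with 32. The sketch below uses a made-up function name and only works when that assumption holds for the text.

```python
from collections import Counter


def guess_key_by_space(cipher: list[int], key_length: int = 3) -> str:
    """Guess each key character by assuming the most common plain character is a space."""
    key = []
    for position in range(key_length):
        stream = cipher[position::key_length]
        most_common_value, _count = Counter(stream).most_common(1)[0]
        key.append(chr(most_common_value ^ ord(" ")))
    return "".join(key)
```

This does a single pass per stream instead of trying all 26 letters, but the guess should still be checked by decrypting and reading the result.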
As always, you can use the comment box below if you have any doubt or haven't understood anything. I will be glad to help you.
Please do comment in the comment box below if you have found a typo, or if you have a better or a different program. Please do comment if you have any suggestions too. I will be very happy to look at each of them.
You can also contact me.
Thank you. Have a nice day😃!