Project Euler Problem 59 Statement
Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.
A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.
For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both "halves", it is impossible to decrypt the message.
Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.
Your task has been made easy, as the encryption key consists of three lower case characters. Using 0059_cipher.txt (right click and 'Save Link/Target As...'), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.
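The cyclic XOR scheme described in the statement can be sketched in a few lines. This is a minimal illustration, not part of the original solution: the helper name xor_with_key, the sample text, and the key b"abc" are all assumptions chosen for the demonstration.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key cyclically."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plain = b"A quick brown fox"  # made-up sample text
key = b"abc"                  # hypothetical three-letter lowercase key

cipher = xor_with_key(plain, key)
restored = xor_with_key(cipher, key)  # applying the same key again undoes the XOR

print(restored == plain)
```

Because XOR is its own inverse, the same function serves for both encryption and decryption.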
Solution
This solution recovers the 3-character repeating key by assuming that the most common encrypted value at each key position corresponds to a space character (' ') in the plaintext. The message is split into 3 interleaved groups, one per key character, and the most frequent value in each group is XORed with the ASCII code of a space to yield that key character. The message must be long enough for spaces to dominate each group; on short inputs the assumption can fail.
We ignore the first input() value and advance the input pointer. The value is the length of the encrypted string, but our solution doesn't require it.
message = list(map(int, input().split()))
Next, we read the encrypted message, which contains space-separated integers, where each integer represents a single encrypted character. The code processes this input by first splitting it into individual strings, then using map() to convert each string into an integer, and finally creating a list of these integers, which is stored in the message variable.
For example, after this processing, message might contain values like [32, 66, 50, ...], where each number represents an encrypted character from the original text.
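The parsing step can be tried without standard input. The sketch below assumes a hard-coded line (the variable name raw_line is introduced here for illustration) holding the first few values of the HackerRank sample.

```python
raw_line = "32 66 50 20 11 0"  # stand-in for the line that input() would return

tokens = raw_line.split()         # ['32', '66', '50', '20', '11', '0']
message = list(map(int, tokens))  # [32, 66, 50, 20, 11, 0]

print(message)
```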
key = [collections.Counter(message[i::3]).most_common(1)[0][0]^ord(' ') for i in [0,1,2]]
This calculates the encryption key, which is three characters long.
- message[i::3]: Selects every 3rd element starting from index i (i.e., splitting the message into 3 groups for the 3 parts of the key).
- collections.Counter(...).most_common(1): Finds the most common value by counting the frequency of each number in the group.
- [0][0]: Extracts the most common number from the result of most_common(1).
- ^ord(' '): XORs the most common number with the ASCII value of a space (ord(' ')), as a space is assumed to be the most frequent character in English text.
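To make the pieces of the one-liner easier to follow, here is the same computation unrolled for a single key position. The values in group are made up for illustration, and the intermediate variable names are not part of the original solution.

```python
import collections

group = [65, 14, 65, 8, 65, 14, 20]  # made-up values for one key position

counts = collections.Counter(group)      # maps each value to its frequency
top = counts.most_common(1)              # list holding one (value, count) pair
most_common_value = top[0][0]            # first pair, first field: the value
key_byte = most_common_value ^ ord(' ')  # undo the XOR with the assumed space

print(top)                # [(65, 3)]
print(most_common_value)  # 65
print(chr(key_byte))      # a
```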
Example Walkthrough
Hope I'm not beating this to death, but let's start with the HackerRank example:
message = [32, 66, 50, 20, 11, 0, 42, 66, 33, 19, 13, 20, 47, 66, 37, 14, 58, 67, 43, 23, 14, 17, 49, 67, 46, 20, 6, 51, 66, 55, 9, 39, 67, 45, 3, 25, 56, 66, 39, 14, 37, 34, 65, 51, 22, 8, 1, 40, 65, 32, 17, 14, 21, 45, 65, 36, 12, 57, 66, 41, 20, 15, 19, 50, 66, 44, 23, 7, 49, 65, 54, 11, 36, 66, 47, 0, 24, 58, 65, 38, 12, 38]
Now, we divide message into three interleaved parts:
- message[0::3] = [32, 20, 42, 19, 47, 14, 43, 17, 46, 51, 9, 45, 56, 14, 65, 8, 65, 14, 65, 57, 20, 50, 23, 65, 36, 0, 65, 38] (indexes 0, 3, 6, 9, ...)
  Most common for i=0: 65
- message[1::3] = [66, 11, 66, 13, 66, 58, 23, 49, 20, 66, 39, 3, 66, 37, 51, 1, 32, 21, 36, 66, 15, 66, 7, 54, 66, 24, 38] (indexes 1, 4, 7, 10, ...)
  Most common for i=1: 66
- message[2::3] = [50, 0, 33, 20, 37, 67, 14, 67, 6, 55, 67, 25, 39, 34, 22, 40, 17, 45, 12, 41, 19, 44, 49, 11, 47, 58, 12] (indexes 2, 5, 8, 11, ...)
  Most common for i=2: 67
We have identified the most common encrypted values for each key position: {65, 66, 67} and we assume these correspond to encrypted spaces. The space character ' ' has ASCII code 32. Since encryption was done by XOR, we can recover the key byte for each position by XORing the most common value with 32:
Key character at position 0: 65 ^ 32 = 97 ('a')
Key character at position 1: 66 ^ 32 = 98 ('b')
Key character at position 2: 67 ^ 32 = 99 ('c')
The key is "abc"
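The walkthrough can be checked end to end, and the recovered key can then be used to decrypt the sample. The decryption and sum steps below are a sketch added for illustration (the HackerRank solution itself only prints the key); the variable plain is a name introduced here.

```python
import collections

message = [
    32, 66, 50, 20, 11, 0, 42, 66, 33, 19, 13, 20, 47, 66, 37, 14, 58, 67,
    43, 23, 14, 17, 49, 67, 46, 20, 6, 51, 66, 55, 9, 39, 67, 45, 3, 25,
    56, 66, 39, 14, 37, 34, 65, 51, 22, 8, 1, 40, 65, 32, 17, 14, 21, 45,
    65, 36, 12, 57, 66, 41, 20, 15, 19, 50, 66, 44, 23, 7, 49, 65, 54, 11,
    36, 66, 47, 0, 24, 58, 65, 38, 12, 38,
]

# Most common value in each interleaved group, XORed with a space.
key = [collections.Counter(message[i::3]).most_common(1)[0][0] ^ ord(' ')
       for i in range(3)]

# Decrypt by XORing each value with the key byte for its position.
plain = ''.join(chr(c ^ key[i % 3]) for i, c in enumerate(message))

print(''.join(map(chr, key)))  # abc
print(plain)
print(sum(map(ord, plain)))    # sum of ASCII values, as the original problem asks
```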
HackerRank version
In the HackerRank version of Project Euler 59, the encrypted text is supplied as input data rather than in an external file. The solution below prints the recovered key.
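For the original Project Euler version, the same idea can be adapted to read the cipher file and return the sum of the decrypted ASCII values. This is only a sketch under stated assumptions: the function name sum_of_plaintext is hypothetical, the file is assumed to hold integers separated by commas (whitespace is accepted too), and the space-frequency assumption is assumed to hold for every key position.

```python
import collections


def sum_of_plaintext(cipher_text: str, key_len: int = 3) -> int:
    """Derive the key via the space-frequency assumption, then sum the decrypted values."""
    # Accept comma- or whitespace-separated integers (file layout is an assumption).
    message = [int(tok) for tok in cipher_text.replace(',', ' ').split()]
    key = [collections.Counter(message[i::key_len]).most_common(1)[0][0] ^ ord(' ')
           for i in range(key_len)]
    return sum(c ^ key[i % key_len] for i, c in enumerate(message))


# Usage, assuming the file has been saved next to the script:
# with open('0059_cipher.txt') as f:
#     print(sum_of_plaintext(f.read()))
```

Taking the text as a string rather than a file path keeps the function easy to test on small hand-made inputs.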
Python Source Code
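import collections
input()  # skip the message length; it is not needed
message = list(map(int, input().split()))  # encrypted values as integers
# For each key position, XOR the most common encrypted value with a space
key = [collections.Counter(message[i::3]).most_common(1)[0][0]^ord(' ') for i in [0,1,2]]
print(''.join(map(chr, key)))  # print the key as a 3-letter string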