Project Euler Problem 98 Statement
By replacing each of the letters in the word CARE with $1$, $2$, $9$, and $6$ respectively, we form a square number: $1296 = 36^2$. What is remarkable is that, by using the same digital substitutions, the anagram, RACE, also forms a square number: $9216 = 96^2$. We shall call CARE (and RACE) a square anagram word pair and specify further that leading zeroes are not permitted, neither may a different letter have the same digital value as another letter.
Using words.txt (right click and 'Save Link/Target As...'), a 16K text file containing nearly two-thousand common English words, find all the square anagram word pairs (a palindromic word is NOT considered to be an anagram of itself).
What is the largest square number formed by any member of such a pair?
NOTE: All anagrams formed must be contained in the given text file.
Solution
HackerRank Project Euler Problem 98 overview:
Find the largest square number that can be formed by rearranging the digits of another square number, both having the same number of digits (anagram). The solution requires generating square numbers within a given range, grouping them based on digit frequency, and identifying the largest square in the group with the highest frequency.
Solution Summary
The
hash()
function ensures that all square numbers with the same set of digits (regardless of order) are grouped together. Thefreq[]
dictionary tracks how many square numbers share the same hash. By iterating over squares in increasing order and updating ans whenever a new maximum frequency is found, the script ensures that ans ends up being the largest square in the most populous group.
Solution Detail
1. Understanding the Inputs and Outputs
- N is the number of digits for which square numbers are to be considered.
- The output is the largest square number whose digits can be rearranged to form another square number with the same digit count.
2. Hash Function
def hash(s): return tuple(str(s).count(str(d)) for d in range(10))
The hash
function generates a unique fingerprint (or "signature") for a number based on its digit frequency. For example:
123 → (0, 1, 1, 1, 0, 0, 0, 0, 0, 0)
321 → (0, 1, 1, 1, 0, 0, 0, 0, 0, 0)
3. Calculate Square Numbers in the Digit Range
start, end = ceil(sqrt(10 ** (N - 1))), floor(sqrt(10 ** N))
The range of square numbers with \(N\) digits is determined as follows:
- \(\text{start}\): The smallest integer whose square has at least \(N\) digits.
- \(\text{end}\): The largest integer whose square has at most \(N\) digits.
For example, if \(N = 3\):
- The smallest 3-digit number is \(100\), and \(\sqrt{100} = 10\).
- The largest 3-digit number is \(999\), and \(\sqrt{999} \approx 31.6\), so the range is \(10\) to \(31\).
4. Use defaultdict
to Track Frequency of Hashes
freq, max_count, ans = defaultdict(int), 0, 0
freq
: Maps a hash to the number of square numbers that share it.- max_count: Keeps track of the maximum frequency observed.
- ans: Stores the square number with the highest frequency hash.
5. Iterate Through Square Numbers
for i in range(start, end + 1): h = hash(i * i) freq[h] += 1 if freq[h] > max_count: max_count, ans = freq[h], i * i
For each integer \(i\) in the range:
- Compute its square, \(i^2\).
- Generate the hash for \(i^2\).
- Increment the count for that hash in
freq
. - If this hash has been observed more frequently than any previous hash, update max_count and ans.
6. Output the Result
print(ans)
The result is the largest square number whose hash has the maximum frequency.
HackerRank version
HackerRank Project Euler 98: for each value of $N$, we wish to know the largest set of square anagrams for a number with $N$, (3 ≤ N ≤ 13) digits. Print out the largest number of this set. If the largest set is not unique, pick whichever one has the largest maximum element.
Python Source Code
from math import ceil, sqrt, floor
from collections import defaultdict
def hash(s): return tuple(str(s).count(str(d)) for d in range(10))
N = int(input())
start, end = ceil(sqrt(10 ** (N - 1))), floor(sqrt(10 ** N))
freq, max_count, ans = defaultdict(int), 0, 0
for i in range(start, end + 1):
h = hash(i * i)
freq[h] += 1
if freq[h] > max_count:
max_count, ans = freq[h], i * i
print(ans)