Project Euler Problem 22 Statement
Using names.txt
, a 46K text file containing over five–thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.
For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49714.
What is the total of all the name scores in the file?
Solution
Opening, reading and parsing the input file
The names consist of only uppercase characters A through Z and have the form:
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER","MARIA", …
Text with a dashed underscore will highlight the code section being talked about when moused over or tapped.
The remote file, url
, is opened as a local file handle using the urllib.request
library’s function urlopen
.
from urllib.request import urlopen url = 'https://projecteuler.net/project/resources/p022_names.txt' with urlopen(url) as response: file_content = response.read().decode('utf-8') names = sorted(file_content[1:-1]).split('","'))
We read the file as one long string ignoring the first and last quote. The names are stored into a list by splitting the string at the quote–comma–quote (",") delimiter. This effectively removes all quotation marks and commas. The list is then sorted alphabetically into ascending order. The use of a list is necessary to solve the HackerRank version of this problem.
After processing, the list looks like this:
names: [AARON, ABBEY, ABBIE, ABBY, ABDUL, ABE, …]
Calculating name scores
The second part of the program uses two loops: one to select each name and the other to parse it into individual characters.
s = 0 for lineNumber, name in enumerate(names, start=1): s+= lineNumber * sum(ord(c)−64 for c in name)
The first loop uses the enumerate()
function which returns a tuple containing an incremental counter, lineNumber and the corresponding name value by iterating over the names[] list. By default, enumerate()
starts the counter at zero unless you redefine it using an optional second integer argument. An alternate form for the second argument can include the keyword start
to improve readability.
The second loop calculates the name score by iterating each character, c in the string variable name, summing its character values, and multiplying the sum by the ordinal position of the name in the list. Name scores are summed into s.
The ord()
function calculates the ASCII value for each character. An ASCII 'A' is 65, 'B' is 66, …, and 'Z' is 90 so we subtract 64 from the ASCII value to normalize the range [65, 90] to [1, 26] as required to calculate a name score.
Using Python’s lambda function and list comprehensions
The above code is refactored using a lambda function and list comprehensions. This is more Pythonic and a succinct way of expressing this algorithm.
score = lambda word: sum(ord(c)−64 for c in word)
s = sum(lineNumber*score(name) for lineNumber, name in enumerate(names, 1))
HackerRank version
Instead of printing a score for the whole file, HackerRank Project Euler 22 wants the score for individual names.
Python Source Code
from urllib.request import urlopen
url = 'https://projecteuler.net/project/resources/p022_names.txt'
with urlopen(url) as response:
file_content = response.read().decode('utf-8')
names = sorted(file_content[1:-1].split('","'))
score = lambda word: sum(ord(c)-64 for c in word)
s = sum(lineNumber*score(name) for lineNumber, name in enumerate(names, start=1))
print ("Total name score =", s)
Last Word
See also, Project Euler 42: Coded triangle numbers