Project Euler 22 Problem Statement

Project Euler 22: Using names.txt, a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49714.

What is the total of all the name scores in the file?

Solution

The names in the file consist of only uppercase characters A through Z and have the form:

"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER","MARIA",...

The remote file, named in string variable file_url, is opened as a local file handle using the urllib2 library’s function urlopen.

file_url = 'https://projecteuler.net/project/resources/p022_names.txt'
names = sorted(urllib2.urlopen(file_url).readline()[1:-1].split('","'))

The file is read as one long string ignoring the first and last quote. The names are stored into an array by splitting the string at the quote-comma-quote (‘”,”‘). This effectively removes all double quotation marks. The list is then sorted alphabetically into ascending order. The use of an array is necessary to solve the HakerRank Project Euler Problem 22 version of this problem.

After processing, the array looks like this:

names: ['AARON', 'ABBEY', 'ABBIE', 'ABBY', 'ABDUL', 'ABE', ...]

Note: the single quotes simply indicate string values in the array and are not part of the name.

The second part of the program uses two loops: one to select each name and the other to parse it into individual characters.

s = 0
for ln, name in enumerate(names, start=1):
    s+= ln * sum(ord(c)-64 for c in name)

The first loop uses the enumerate() function which returns a tuple containing an incremental counter (variable ln) and the corresponding name value obtained by iterating over the names array. By default, enumerate() starts the counter at 0 unless you redefine it using an optional second integer argument. The second argument can include the keyword start= to improve readability.

Calculating name scores

The second loop makes the name score calculation by iterating each character (string variable c) in the string variable name, summing its character values, and multiplying the sum by the ordinal position of the name in the array. Name scores are accumulated into variable s.

The ord() function calculates the ASCII value for each character. An ASCII ‘A’ is 65, ‘B’ is 66, …, and ‘Z’ is 90 so we subtract 64 from the ASCII value to normalize the range [65,90] to [1,26] as required to calculate a name score.

Using list comprehension with a lambda function can make the code more succinct.

score = lambda word: sum(ord(c)-64 for c in word)
s = sum(ln*score(name) for ln, name in enumerate(names, 1))

HackerRank version

The HackerRank version calculates individual name scores as a batch of queries, instead of a score for the entire name file.

Python 2.7 Source