Scattered Thoughts: 2010

Tuesday, November 16, 2010

The next Project Euler problem I want to tackle is number 22:

What is the total of all the name scores in the file of first names?

The file in question contains data that looks like this:

"MARY","PATRICIA","LINDA","BARBARA",...

It's a nice list but I will need to clean it up to get it into a form that is easy to process. To that end I define a couple of helper functions. These allow me to use .NET's Regex class in a more functional manner:

open System.Text.RegularExpressions

let split pattern input =
    (new Regex(pattern)).Split(input)

let replace pattern (replacement:string) input =
    (new Regex(pattern)).Replace(input, replacement)

The value for a name is computed by adding the values for each letter ( a = 1, b = 2, c = 3, etc.). This is pretty easy in F# because a string can be treated as a sequence of characters. Thus the value of a name can be computed like this:

let value (name:string) = 
    name |> Seq.map (fun c -> (int)c - (int)'A' + 1) |> Seq.sum

All the names in the file are already in uppercase so we don't have to do any normalisation there.

Once all the names have a value I need to compute a score for each name. This score is the name's value multiplied by its position in the alphabetically sorted list.

So, in words, here's what we do:

Read in the names.
Sanitize them by getting rid of the extraneous quotes.
Split the resulting string into an array of name strings.
Sort the array.
Compute the score for each name.
Sum the resulting scores.
Display the result.

Using the previously defined functions this algorithm translates directly into F# code:

open System.IO


File.ReadAllText(@"e:\Project Euler\names.txt") 
    |> replace "\"" "" 
    |> split ","
    |> Array.sort 
    |> Array.mapi (fun i name -> (i + 1)*(value name)) 
    |> Array.sum 
    |> printfn "total = %A"

The twenty-first Project Euler problem is:

Evaluate the sum of all amicable pairs under 10000.

An amicable pair is two numbers such that the sum of the proper divisors of the first number adds up to the second and vice versa. Each member of the pair is considered to be an amicable number. As always Wikipedia has a good article on the subject.

Following the suggestion from the statement of the problem I define a function d(n) that computes the sum of n's divisors:

let d n = 
    {1..n/2} |> Seq.filter (fun i -> n%i = 0) |> Seq.sum

Now I need to test whether a given number is amicable:

let amicable n = 
    let sum = d n
    (sum <> n) && (d sum = n)

Note that we exclude so called perfect numbers where the sum of n's divisors is n.

Finally, I can compute the sum of all the amicable numbers under 10,000 as follows:

{1..10000} |> Seq.filter (fun i -> amicable i) |> Seq.sum |> printfn "total = %A"

Thursday, November 11, 2010

Numerically, the next Project Euler problem is number 20:

Find the sum of digits in 100!

Hmmm... what to do, what to do? Well, as always, lets start with the helper functions. 100! is a very big number. Thus we will have to use F#'s BigInt type. Here's a factorial function:

let factorial n = Seq.fold ( * ) 1I [1I .. n]

Of course, F# already has a factorial function for BigInts but since this is a learning exercise I felt that I had to write my own.

Next I want to turn a big number into a series of digits. This is the function I used for that purpose in my solution to problem number 16:

let rec digits n =
    seq {
        if n > 0I then
            yield n%10I
            yield! digits (n/10I)
    }

Finally we can put these together to create a solution:

factorial 100I |> digits |> Seq.sum |> printfn "sum = %A"

On my computer I get the answer in less time than it takes to threaten the monitor with my fist.

Wednesday, November 10, 2010

The next Project Euler problem on my list is number 19:

How many Sundays fell on the first of the month during the twentieth century?

For this problem it seemed to me that the easiest attack would be to create a sequence containing all the first days of each month in the given range and then filter it by testing whether the day in question was a Sunday. To make this work I would need a way of calculating the day of the week from the date. Fortunately, once again Wikipedia came to the rescue.

Following the algorithm from this article we define a few helper functions. To calculate the century we say:

let century y = 2 * (3 - (y/100)%4)

To determine whether we have a leap year:

let leap y = 
    if y%400 = 0 
        then true 
        else 
            if y%100 = 0 
                then false 
                else 
                    if y%4 = 0 
                        then true 
                        else false

For the month table I used a pattern match expression which takes the month number and whether it's a leap year or not:

let month l m =
    match m with
    |  0 -> if l then 6 else 0
    |  1 -> if l then 2 else 3
    |  2 -> 3
    |  3 -> 6
    |  4 -> 1
    |  5 -> 4
    |  6 -> 6
    |  7 -> 2
    |  8 -> 5
    |  9 -> 0
    | 10 -> 3
    | 11 -> 5

Note that this match generates a warning to the effect that there are cases which may not be covered:

match m with
----------^
stdin(19,11): warning FS0025: Incomplete pattern matches on this expression. For example, the value '12' may indicate a case not covered by the pattern(s).

In this case I am controlling all the inputs and I know that I would never make a mistake so I decided not to worry about it.

Next I needed to compute the year value:

let year y = y%100 + (y%100)/4

With these helpers, given a date tuple in the form (y, m, d) the day of the week it falls on is given by:

let day (y, m, d) = (century y + year y + month (leap y) m + d)%7

In this scheme Sunday is day 0.

The list of dates I want to examine are given by this sequence:

let dates = 
    seq { 
        for y in 1901..2000 do
            for m in 0..11 do
                yield (y, m, 1) }

Putting the pieces together to solve the puzzle:

dates |> Seq.filter (fun x -> day x = 0) |> Seq.length |> printfn "count = %A"

Tuesday, November 09, 2010

Returning to my irregularly scheduled series on Project Euler solutions I'd like to look at problem 18:

Find the maximum sum travelling from the top of the triangle to the base.

Here is a little example:

3 2

4 1 6

I have marked the maximum sum with red numbers. For a small triangle of numbers like this you could easily brute force the solution and just check all paths from the top to the bottom. For a large triangle, that would be too labour intensive. A better solution, I think, is to coalesce the rows of the triangle working from the bottom up. So, in this simple example, you would first replace the second row with the maximum value possible by adding a number in that row to the two numbers below it in the triangle. In this case, 3 + 4 > 3 + 1 so you would replace 3 by 7 and 2 + 6 > 2 + 1 so you'd replace 2 by 8. That would reduce the triangle to this:

1

7 8

Performing the same step on the reduced triangle would give you the final answer (since 1 + 8 > 1 + 7).

9

To solve this problem I put the data into an array of arrays:

let data =
    [|
    [|04; 62; 98; 27; 23; 09; 70; 98; 73; 93; 38; 53; 60; 04; 23|];
    [|63; 66; 04; 68; 89; 53; 67; 30; 73; 16; 69; 87; 40; 31|];
    [|91; 71; 52; 38; 17; 14; 91; 43; 58; 50; 27; 29; 48|];
    [|70; 11; 33; 28; 77; 73; 17; 78; 39; 68; 17; 57|];
    [|53; 71; 44; 65; 25; 43; 91; 52; 97; 51; 14|];
    [|41; 48; 72; 33; 47; 32; 37; 16; 94; 29|];
    [|41; 41; 26; 56; 83; 40; 80; 70; 33|];
    [|99; 65; 04; 28; 06; 16; 70; 92|];
    [|88; 02; 77; 73; 07; 63; 67|];
    [|19; 01; 23; 75; 03; 34|];
    [|20; 04; 82; 47; 65|];
    [|18; 35; 87; 10|];
    [|17; 47; 82|];
    [|95; 64|];
    [|75|]
    |]

You will notice that I inverted the triangle so that the widest part is at the bottom. I wanted my algorithm to proceed from the start of the array to end and not vice versa.

Next I defined a helper function:

let max x y = if x < y then y else x

Now I want a function that will coalesce two arrays into one:

let coalesce (a:int[]) (b:int[]) =
    [| for i in 0..b.Length-1 -> max (a.[i]+b.[i]) (a.[i+1]+b.[i]) |]

So, the idea here is that two one-dimensional arrays are passed in. The first one is assumed to be one longer than the second (in production code you'd probably want to enforce the precondition). This function creates a new one-dimensional array with the same length as the second input array. Each entry in the new array is the max of the entry in the second array with either of the numbers above it in the first array.

Finally, I want to apply the coalesce operation to the entire data array:

data |> Array.fold coalesce (Array.zeroCreate (data.[0].Length+1)) |> printfn "%A"

I have created an array with length one greater than the maximum line in the data array and I'm using that as the state accumulator passed to the fold function. At each step in the fold operation it will be replaced by an array one entry smaller until, when it's done, there will just be an array with a single element which will be the final answer.

I should note that one potential problem with this solution arises if you want to know the actual path that led to the answer.

Wednesday, November 03, 2010

Many years ago, in the late 1980's when I lived in Vancouver, I took night school courses to learn conversational Cantonese. I stopped when I moved to California in 1990 and all the vocabulary I acquired then has been slowly eroding. However, my previous post on spelling out numbers reminded me of how one says the numbers in Cantonese. Here are the numbers with their Chinese character and Cantonese pronunciation:

one	1	yāt	一
two	2	yih	二
three	3	sàam	三
four	4	sei	四
five	5	nǵh	五
six	6	luhk	六
seven	7	chat	七
eight	8	baat	八
nine	9	gáu	九
ten	10	sahp	十

Subsequent numbers are built logically from these. For example, eleven (11) is 十一, twenty-one (21) is 二十一 and thirty-one (31) is 三十一. In Cantonese you would pronounce 三十一 as "sàam sahp yāt".

A program equivalent to the one that counted the number of letters in the English spelling of numbers would probably count the strokes of the Chinese numbers. It would have a lot fewer special cases than the program I presented earlier.

I use the Yale Romanization for Cantonese since that's what I was taught long back. There is a lot of information about Chinese Numerals online.

Tuesday, May 18, 2010

Project Euler's seventeenth problem wants to know:

How many letters would be needed to write all the numbers in words from 1 to 1000?

For example, in the British style, one would write out 256 as "two hundred and fifty-six". We are told not to count spaces and hyphens so 21 letters are needed for this number.

My approach is to decompose a number into a tuple containing the number of thousands, hundreds, tens and ones in the number. However, to make things simpler for me I chose to special case all the numbers under 20 since their spelling is a irregular. That is, instead of saying something like "ten plus one" we say "eleven". It is easier to break that out at this stage in the problem.

Here is my F# code to decompose a number:

let decompose x = 
    let thousands x = (x/1000)*1000
    let hundreds  x = ((x - thousands x)/100)*100
    let tens      x = 
        let temp = ((x - thousands x - hundreds x)/10)*10
        match temp with
        | _ when 0 < temp && temp < 20 -> 0
        | _                            -> temp
    let ones      x = (x - thousands x - hundreds x - tens x)
    (thousands x, hundreds x, tens x, ones x)

Just to show how this works, here is the decomposition of some example numbers:

> decompose 256;;

val it : int * int * int * int = (0, 200, 50, 6)

> decompose 512;;

val it : int * int * int * int = (0, 500, 0, 12)

The next piece of the puzzle is to take a section of the decomposed number and return the number of letters used to write it out:

let letters x =
    match x with
    |    1 ->  3    // one
    |    2 ->  3    // two
    |    3 ->  5    // three
    |    4 ->  4    // four
    |    5 ->  4    // five
    |    6 ->  3    // six
    |    7 ->  5    // seven
    |    8 ->  5    // eight
    |    9 ->  4    // nine
    |   10 ->  3    // ten
    |   11 ->  6    // eleven
    |   12 ->  6    // twelve
    |   13 ->  8    // thirteen
    |   14 ->  8    // fourteen
    |   15 ->  7    // fifteen
    |   16 ->  7    // sixteen
    |   17 ->  9    // seventeen
    |   18 ->  8    // eighteen
    |   19 ->  8    // nineteen
    |   20 ->  6    // twenty
    |   30 ->  6    // thirty
    |   40 ->  5    // forty
    |   50 ->  5    // fifty
    |   60 ->  5    // sixty
    |   70 ->  7    // seventy
    |   80 ->  6    // eighty
    |   90 ->  6    // ninety
    |  100 -> 10    // one hundred
    |  200 -> 10    // two hundred
    |  300 -> 12    // three hundred
    |  400 -> 11    // four hundred
    |  500 -> 11    // five hundred
    |  600 -> 10    // six hundred
    |  700 -> 12    // seven hundred
    |  800 -> 12    // eight hundred
    |  900 -> 11    // nine hundred
    | 1000 -> 11    // one thousand
    |    _ ->  0

Now, given a decomposed number, compute the total number of letters used to write it out:

let length (a, b, c, d) =
    match (a, b, c, d) with 
    | (_, _, 0, 0) -> letters a + letters b
    | (0, 0, _, _) -> letters c + letters d
    | (_, _, _, _) -> letters a + letters b + letters c + letters d + 3

Note that the 3 in the last case is to account for the word "and".

Finally, to find out how many letters are needed to write out the numbers from one to 1000:

{1..1000} |> Seq.map decompose |> Seq.map length |> Seq.sum |> printfn "count = %A"

Monday, May 17, 2010

Speaking of goofy pictures, I should mention that I used AutoDesk // LABS' Project // DRAW to create the images in my previous post.

Problem 16 of Project Euler:

What is the sum of the digits of the number 2¹⁰⁰⁰?

This is a pretty straightforward problem to solve. I don't even get to draw any goofy pictures to make the logic clear.

The idea is to compute the number, split it into its individual digits and add them all together. Given a number n here's the F# code to extract each digit:

let rec digits n =
    seq {
        if n > 0I then
            yield n%10I
            yield! digits (n/10I)
    }

The desired answer is:

digits (2I ** 1000) |> Seq.sum |> printfn "%A"

Friday, May 14, 2010

Problem 15 of Project Euler asks:

Starting in the top left corner in a 20 by 20 grid, how many routes are there to the bottom right corner?

Without getting into complicated proofs I want to present some ideas that lead me to a solution for this problem.

First, suppose that we want to compute the number of paths from point A to point C and we know that all paths pass through point B. Say that a quick count shows there are m paths from A to B and n paths from B to C. Here is my goofy drawing showing the paths:

I think it's pretty obvious that there are mxn paths from A to C through B. Keep this in mind, it will be useful later on.

In general, when tackling a problem like this, I like to start by examining simpler versions of the same problem to see if I can come up with a general solution which I can then apply to the specific problem. I started by thinking about the problem on a 1 by 1 grid. After much thought it occurred to me that I could use a symmetry argument. If I consider all the midpoints along the line from the top right corner to the lower left corner then the number of paths from the start to the midpoint must be the same as the number of paths from the midpoint to the end. Here's a picture showing the paths to one of the midpoints in the grid:

In this diagram, the dotted line shows the axis of symmetry of the square. There are two midpoints (although I've only labelled the first). There is only one path from A to B and so there is one path from B to C. To find the total number of paths from A to C we add the number of paths from A to C through each midpoint. In this case it is 1² + 1² = 2. We are using the first idea mentioned above in this special, symmetrical, case.

Fair enough. Lets take a look at the 2 by 2 grid:

In this case we have three midpoints and I am just showing that there are two paths to the center one. By inspection we can see that there is only one path from A to the other midpoints and so the total number of path is 1² + 2² + 1² = 6. This, by the way, is a nice sanity check since it matches the answer given for a 2 by 2 grid in the statement of the problem.

I'm already starting to see a pattern here but just to really nail it down let's think about the 3 by 3 grid:

Now we have four midpoints to consider. As before, there is one path to each of the outside midpoints and we can see three paths to the each of the inside midpoints. We compute the total number of paths from A to C as 1² + 3² + 3² + 1² = 20.

Suppose we just look at the number of paths to the midpoints for each of the three cases I have presented here. Stacking them up in a provocative manner we see this:

1 1

1 2 1

1 3 3 1

To me this looks like Pascal's Triangle. And since I don't have to prove anything I'm just going to assume that it is. So, if I was to count the paths to each midpoint in a 4 by 4 grid I would see:

1 4 6 4 1

And so on. If I get the right answer then I'll know that my hunch is correct.

According to the Wikipedia article referenced above, the i^th entry in the n^th row of Pascal's triangle is:

So, with this formula, we have all the pieces necessary to solve this problem. To spell it out, the number of paths from the top left corner of the grid to the bottom right corner (with no backtracking) is sum of the squares of the coefficients of the n^th row of Pascal's triangle. To compute this number I use a couple of helper functions. Here is a factorial:

let factorial n = Seq.fold (*) 1I {1I..n}

Note that I could have used the built in factorial function but since I am using these problems as a learning experience it makes sense to write my own (or at least it seemed to make sense at the time!).

Here's how we can compute n choose i:

let choose n i = (factorial n)/((factorial i)*(factorial (n-i)))

With this we can create a sequence consisting of the entries in the n^th row of Pascal's triangle:

let pascal n = 
    seq {
        for i in 0I..n -> choose n i
    }

The final step is to perform the actual computation:

pascal 20I |> Seq.map (fun i -> i*i) |> Seq.sum |> printfn "sum = %A"

Done!

Given the right insight the actual code was pretty easy.

Wednesday, May 05, 2010

Time for another Project Euler post. Problem fourteen asks you to:

Find the longest sequence using a starting number under one million.

Given a start number x_n the next entry in the sequence is given by:

You can find a nice discussion of this sequence in Wikipedia as the Collatz Problem. In F# this code will generate the sequence from a given starting value:

let rec sequence n = 
    seq {
        yield n;
        match n with
        | 1ul -> ()
        | _ when n%2ul = 0ul -> yield! sequence (n/2ul)
        | _ when n%2ul = 1ul -> yield! sequence (3ul*n + 1ul)
    }

Then I can create a tuple containing the sequence starting value and the sequence length as follows:

let length n =
    (sequence n |> Seq.length, n)

So when I want to find the longest sequence I can write:

seq { 1ul..999999ul} |> Seq.map length |> Seq.max |> printfn "longest = %A"

Interestingly, the Seq.max function will just use the first entry in the tuple for its comparison. This means I don't have to clutter my code with special handling to extract the length information from the sequence of tuples. Obviously I got lucky when I put together the tuple and had the length come before the seed value.

Monday, April 19, 2010

The thirteenth Project Euler problem is:

Find the first ten digits of the sum of one-hundred 50-digit numbers.

I used F#'s built in BigInt data type. The array of 50 digit numbers looks like this:

let numbers = [| 
    37107287533902102798797998220837590246510135740250I;
    46376937677490009712648124896970078050417018260538I;
    74324986199524741059474233309513058123726617309629I;
    91942213363574161572522430563301811072406154908250I;
    23067588207539346171171980310421047513778063246676I;
    89261670696623633820136378418383684178734361726757I;
    28112879812849979408065481931592621691275889832738I;
    44274228917432520321923589422876796487670272189318I;
    47451445736001306439091167216856844588711603153276I;
    70386486105843025439939619828917593665686757934951I;
    62176457141856560629502157223196586755079324193331I;
    64906352462741904929101432445813822663347944758178I;
    92575867718337217661963751590579239728245598838407I;
    58203565325359399008402633568948830189458628227828I;
    80181199384826282014278194139940567587151170094390I;
    35398664372827112653829987240784473053190104293586I;
    86515506006295864861532075273371959191420517255829I;
    71693888707715466499115593487603532921714970056938I;
    54370070576826684624621495650076471787294438377604I;
    53282654108756828443191190634694037855217779295145I;
    36123272525000296071075082563815656710885258350721I;
    45876576172410976447339110607218265236877223636045I;
    17423706905851860660448207621209813287860733969412I;
    81142660418086830619328460811191061556940512689692I;
    51934325451728388641918047049293215058642563049483I;
    62467221648435076201727918039944693004732956340691I;
    15732444386908125794514089057706229429197107928209I;
    55037687525678773091862540744969844508330393682126I;
    18336384825330154686196124348767681297534375946515I;
    80386287592878490201521685554828717201219257766954I;
    78182833757993103614740356856449095527097864797581I;
    16726320100436897842553539920931837441497806860984I;
    48403098129077791799088218795327364475675590848030I;
    87086987551392711854517078544161852424320693150332I;
    59959406895756536782107074926966537676326235447210I;
    69793950679652694742597709739166693763042633987085I;
    41052684708299085211399427365734116182760315001271I;
    65378607361501080857009149939512557028198746004375I;
    35829035317434717326932123578154982629742552737307I;
    94953759765105305946966067683156574377167401875275I;
    88902802571733229619176668713819931811048770190271I;
    25267680276078003013678680992525463401061632866526I;
    36270218540497705585629946580636237993140746255962I;
    24074486908231174977792365466257246923322810917141I;
    91430288197103288597806669760892938638285025333403I;
    34413065578016127815921815005561868836468420090470I;
    23053081172816430487623791969842487255036638784583I;
    11487696932154902810424020138335124462181441773470I;
    63783299490636259666498587618221225225512486764533I;
    67720186971698544312419572409913959008952310058822I;
    95548255300263520781532296796249481641953868218774I;
    76085327132285723110424803456124867697064507995236I;
    37774242535411291684276865538926205024910326572967I;
    23701913275725675285653248258265463092207058596522I;
    29798860272258331913126375147341994889534765745501I;
    18495701454879288984856827726077713721403798879715I;
    38298203783031473527721580348144513491373226651381I;
    34829543829199918180278916522431027392251122869539I;
    40957953066405232632538044100059654939159879593635I;
    29746152185502371307642255121183693803580388584903I;
    41698116222072977186158236678424689157993532961922I;
    62467957194401269043877107275048102390895523597457I;
    23189706772547915061505504953922979530901129967519I;
    86188088225875314529584099251203829009407770775672I;
    11306739708304724483816533873502340845647058077308I;
    82959174767140363198008187129011875491310547126581I;
    97623331044818386269515456334926366572897563400500I;
    42846280183517070527831839425882145521227251250327I;
    55121603546981200581762165212827652751691296897789I;
    32238195734329339946437501907836945765883352399886I;
    75506164965184775180738168837861091527357929701337I;
    62177842752192623401942399639168044983993173312731I;
    32924185707147349566916674687634660915035914677504I;
    99518671430235219628894890102423325116913619626622I;
    73267460800591547471830798392868535206946944540724I;
    76841822524674417161514036427982273348055556214818I;
    97142617910342598647204516893989422179826088076852I;
    87783646182799346313767754307809363333018982642090I;
    10848802521674670883215120185883543223812876952786I;
    71329612474782464538636993009049310363619763878039I;
    62184073572399794223406235393808339651327408011116I;
    66627891981488087797941876876144230030984490851411I;
    60661826293682836764744779239180335110989069790714I;
    85786944089552990653640447425576083659976645795096I;
    66024396409905389607120198219976047599490197230297I;
    64913982680032973156037120041377903785566085089252I;
    16730939319872750275468906903707539413042652315011I;
    94809377245048795150954100921645863754710598436791I;
    78639167021187492431995700641917969777599028300699I;
    15368713711936614952811305876380278410754449733078I;
    40789923115535562561142322423255033685442488917353I;
    44889911501440648020369068063960672322193204149535I;
    41503128880339536053299340368006977710650566631954I;
    81234880673210146739058568557934581403627822703280I;
    82616570773948327592232845941706525094512325230608I;
    22918802058777319719839450180888072429661980811197I;
    77158542502016545090413245809786882778948721859617I;
    72107838435069186155435662884062257473692284509516I;
    20849603980134001723930671666823555245252804609722I;
    53503534226472524250874054075591789781264330331690I;
    |]

Once we have the numbers in an array it is easy to add them up:

numbers |> Array.sum |> printfn "total = %A"

So easy it feels like cheating. Formatting the data was the most time consuming part of solving this problem.

In the previous post I talked about computing the divisor count of an integer given its prime factorisation. I also said I didn't have to prove the formula. However, I find I just can't leave it alone. Why does that formula work? I will sketch out a brief rationale (that falls far short of a formal proof).

I think it is pretty clear that any number that divides into an integer has to be made up from some mixture of the integer's prime factors. In other words, if a prime p is not a factor of an integer n then p will not divide evenly into n. Also, if p is a factor of n and it divides into n α times then p^α+1 will not divide into n.

So we can build all divisors of n by multiplying all of n's prime factors and varying their exponent between 0 and allowed exponent for that factor. That is, for example, if n = p^α then the divisors of n are {p⁰, p¹, ... p^α}. This set contains α+1 elements.

If n has multiple prime factors then you can multiply them all together with all acceptable permutations of exponents to create the divisor set. In the field of combinatorics this is selection with replacement (pdf). If you have ω(n) positions to fill and each position has α_i+1 possible values then the total number of combinations is the product of the possible values. That is:

I feel much better now.

Friday, April 16, 2010

I think I'll post my answer to Problem 12 of Project Euler next:

What is the value of the first triangle number to have over five hundred divisors?

What exactly is a triangle number? It is a number given by this formula:

For this problem we actually want to generate a sequence of triangular numbers. It is most efficient to generate the n^th number by adding n to the (n-1)^th number. We can capture that in F# as follows:

let triangles =
    let rec inner m n =
        seq {
            let p = m + n
            yield p
            yield! inner p (n+1L)
        }
    seq {
        yield! inner 0L 1L
    }

The next question is how do we find the number of divisors of the number n? The brute force way would be to try all values up to n/2 and find the ones which divide evenly into the target number. I figured there had to be a better way. A little bit of research on the web showed me that if you know the prime factorization for a number then you can easily compute the number of divisors. Suppose that the a number n can be factored as follows:

where ω(n) is the number of distinct prime factors of n. Then the number of divisors of n is given by:

Fortunately, I'm no longer a math major and so I don't have to prove that this is true. For my purposes it's enough to just try it out and see if it seems reasonable. So, we can say that 12 = 2² x 3. According to the formula then the number of divisors should be (2 + 1) x (1 + 1) = 6. Listing the divisors we find the following set: {1, 2, 3, 4, 6, 12}. Good enough for me.

To solve this Project Euler problem I need to be able to factor an arbitrary number. The simplest thing is to try all the primes up to the square root of the target number and see if they divide into it evenly. As before I need a sequence of primes. Here's the code:

let primes = 
    let rec sieve n table =
        let rec update n p table =
            let n = n+p
            match Map.tryFind n table with
            | None              -> Map.add n p table
            | Some(prime)       -> update n p table
        seq {
            match Map.tryFind n table with
            | None          -> 
                yield n
                yield! sieve (n+1L) (table |> Map.add (n*n) n)
            | Some(prime)   ->
                yield! sieve (n+1L) (Map.remove n table |> update n prime)
        } 
    sieve 2L Map.empty 

I think this will be the last time I display this code. From now on just assume that when needed I can generate a list of primes with a lazy sequence.

I also need a few helper functions. Here's an integer square root:

let root (n:int64) =
    int64 (sqrt (float n))

And an integer power function (tail recursive just for fun):

let power p e  =
    let rec inner p e acc =
        if e = 0L then
            acc
        else
            inner p (e-1L) p*acc
    inner p e 1L

Finally here's a function to compute the number of times a prime divides into a number:

let exponent p n =
    let rec inner p n acc =
        if n%p = 0L then
            inner p (n/p) (acc+1L)
        else
            acc
    inner p n 0L

The next step is to compute a sequence which consists of the prime factors of a number along with each factor's exponent. Here's what I came up with:

let rec factors n = 
    seq {
        if n > 1L then 
            let divisor = 
                primes 
                    |> Seq.takeWhile (fun k -> k <= root n) 
                    |> Seq.tryFind (fun k -> n%k = 0L)
            match divisor with
            | None      -> yield (n, 1L)
            | Some(k)   ->
                let e = exponent k n
                yield (k, e)
                yield! factors (n/power k e)
    }

There's some inefficiency here in that each time we recurse we start the sequence of primes over again. I'm not losing any sleep over it...

Given all this we can now compute the number of divisors of a number:

let divisors n =
    factors n 
        |> Seq.map (fun k -> snd k) 
        |> Seq.fold (fun acc e -> acc*(e+1L)) 1L 

The final step is to find the first entry in the triangular number sequence which has more than 500 divisors.

triangles 
    |> Seq.tryFind (fun n -> divisors n > 500L) 
    |> printfn "%A"

It was nice to revisit this solution while putting together this post. My original solution was a bit hard to read. I think it is much clearer now.

Thursday, April 08, 2010

The next book I want to classify is "Editing Early Music" by John Caldwell (ISBN 0-19-816544-7). The publisher, Oxford University Press, claims that it should be filed at 781.2 Elements of Music but this doesn't seem right to me. I'm not sure which edition of the Dewey Classification system they are using, perhaps this number made more sense before the great revision. Looking at the revised 780 schedule I found 780.149 Editing Music. This seems to be what I'm looking for. There is also a notation under 780.148 Musical notation, abbreviations, symbols that says "Class transcription from one form of notation to another in 780.149". This again indicates that I'm on the right track (this book is all about transforming early music notation into a form that can be understood by a modern musician).

At this point I could easily say I'm done but the Dewey system does allow further aspects or facets of the work to be captured. In this case I think you could add 780.902 500 - 1449 to the base number to indicate that the book is relevant to early music. The way you do this is still a little unclear to me. I believe that you add a facet indicator (which can be either 0 or 1) to the base number and then append the number after the decimal place in the facet. This would yield either 780.1490902 or 780.1491902. I think this would be fine in a large library with a ton of books on the same subject. For my library (with maybe 50 books and a couple hundred scores) I think it makes more sense to use the shorter classification number and then sort all the books in that class by Cutter number.

Update: I now believe that 0 is used to indicate a standard subdivision and 1 indicates a facet for a built number. So, in the example above, 780.1491902 would be the correct class number. Please correct me if I'm wrong.

Friday, April 02, 2010

What is the Dewey classification number for "The Cambridge Companion to the recorder" edited by John Mansfield Thomson (ISBN 0-521-35816-7)? According to the publisher it is 788.36 from 788.36 Recorders. At the Cambridge website they list the call number like this:

Dewey number: 788.3/6
Dewey version: 20

Knowing that they are using the revised 780 schedule means I can use their number directly without having to calculate it myself. Although, in this case I did, because working backwards from a known good value helps me to understand the classification process.

Problem 11 of the Project Euler:

What is the greatest product of four numbers on the same straight line in the 20 by 20 grid?

The problem description lays out an array of two digit integers where any group of 4 numbers either horizontally or vertically or diagonally must be evaluated to find the maximum product. My algorithms kept banging into the edges of the array (which caused ugly code) until I realised that I could embed their array inside a larger one and then just look at a subset of it. The four numbers I would look at in each step could include the padding and, if the pad number was zero, the maximum product would not be affected. I'd have to examine a few more products than if I handled the edges correctly but that's just a minor inefficiency.

With that in mind I defined the array as follows. The actual data starts at (3,3):

let data = 
    [|
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
      [|00; 00; 00; 08; 02; 22; 97; 38; 15; 00; 40; 00; 75; 04; 05; 07; 78; 52; 12; 50; 77; 91; 08; 00; 00; 00|];
      [|00; 00; 00; 49; 49; 99; 40; 17; 81; 18; 57; 60; 87; 17; 40; 98; 43; 69; 48; 04; 56; 62; 00; 00; 00; 00|];
      [|00; 00; 00; 81; 49; 31; 73; 55; 79; 14; 29; 93; 71; 40; 67; 53; 88; 30; 03; 49; 13; 36; 65; 00; 00; 00|];
      [|00; 00; 00; 52; 70; 95; 23; 04; 60; 11; 42; 69; 24; 68; 56; 01; 32; 56; 71; 37; 02; 36; 91; 00; 00; 00|];
      [|00; 00; 00; 22; 31; 16; 71; 51; 67; 63; 89; 41; 92; 36; 54; 22; 40; 40; 28; 66; 33; 13; 80; 00; 00; 00|];
      [|00; 00; 00; 24; 47; 32; 60; 99; 03; 45; 02; 44; 75; 33; 53; 78; 36; 84; 20; 35; 17; 12; 50; 00; 00; 00|];
      [|00; 00; 00; 32; 98; 81; 28; 64; 23; 67; 10; 26; 38; 40; 67; 59; 54; 70; 66; 18; 38; 64; 70; 00; 00; 00|];
      [|00; 00; 00; 67; 26; 20; 68; 02; 62; 12; 20; 95; 63; 94; 39; 63; 08; 40; 91; 66; 49; 94; 21; 00; 00; 00|];
      [|00; 00; 00; 24; 55; 58; 05; 66; 73; 99; 26; 97; 17; 78; 78; 96; 83; 14; 88; 34; 89; 63; 72; 00; 00; 00|];
      [|00; 00; 00; 21; 36; 23; 09; 75; 00; 76; 44; 20; 45; 35; 14; 00; 61; 33; 97; 34; 31; 33; 95; 00; 00; 00|];
      [|00; 00; 00; 78; 17; 53; 28; 22; 75; 31; 67; 15; 94; 03; 80; 04; 62; 16; 14; 09; 53; 56; 92; 00; 00; 00|];
      [|00; 00; 00; 16; 39; 05; 42; 96; 35; 31; 47; 55; 58; 88; 24; 00; 17; 54; 24; 36; 29; 85; 57; 00; 00; 00|];
      [|00; 00; 00; 86; 56; 00; 48; 35; 71; 89; 07; 05; 44; 44; 37; 44; 60; 21; 58; 51; 54; 17; 58; 00; 00; 00|];
      [|00; 00; 00; 19; 80; 81; 68; 05; 94; 47; 69; 28; 73; 92; 13; 86; 52; 17; 77; 04; 89; 55; 40; 00; 00; 00|];
      [|00; 00; 00; 04; 52; 08; 83; 97; 35; 99; 16; 07; 97; 57; 32; 16; 26; 26; 79; 33; 27; 98; 66; 00; 00; 00|];
      [|00; 00; 00; 88; 36; 68; 87; 57; 62; 20; 72; 03; 46; 33; 67; 46; 55; 12; 32; 63; 93; 53; 69; 00; 00; 00|];
      [|00; 00; 00; 04; 42; 16; 73; 38; 25; 39; 11; 24; 94; 72; 18; 08; 46; 29; 32; 40; 62; 76; 36; 00; 00; 00|];
      [|00; 00; 00; 20; 69; 36; 41; 72; 30; 23; 88; 34; 62; 99; 69; 82; 67; 59; 85; 74; 04; 36; 16; 00; 00; 00|];
      [|00; 00; 00; 20; 73; 35; 29; 78; 31; 90; 01; 74; 31; 49; 71; 48; 86; 81; 16; 23; 57; 05; 54; 00; 00; 00|];
      [|00; 00; 00; 01; 70; 54; 71; 83; 51; 54; 69; 16; 92; 33; 48; 61; 43; 52; 01; 89; 19; 67; 48; 00; 00; 00|];
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
      [|00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00; 00|];
    |]

Now that I have the array data in place I can create a sequence consisting of the products of four numbers in various directions. I chose to look at the four numbers to the left of the current number, down diagonally to the left, straight down and down diagonally to the right. See the coloured numbers in the above array. That's all that's needed to get every possibly four digit product. I created a sequence from the array data that looked like this:

let products = 
    seq {
        for i in 3..22 do
            for j in 3..22 do
                yield (data.[i].[j] * data.[i-1].[j  ] * data.[i-2].[j  ] * data.[i-3].[j  ])
                yield (data.[i].[j] * data.[i-1].[j+1] * data.[i-2].[j+2] * data.[i-3].[j+3])
                yield (data.[i].[j] * data.[i  ].[j+1] * data.[i  ].[j+2] * data.[i  ].[j+3])
                yield (data.[i].[j] * data.[i+1].[j+1] * data.[i+2].[j+2] * data.[i+3].[j+3])
    }

A simple chain of operations finds the maximum product:

products |> Seq.max |> printfn "maximum = %A"

Wednesday, March 31, 2010

A Tenor's Look at Stravinsky's Symphony of Psalms

Igor Stravinsky was, in my opinion, the coolest composer of the twentieth century. Whenever I think of him and his music I'm reminded of that famous photograph showing him speeding his motorcycle (a '57 Vincent) through the traffic circling Paris's L'Arc de Triomphe - the wind sweeping his hair back, a cigarette clenched between his teeth. His eyes squint at the photographer. And who, I wonder, is the blonde riding pillion whose face is just barely visible behind his right shoulder, her arms wrapped around his black leather jacket, hanging on for dear life? The man was cool, no doubt about it.

I have to wonder how the composer of the Rite of Spring (completed in 1913) reacted to the earnest but amateur music in which he was suddenly immersed when he began attending religious services in Nice during the mid '20s - the period of his return to the Russian Orthodox church. We can be thankful that he left us the answer to this question in the Symphony of Psalms and his other great works of religious devotion. The Psalms are, to Stravinsky's mind, poems of exaltation, but also of anger, judgement, and curses. Stravinsky conveyed the spirit of the poems masterfully through the highly contrasting tempi and harmonies.

I. Hear my prayer, O lord
This is the shortest of three movements and also, in my opinion, the easiest on the chorus. It was composed in a "state of religious and musical ebullience." Some of the notes in this section are just above the top of my range and so I and my tenor colleagues will often sing in a falsetto (false) voice. This is just a tricky way of working with your voice to produce notes outside your normal range. A male falsetto has much of the tonal quality of a female soprano voice - a very pure piercing sound. You can hear it when we sing the phrase remitte, remitte mihi (O spare me...) toward the end of the movement. Actually, truth be told, we do have some altos singing along with us there. (One of the difficulties with amateur choruses is that male voices are often in short supply and creative uses of available resources is necessary.)

You may, if you speak any Latin, notice that the final word of the first movement, ero, is split by the chorus taking a breath between the first syllable and the second. This is deliberate. in fact the last word of each movement is similarly split.

Stravinsky took a lot of flack for this practice. He answered his critics by saying that "in setting the words of this... hymn, I cared above all for the sounds of the syllables.... I really do tire of people pointing out that Dominum is one word and that its meaning is obscured by the way I respirate it.... Do such people know nothing about word-splitting in early polyphonic music?"

II. I waited patiently for the Lord
Patiently the psalmist waits for a new song of praise from the Lord. This movement is a fugue. A fugue is sort of like a round and, if you've ever sung Row, Row, Row Your Boat or Frere Jacques, you know what a round is. The first voice enters singing the melody or theme. After the first phrase has been completed the second voice enters with the same theme while the first voice continues presenting new material. Subsequent voices enter with the theme after the same pause. A fugue is somewhat more complicated. The first voice enters with the subject, a short melody, which is 'answered' by the second voice. The answer consists of the subject transposed up a fifth or, as here, down a fourth. In this piece Stravinsky also modifies the subject slightly for each voice. Still, if you listen carefully you will hear the phrase Expectans expectavi Dominum sung by the sopranos and then answered by the altos. The tenors then sing the subject, and the basses answer in the same phrase sung by the altos. I should point out here that the movement begins with another, completely different, subject introduced by the instrumental parts. Complex and difficult but also quite lovely.

Once again, the final word of the movement, Domino, is split by a breath between the first and second syllables.

III. Alleluia, Praise ye the Lord
This is the new canticle of praise which the Lord has put into the mouth of the psalmist in answer to his prayer in the second movement. Stravinsky began work on the Symphony of Psalms with the fast sections of this movement. He then went back and composed the first and second movements before returning to complete his work on this psalm. Once again his devil-may-care attitude is instructive: "...at first, and until I understood that God must not be praised in fast, forte, music no matter how often the text specifies 'loud,' I thought of the final hymn in a too-rapid pulsation." Instead the third movement begins with a slow and stately Alleluia which is followed by an exciting and stirring Laudate Dominum "inspired by a vision of Elijah's chariot climbing the Heavens." The triplets (three notes in a single beat) in this section suggest the horses galloping into the sky.

Singing Laudate is quite tricky. The au sound is a diphthong (two 'pure' vowel sounds articulated one after the other). In this case ah is quickly followed by oo - so quickly that you might not notice it while speaking. It's just a single syllable, that's all. Unfortunately you can't sing it the way you speak it. Suppose you have a half note to sing Lau followed by a whole note to sing da followed by a half note to sing te (as happens in this section). If you slide between the ah and the oo too quickly (say a quarter note each) the effect is unlovely. Instead you must hold the ah for the duration of the half note and then quickly hit the oo before beginning the da. Music is an illusion.

A confession: I made up the part about the photograph of Stravinsky riding his motorcycle through Paris. It's just such a lovely image that I couldn't resist sharing it with you. I hope you enjoy tonight's concert despite my deception.

Sources: Dialogues and a Diary, Igor Stravinsky and Robert Craft, Doubleday and Co., Inc., 1963

Scattered Thoughts