Joy of OCaml - Part I

Many programming problems lend themselves easily to solutions based on Functional Programming languages. It is not hard to convince ourselves of this after coding a Language like OCaml or Haskell.

This article does not explain the basics of OCaml. Nor is it too advanced. The functions are kept as simple as possible. This is the learner’s perspective after all.

Dr Xavier Leroy was awarded the Milner Award in 2016 for achivements including OCaml.

Development environment

OPAM does not seem to install easily in Windows. As is my wont in such cases I started with Cygwin and after two days switched to a Ubuntu VM. I didn’t think I was gaining much by reporting Cygwin permission issues to owners of OPAM Windows installers.

The IDE is the venerable emacs. All diagrams are drawn using the Tex package, Tikz.

let keyword

let min_index a =

let b = Array.copy a in a;

Array.sort compare a;

let n = 0 in a.(n);

let i := ref(-1) in

let () = Array.iteri( fun x elt -> if a.(n) = elt then I := x else ()) 0 in
!i
;;

This function returns the index of the smallest element in an Array. It illustrates some of the various usages of let but this function uses imperative constructs and cannot be considered a real example of a OCaml function.

  1. let is used to define a function called min_index
  2. b holds a copy of the Array a before it is sorted because
  3. Array.sort does not return anything useful.
    #  Array.sort compare [|1,2|];;
    - : unit = ()
  4. let can also be used to define a variable
  5. let i :- ref(-1)
    i holds the value -1
  6. Since Array.iteri updates i,
    	let ()
    expects only unit. Obviously this is a side-effect and Functional paradigms are based on languages that generally do not mutate any state in a program. Read the next section.

Mutation

The hardest concept to fathom is side-effect or mutation. OCaml is mostly a functional language but It has imperative constructs too and mutable data structures which I have decided to gloss over as my intention is to highlight the functional programming paradigm. But an example of imperative code is given at the end.

The OCaml code shown below does not mutate any value or data structure. It creates a new List. That is the hallmark of functional code. No side-effects unless we intend to create it using imperative constructs.

let rec appendtolist l a =
  match l with
  |[] -> [a]
  |h :: t -> (h :: appendtolist t a)
;;
let insert l a b = 
  if List.mem_assoc a l
  then 
    let n = List.assoc a l in (a, (appendtolist n b))::(List.remove_assoc a l)
  else (a, (appendtolist [] b))::l
;;

Higher-order functions

Composable functions can be combined to create higher-order functions. Let us assume we want part of a list and the rest to be dropped. We want to take n elements and drop the rest.

These two functions take ‘n’ elements from a list and drop ‘n’ elements. This is not the Idiomatic OCaml style I have come across because the algorithmic complexity is off the scale as the length of the list is computed repeatedly.

But these two functions can be composed to form other higher-order functions that operate on the lists obtained.

let take n words =
  let rec loop i l1 = 
    if i = n
    then l1
    else
    if( n <= List.length words ) then
      loop (i + 1)  ( appendtolist l1 (List.nth words i) ) 
    else []
  in loop 0  []
;;
let drop n words =
  let rec loop i l1 = 
    if i >= List.length words
    then l1
    else
      loop (i + 1) (appendtolist l1 (List.nth words i))
  in loop n  []
;;

Let us assume we are working on lists of words to find out which word follows an n-gram. In this case we want to find out which word follows all sets of 2 words in a sentence. This is something like a Markov Chain.

image-title-here

We take 2 and drop the rest.

Now we slide the window to the right and drop the first element.

let slidewindow l x =
  match l with
  | h :: t -> t @ [x]
  | [] -> []
;;

image-title-here

We slide the window to the right and thereby get the following word. Our composable functions can be used to figure out a simple Markov Chain.

The Option type

This obviates the need to litter code with checks for the presence or absence of an expected result.

let store l =
 let rec loop count hash l1 = 
 match l1 with
 | h :: t -> Hashtbl.add hash h count; loop ( count + 1) hash t
 | [] -> hash 
 in loop 0 (Hashtbl.create 42) l
;;

‘Some value’ means that the value is found and ‘None’ means it isn’t.

let optional hash a =
 if Hashtbl.find hash a
   then Some a
 else
   None
;;

Fold

Let us consider the store function shown above. We fold the Hashtbl and accumulate the count before returning it at the end. Fold is the functional style of operation on data structures.

If the key matches a value we accumulate the count in accum.

let foldhashtbl  htbl   =
  Hashtbl.fold (fun k v accum -> (if (  k  = "a" ) 
                                  then
                                    ( accum + 1 )
                                  else 
                                    accum)) htbl 0
;;

Fold has some semantics that originates in the deep bowels of the functional paradigm but we print some values to understand that. Unlike a fold on List which can be left or right, a fold on Hashtbl seems straightforward.

let foldhashtbl  htbl   =
  Hashtbl.fold (fun k v accum -> (if (  k  = "a" ) 
                                  then
                                    ( Printf.printf "%3s %3d\n" k v ;accum + 1 )
                                  else 
                                    (  Printf.printf "%3s %3d\n" k v ;accum) )) htbl 0
;;

A rather contrived example of List.fold_left is

let issorted l  =
  match l with
  | [] -> true
  |  x::tl -> let (_,result) = List.fold_left
                  ( fun (old,result) cur -> 
                       if (result = true && 
		          (String.compare old cur = 0 || 
		           String.compare old cur = -1)) 
                       then 
                        (cur,true) 
                       else 
                        (cur,false) ) (x,true ) tl in
                      result
;;

This is the result. We move from the left storing the previous value in old and the current value in cur. The check result=true short-circuits the logic in this simple example.

#  issorted ["b";"c";"d";"a";"b"];;
  - : bool = false
  #   issorted ["b";"c";"d";"a"];;
  - : bool = false
  #   issorted ["b";"c";"d";"b"];;
  - : bool = false
  #  issorted ["b";"c";"d"];;
  - : bool = true

Imperative OCaml

The contrast here is between pure functional style of programming without mutating any state and the imperative features that operate based on side-effects.

This code checks if a List is sorted or not.

let is_list_sorted l =
let n  ref true in
  for I = 0 to List.length l -1 do
      if ( I + 1 <= List.length l -1 then
         If( (String.compare (List.nth l i) (List.nth l (I + 1)) == -1 ||
              String.compare (List.nth l i) (List.nth l ( I + 1 )) == 0 ) 
          then n := true
          else n := false )
  done;
!n
;;

Even though OCaml has such constructs an aspiring functional programmer should be cautioned. It is all too easy to forget that we are learning about functions and hark back to an earlier style we are used to.

Let us write a more idiomatic OCaml function that does the same thing.

let is_list_sorter1 l =
let rec loop l1 =
  match l1 with
  } a :: (b :: _ as t ) -> if ( String.compare a b = -1 || 
                                String.compare a b = 0 )
                           then loop t
                           else false;
  | _ :: [] -> true
  | [] => true
in loop l
;;

Djikstra’s shortest-path

So based on some of the functions defined above we try to find the shortest-path. This is from chapter 24. of Introduction to Algorithms by Cormen et al.

So, for example, I represent this graph

image-title-here

as

[ [0;0;0;3;5];

[0;0;6;0;0];

[0;6;0;7;0];

[3;0;7;0;6];

[5;0;0;6;0]; ]

A caveat is that this code is not as functional as I want it to be. There are far too many loops and imperative structures here.

let rec appendtolist l a =
  match l with
  |[] -> [a]
  |h :: t -> (h :: appendtolist t a)
;;
(*infinity is used to initialize. So floats are used. Don't see any problem *)
let estimates n = 
let rec loop n1 l = 
  match n1 with
  | n1 when n1 < n -> loop (n1 + 1) ( appendtolist l infinity)
  | n1 -> l
in loop 0 []
;;
let predecessor n = 
let rec loop n1 l = 
  match n1 with
  | n1 when n1 < n -> loop (n1 + 1) ( appendtolist l false )
  | n1 -> l
in loop 0 []
;;
let update l a b = 
 List.mapi( fun index value -> if index=a then b else value ) l

;;
(* This does not seem to be the right data structure. Use better OCaml *)
let creategraph =
[
[0;0;0;3;5];
[0;3;6;0;0];
[0;6;0;7;0];
[3;0;7;0;6];
[5;0;0;6;0];
]
;;
let mindistance est pred n=
let rec loop l l1 min index accum =
match l,l1 with
| h :: t,h1 :: t1 when (index < (n - 1) ) ->
       if ( (h1 = false) && (h <= min ))
       then
       loop t t1 h  (succ index) index
       else
       loop t t1  min (succ index) accum
|[e],[e1] ->
       if ( (e1 = false) && (e <= min ))
       then
       (e,accum)
       else
        (min,accum)
|[],[] ->  (min,accum) 
|_::_,_::_ ->   (min,accum)
|_::_,[] ->   (min,accum)
|[],_::_ ->   (min,accum)

in loop est pred infinity 0 0
;;
let rec find l x y = 
  ( List.nth (List.nth l x) y)
;;
let printlist l = 
 List.iter (Printf.printf "%f ") l
;;
let printpred l = 
 List.iter (Printf.printf "%B ") l
;;
let printdistances l =
 List.iteri( fun i x -> Printf.printf "\n%3d %3f\n" i x) l
;;
let updateestimates est1 pred1 y graph n =
let rec loop1 times1 est pred=
                       if times1 < n then ( 
                         if (( ( List.nth pred times1) = false ) &&
                           ((find graph y times1) <> 0) &&
                           ((List.nth est y) <> infinity) &&
                           ((( List.nth est  y ) +.  (float_of_int (find graph y times1))) <  ( List.nth est times1 )))
                         then
                         ( 

                           loop1  (times1 + 1)
                           (update est times1 (( List.nth est y) +. float_of_int(find graph y times1))) pred;
                         )
                         else 
                           ( loop1  (times1 + 1) est pred)
                          )
                        else
                       ( est) 
in loop1 0 est1 pred1
;;
let djikstra graph n n1=

 let rec loop times est pred accum  =

 if (accum <= (n * n1))
 then
   (if times < n 
      then (
  
                let (x,y) = mindistance est pred n in
                let pr = (update pred y true) in

                ( 
                          loop (times + 1) (updateestimates est pr y graph n1)) pr (succ accum) ; 
                )
       else 
   loop 0 est pred  (succ accum)
   )
 else
 (printdistances est;est)
 in loop 0 (update (estimates n) 0 (float_of_int 0)) (predecessor n1) 0
;;
let djikstratest =
             let graph =
                                 [[0; 4; 0; 0; 0; 0; 0; 8; 0];
                                  [4; 0; 8; 0; 0; 0; 0; 11; 0];
                                  [0; 8; 0; 7; 0; 4; 0; 0; 2];
                                  [0; 0; 7; 0; 9; 14; 0; 0; 0];
                                  [0; 0; 0; 9; 0; 10; 0; 0; 0];
                                  [0; 0; 4; 14; 10; 0; 2; 0; 0];
                                  [0; 0; 0; 0; 0; 2; 0; 1; 6];
                                  [8; 11; 0; 0; 0; 0; 1; 0; 7];
                                  [0; 0; 2; 0; 0; 0; 6; 7; 0]
                                 ] in
             djikstra graph 9 9
;;
Written on December 21, 2016