1.2 Pattern Matching

Pattern MAtching

Pattern matching in OCaml is an extremely powerful part of the programming language and, as a result of its essentialness, it appears frequently and with varying syntax. I try to be as complete as possible when covering varying syntactical details but there are sure to be things that I have left out. For the most part, what is covered here is what will be used going forward.

After figuring out exactly what pattern matching is and what it does, I would say that it is aptly named. Pattern matching is a type of program control based around matching "variables" to particular "patterns". For instance, let’s say the input to a function is an integer; we can match the argument to our function to see whether this integer is 1, 7, 42, or anything else, and then execute different code based on which pattern matched. Or let’s say that the variable that we want to match on is a list; we can match on this variable to see if the list has exactly 2 elements, at least 1 element, or is an empty list. We are matching these variables to “patterns” (The integer 7, the empty list, etc.)

I liken matching to a more powerful if/else chain of equalities. We will see this in the first example that follows. The following examples are very trivial and are only here as examples of the syntax and flow of matching.

Example 1:

We want a simple function that outputs true if its argument is the number 42 and false otherwise. (Recall that this function has the type int -> bool.) Let’s first write it in C, to give an idea of what it would look like in language without matching.

_Bool guess(int x)
{
if (x == 42)
return 1;
else
return 0;
}

Now let’s see the same function in OCaml. We could use conditionals, but we are going to use match statements.

Let’s break it down. The match statement takes arguments (we’ve passed in x, our argument) and then branches into a couple of possibilities. The first possibility listed is 42. If x matches 42, we take the arrow and output true. If x does not match with 42, we go to the next branch. The second possibility listed is _, the catch all, wildcard symbol. It will match with anything appropriate. In our function, if x does not match with 42, it will match with the catch all and our function will return false.

let guess : (int -> bool) = fun x ->
match x with
| 42 -> true
| _ -> false
;;

The top down logic of the match statement is important when working with the catch all. Had we written guess like this,

We would never have gotten to the second statement and our function would always output false.

let guess : (int -> bool) = fun x ->
match x with
| _ -> false
| 42 -> true
;;

Time for a little detour into some mucky OCaml details. I have been using the fun keyword to define a function and following it by listing the arguments the function takes. There is the function keyword in OCaml. The function keyword takes one argument and immediately matches on it, which means we could rewrite our guess function like this with the function keyword.

More information on the difference and use of the fun and function keywords can be found here thanks to our friends at Stack Overflow.

let guess : (int -> bool) = function
| 42 -> true
| _ -> false
;;

Awesome! We've matched on one variable, but matching is much more flexible than that. We can match multiple variables! Let's check out the second example.

Example 2:

Let’s define guess to take 2 arguments. We will output true if either argument is 42. First let’s see an example of how one might write a C program to do this.

_Bool guess(int x, int y)
{
if (x == 42 || y == 42)
return 1;
else
return false;
}

Now let's see it in OCaml. Once again, avoiding the use of conditionals.

Breaking it down, we match on the two arguments, x and y. We use commas to separate them. In the first branch, we have a catch all for x so it doesn't matter what x is, and we match y with 42. This will catch all instances of y being 42 and x being anything. The second branch attempts to match x with 42 and y with anything. The final branch will catch any other combination, as it catches any x and any y.

let guess2 : (int -> int -> bool) = fun x y ->
match x, y with
| _, 42 -> true
| 42, _ -> true
| _, _ -> false
;;

And matching extends beyond simply matching integer arguments to particular integers. Let’s take a look at lists in OCaml in our next example.

Example 3:

We want to take a list and output the head of the list (the first value) if the list is nonempty. If it is empty, we just want to output None. Defining the type of this function is a bit trickier than before.

let head : (‘a list -> ‘a option) = fun lst ->
match lst with
| hd :: _ -> Some hd
| [] -> None
;;

Let’s take a look at how simple this was. We simply matched on our list (lst). The first branch says, if our list (lst) matches the following pattern, a value (hd) attached to anything ( _ ), simply output the value (Some hd). We've basically broken apart our list using a match statement, defined a new variable (hd) and assigned it the value of the head of the list! In the other case, our list must be the empty list ([]), the second branch, in which case we can output None. Recall from above that we can match on multiple lists, a combination of lists and other variables, etc.

Let’s take a look at matching on a duple.

Example 4:

Let’s write the function add that will take a duple and simply add together the two values in it.

let add : (int * int -> int) = fun x ->
match x with
| (y, z) -> y + z
;;

That was simple, but we can make it even simpler, by matching in the argument definition like so:

let add : (int * int -> int) = fun (x, y) ->
x + y
;;

So we can match the input immediately in the declaration of the function. What else can we match? Let’s match options.

Example 5:

Let deoptionalize take an int option, and take off the annoying Some. If we get a None argument, we simply return 0 (this is inelegant, which is why we have the option feature in the first place^^).

let deoptionalize : (int option -> int) = fun x ->
match x with
| Some value -> value
| None -> 0
;;

Example 6:

We can match on the returned value of a function. Recall from before how guess is defined.

In this case, we’ve matched on the value of guess x.

let new_guess : (int -> int) = fun x ->
match guess x with
| true -> 1
| false -> 0
;;

Example 7:

Let’s a look at some more lists.

In this case, we don’t really care what the values of the first two things are, we simply care that they exist.

let at_least_2_in_list : (‘a list -> bool) = fun lst ->
match lst with
| _ :: _ :: _ -> true
| _ -> false
;;

Example 8:

We can add the first two values in an int list

let add_first_two : (int list -> int option) = fun lst ->
match lst with
| frst :: scnd :: _ -> Some frst + scnd
| _ -> None
;;

let add_first_two : (int list -> int option) = fun lst ->
match at_least_2_in_list lst, lst with
| true, frst :: sncd :: _ -> Some frst + sncd
| _, _ -> None
;;

Matching is crazily powerful.

The OCaml compiler will even tell you when you forget to match a particular case, inn order to be error free, and it will tell you when you don’t need to define a variable that you match and instead can use the catch all. Take a look.

OCaml will tell you that you’ve missed some possibilities. We can still do example_mistake 42, but doing example_mistake 12 will give us an error.

let example_mistake : (int -> bool) = fun x ->
match x with
| 42 -> true
| 1 -> false
;;

Warning: This pattern-matching is not exhaustive. Here is an example of a value that is not matched: 0

These are simple examples of some powerful qualities of pattern matching in a functional language. OCaml does pattern matching slightly differently than other functional languages, but individual differences aside, pattern matching is awesome. We'll use it in the next section implement recursion, the loops of functional programming.

To aim for completeness, I’ve included some more ways to write the match statement, such as omitting the | from the first match case. I include it because it looks nice.

This example demonstrates that we do not need the branch character for the first branch in a match statement. It also demonstrates syntax that doesn't include the function type or the assigning of an anonymous function, as I do in the examples above.

let example i =
match i with
1 -> "one"
| 2 -> "two"
| 3 -> "three"
| x -> “else”
;;

Previous: 1.1 Conditionals/Program Control

Next: 1.3 Recursion