1.4 A Bit More OCaml Syntax

This guide may skip an extended explanation of some syntax and features that have been or will be used, but I hope to have included most of what is important. The following are notes on additional parts of OCaml's syntax, such as records, references, and user-defined types.

Library Functions

CS51 uses the Core library, which is meant as a fuller-featured alternative to the standard OCaml library. How do we open the Core library and use it?

If you've set up your OCaml environment as I have, via the commands found here, then your .ocamlinit file should already be set up. (.ocamlinit contains a series of commands that run whenever we start ocaml in the terminal.) In this case, we just need to open the library at the top of any of our .ml files in order to use it. (We are opening the Std module, which lives in the Core library. Newer releases of Core deprecate Core.Std; there, a plain open Core is used instead.)

open Core.Std;;

If we have not set up our .ocamlinit file, we should go ahead and do that, or check that it is correct if we are getting an error. If you would rather not, the toplevel needs a couple of directives before the open. (These # directives are understood by the ocaml toplevel, assuming findlib is installed; they are not part of compiled .ml source.)

#use "topfind";;
#require "core";;
open Core.Std;;

Opening these libraries gives us access to a number of modules written by smart people, and we should take advantage of that. For instance, the List module in the Core library contains a number of list functions, including fold and several variations, as does the List module in the standard library. Because we have opened Core.Std, the name List now refers to Core's version, whose functions differ somewhat from the standard library's. To call a function from a module, we write the module name, a period, and then the function name. For instance:

List.map (fun a -> a + 1) [1;2;3;4;5];;

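This says that we are calling the map function from the List module. (As written, the call matches the standard library's List.map; Core's version takes the function as a labeled argument, as in List.map ~f:(fun a -> a + 1) [1;2;3;4;5].) We can also call List.iter, List.fold_left, etc. this way.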
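To make the Module.function notation concrete, here is a minimal sketch that assumes the standard library's List module rather than Core's (Core's versions take labeled arguments, so the calls would look slightly different there). The names numbers, incremented, and total are only for illustration.

```ocaml
(* Standard-library List functions, called with the Module.function notation. *)
let numbers = [1; 2; 3; 4; 5]

(* List.map builds a new list by applying the function to each element. *)
let incremented = List.map (fun a -> a + 1) numbers

(* List.fold_left threads an accumulator through the list, left to right. *)
let total = List.fold_left (fun acc a -> acc + a) 0 numbers

(* List.iter runs a side-effecting function on each element and returns unit. *)
let () = List.iter (fun a -> Printf.printf "%d " a) incremented
let () = print_newline ()
```

Here incremented is [2; 3; 4; 5; 6] and total is 15.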

Modules

If the word module seems a bit unfamiliar, that's fine. Every function in OCaml lives in a module, even the ones we write: each .ml file defines a module of its own.

For instance, let's open a new file and save it as module_one.ml. Let's write a function in it that prints Hello! to the console every time we call it.
(As a note, print_hello takes the unit value () as its argument so that its body runs each time we call it, rather than once when the definition is evaluated. We'll touch on lazy evaluation later, which is a related concept, but this is a digression.)

let print_hello () = Printf.printf "Hello!\n";;

Great. Once compiled, this file becomes a module called Module_one (the file name with its first letter capitalized) that contains the function print_hello. We can now call this function from another file. Let's open a new file and save it as caller.ml.

Module_one.print_hello ();;
Module_one.print_hello ();;
Module_one.print_hello ();;

If we want to compile and run this:

ocamlopt -c module_one.ml
ocamlopt -c caller.ml
ocamlopt -o hello module_one.cmx caller.cmx

./hello

Hello!
Hello!
Hello!

Awesome! We created a module and used it in another file. If we want to avoid typing the module name each time we call the function, we can open the module in caller.ml, like so:

open Module_one;;

print_hello ();;
print_hello ();;
print_hello ();;

So now we see the relationship between the List module above and the module we just created. We don't typically open the List module, because its function names are common and often clash with other functions we want to call. However, if we want some functions from the Printf module, we can open it with open Printf, since those function names are fairly distinctive.
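A module does not have to live in its own file. As a sketch, the hypothetical Greeter module below is defined inline with module ... = struct ... end, and two forms of local open are shown as a middle ground between qualifying every name and opening the whole module. None of these names come from the course code; they are assumptions for illustration.

```ocaml
(* Greeter is a hypothetical module name used only for illustration. *)
module Greeter = struct
  let greeting = "Hello!"
  let greet name = greeting ^ " " ^ name
end

(* Qualified access, as with Module_one.print_hello. *)
let a = Greeter.greet "reader"

(* Local open: Greeter's names are in scope only inside the parentheses. *)
let b = Greeter.(greet "again")

(* let open ... in limits the open to a single expression. *)
let c =
  let open Greeter in
  greet "once more"
```

Local opens keep the convenience of short names without letting a module's names leak into the rest of the file.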

For more on modules and related issues like functors, here is a reference.

User Defined Types

We can define our own types in OCaml, so not everything has to be in list form. This is pretty convenient. There are two basic kinds of user-defined types, sum types and product types. The latter are commonly called records, which we'll cover first.

Records

Records are similar to the tuples that we saw earlier, with the key difference that each field is now labeled with a name and accessible by that name. Let's first take a look at the general syntax for declaring them using the type keyword.

type <record-name> = { <field> : <type> ; <field> : <type> ; ... };;

Here is an example of declaring a record that stores the coordinates of a point in xyz space.

type xyzvalue = {x:float; y:float; z:float};;

type xyzvalue = { x : float; y : float; z : float; }

Now we can use this type to store some data points.

let point1 = {x = 4.1; y = 2.2; z = 1.1};;

val point1 : xyzvalue = {x = 4.1; y = 2.2; z = 1.1}

As we can see, OCaml has inferred that the value we just declared has type xyzvalue.

So we've seen that we can declare these records. Now let's see how we can access their individual fields. We can use dot notation:

let get_x_value : xyzvalue -> float = fun xyz ->
  xyz.x
;;

get_x_value point1;;

- : float = 4.1

We can also pattern match on these records.

let add_xyz : xyzvalue -> float = fun {x = x_val; y = y_val; z = z_val} ->
  x_val +. y_val +. z_val
;;

add_xyz point1;;

- : float = 7.4

So records are tuples whose fields have names. This is pretty cool. Let's talk about sum types next.
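Two more record conveniences are worth a quick sketch: building a modified copy with the with keyword, and "punning" in patterns, where {x; y; z} binds variables named after the fields. The scale function and the values point2 and doubled are hypothetical names introduced for this example.

```ocaml
type xyzvalue = { x : float; y : float; z : float }

let point1 = { x = 4.1; y = 2.2; z = 1.1 }

(* Functional update: copy point1, replacing only the x field. *)
let point2 = { point1 with x = 0.0 }

(* Field punning: the pattern {x; y; z} binds variables named after the fields. *)
let scale : float -> xyzvalue -> xyzvalue = fun k { x; y; z } ->
  { x = k *. x; y = k *. y; z = k *. z }

let doubled = scale 2.0 point2
```

Note that point1 itself is unchanged by the update; records are immutable unless a field is explicitly declared mutable.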

For more information on records, look here and here.

Algebraic Datatypes

Product types are nice because their fields are named, but OCaml lets us define entirely new types, not just name the fields of tuples. The alternative to records is the sum type, also called a variant. (Strictly speaking, "algebraic datatype" covers both sum and product types.) To give an example, let's define a type for the day of the week. Once again, we use the type keyword:

type day = Sun | Mon | Tue | Wed | Thu | Fri | Sat;;

type day = Sun | Mon | Tue | Wed | Thu | Fri | Sat

In this example, a value of type day is exactly one of these seven constructors.

let day1 = Sun;;

val day1 : day = Sun

Once again, OCaml infers the type for us. Let's see how we would use this in a function. (Hmm, does the use of | look familiar?)

let int_of_day : day -> int = fun a ->
  match a with
  | Sun -> 1
  | Mon -> 2
  | Tue -> 3
  | Wed -> 4
  | Thu -> 5
  | Fri -> 6
  | Sat -> 7
;;

int_of_day day1;;

- : int = 1

So we can match on these types, which is important, because matching is awesome, as we've covered.

As we've seen, this is an "or-ing" together of alternatives. A day is either Sun or Mon or Thu, never a combination of them at once. We can also attach data to each alternative using the of keyword. For instance, if we want a set of playing cards:

type playing_card = Spades of int | Clubs of int | Diamonds of int | Hearts of int;;

type playing_card =
    Spades of int
  | Clubs of int
  | Diamonds of int
  | Hearts of int

let card1 = Spades 4;;

val card1 : playing_card = Spades 4

let you_win : playing_card -> bool = fun a ->
  match a with
  | Spades 7 -> true
  | _ -> false
;;

you_win card1;;

- : bool = false

Note that when we construct or match a value of this type, we don't include the of keyword. We only use of when declaring the type. Constructors can carry many kinds of data, such as tuples:

type xyzpoint = Point of int * int * int;;

type xyzpoint = Point of int * int * int

let point2 = Point (1,2,3);;

val point2 : xyzpoint = Point (1, 2, 3)
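The you_win example matches on a specific payload (Spades 7). Patterns can also bind the payload to a variable so we can use it. The sketch below redeclares the two types from above so it stands alone; rank and sum_coords are hypothetical helper names.

```ocaml
type playing_card = Spades of int | Clubs of int | Diamonds of int | Hearts of int
type xyzpoint = Point of int * int * int

(* Or-patterns let several constructors share one branch; n binds the payload. *)
let rank : playing_card -> int = fun card ->
  match card with
  | Spades n | Clubs n | Diamonds n | Hearts n -> n

(* A single-constructor type can be destructured directly in the argument. *)
let sum_coords : xyzpoint -> int = fun (Point (x, y, z)) -> x + y + z
```

With these definitions, rank (Spades 4) evaluates to 4 and sum_coords (Point (1, 2, 3)) evaluates to 6.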

Defining our own types gives us some powerful tools. We can even define some recursive data structures like linked lists and trees. We'll see more examples of this later when we delve into data structures.
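As a small preview of a recursive data structure, here is a sketch of a binary search tree of ints. The type int_tree and the functions insert and to_list are hypothetical names for this illustration, not the definitions used later in the guide.

```ocaml
(* int_tree is a hypothetical type, introduced only for this sketch. *)
type int_tree =
  | Leaf
  | Node of int_tree * int * int_tree

(* Insert v, keeping smaller values on the left; duplicates are ignored. *)
let rec insert : int -> int_tree -> int_tree = fun v tree ->
  match tree with
  | Leaf -> Node (Leaf, v, Leaf)
  | Node (left, x, right) ->
    if v < x then Node (insert v left, x, right)
    else if v > x then Node (left, x, insert v right)
    else tree

(* In-order traversal: left subtree, then the node's value, then right subtree. *)
let rec to_list : int_tree -> int list = fun tree ->
  match tree with
  | Leaf -> []
  | Node (left, x, right) -> to_list left @ (x :: to_list right)

let tree = List.fold_left (fun t v -> insert v t) Leaf [3; 1; 4; 1; 5]
```

The type refers to itself in the Node constructor, which is what makes it recursive; to_list tree gives [1; 3; 4; 5].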

For more information on both Sum and Product types of User Defined Types (as well as the explanation as to their names), check out this handy explanation. We'll encounter the concept of monads later, which is a related concept that'll take advantage of user defined types.

References

While we've been covering a lot of pure functional programming, mutable state is entirely possible in OCaml using references. They are roughly the pointers of OCaml. Arrays can also change state, but we won't dive into arrays or mutable record fields, which are detailed here and here. We just briefly cover references here. We use the ref function to create a reference and the bang (!) to dereference it.

let example_ref = ref 3;;

val example_ref : int ref = {contents = 3}

We can change the value in a reference with the := operator. This operation has a return type of unit, since there is no meaningful return value. We just change the state.

example_ref := 4;;

- : unit = ()

As we can see from the following, we have changed the value stored in the reference.

Printf.printf "%d\n" !example_ref;;

4
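To show references doing a little more work, here is a sketch of a counter and an imperative sum. The names counter, next_id, and sum_list are hypothetical; incr is the standard library's shortcut for r := !r + 1.

```ocaml
(* A counter kept in a reference; incr is the standard shortcut for r := !r + 1. *)
let counter = ref 0

(* Each call bumps the counter and returns its new value. *)
let next_id : unit -> int = fun () ->
  incr counter;
  !counter

let first = next_id ()
let second = next_id ()

(* Summing a list imperatively with a reference and List.iter. *)
let sum_list : int list -> int = fun lst ->
  let total = ref 0 in
  List.iter (fun a -> total := !total + a) lst;
  !total
```

Calling next_id twice returns 1 and then 2, so unlike the pure functions we have written so far, the same call can give different results.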

We don't dive much into mutable state, not only because it's not very widely used, but also because we focus on functional programming for now. There are a ton of additional resources on mutable state in OCaml. I've listed a couple: here, here, here, here, and here.

Let Scoping

Cool, we've already seen that we can use let to define local variables inside functions. For instance, we can let-bind some values in the body of a function.

let add4 : int -> int = fun a ->
  let x = 4 in
  a + x
;;

let add4 : int -> int = fun a ->
  let x = 2 in
  let y = 2 + x in
  a + y
;;

let add4 : int -> int = fun a ->
  let x = 1 in
  let y = x * 2 in
  let z = y * 2 in
  a + z
;;

We can define local functions as well:

let list_add4 : int list -> int list = fun lst ->
  let add4 = fun a ->
    let x = 1 in
    let y = x * 2 in
    let z = y * 2 in
    a + z
  in
  List.map add4 lst
;;

We can even reuse variable names. When variables of the same name have different scopes, a reference resolves to the innermost binding in scope; the inner binding is said to shadow the outer one. For instance:

let x = 5;;

let add4 : int -> int = fun a ->
  let x = 4 in
  a + x
;;

add4 12;;

- : int = 16

12 + x;;

- : int = 17

In this example, add4 uses the x with the smallest scope, namely the one defined inside the function. When we refer to x outside of this function, that inner x is no longer in scope and we get the global x = 5 instead. This applies to functions as well.
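Here is a sketch of the same scoping rule applied to functions, plus one related consequence: a function keeps using the binding that was in scope when it was defined, even if the name is rebound later. The names double, quadruple, and add_x are hypothetical.

```ocaml
let double = fun a -> a * 2

(* Inside quadruple, the local double shadows the top-level one. *)
let quadruple : int -> int = fun a ->
  let double = fun b -> b * 4 in
  double a

(* Rebinding a name does not change functions that already captured the old value. *)
let x = 5
let add_x = fun a -> a + x
let x = 100
```

After these definitions, double 3 is still 6 at the top level, quadruple 3 is 12, and add_x 1 is 6 even though x now refers to 100.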

Flags

When we define functions, we can name the arguments by putting ~ before the argument name. Let's see an example. (Anonymous functions can take named arguments too, as in fun ~input -> input + 4.)

let add4 ~input =
  input + 4
;;

val add4 : input:int -> int = <fun>

To use this function, we do the following:

add4 ~input:8;;

- : int = 12

Named arguments are used quite frequently, in part because they make calls to library functions clearer and because we no longer have to keep the arguments in the correct order as long as we name them. For instance, the Core library's List.fold_left function expects its arguments to be named (~init and ~f).
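The order independence is easiest to see with a function where order matters. The sketch below uses a hypothetical divide function, and then the standard library's ListLabels module, which labels its arguments in a way similar in spirit to Core's List (the exact Core signatures are not reproduced here).

```ocaml
(* Labeled arguments can be passed in any order. *)
let divide ~numerator ~denominator = numerator / denominator

let a = divide ~numerator:12 ~denominator:4
let b = divide ~denominator:4 ~numerator:12

(* Punning: a variable with the same name as the label can be passed as ~name. *)
let numerator = 20
let c = divide ~numerator ~denominator:5

(* The standard library's ListLabels module labels its function arguments. *)
let total = ListLabels.fold_left ~f:(fun acc v -> acc + v) ~init:0 [1; 2; 3]
```

Both a and b are 3, c is 4, and total is 6.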

Optional Arguments

We can make an argument optional by prefixing it with ?. Inside the function, the argument then arrives as an option value. Optional arguments are also named.

let welcome ?greeting_opt name =
  let greeting =
    match greeting_opt with
    | Some greeting -> greeting
    | None -> "Hi"
  in
  Printf.printf "%s %s\n" greeting name
;;

val welcome : ?greeting_opt:string -> string -> unit = <fun>

welcome ~greeting_opt:"Hey" "reader" ;;

Hey reader

welcome ?greeting_opt:None "reader";;

Hi reader

welcome "Reader";;

Hi Reader

For these optional arguments, we can provide default values, in which case we no longer have to handle the None case ourselves; that is done for us.

let welcome ?(greeting="Hi") name =
  Printf.printf "%s %s\n" greeting name
;;

welcome "reader";;

Hi reader
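Defaults become more useful when a function has several optional arguments. As a sketch, the hypothetical range helper below has two of them; it assumes step is positive. Note the final positional argument stop: an optional argument needs a non-optional one after it so OCaml can tell when it has been left out.

```ocaml
(* range is a hypothetical helper: integers from start (default 0) up to,
   but not including, stop, in steps of step (default 1, assumed positive). *)
let rec range ?(start = 0) ?(step = 1) stop =
  if start >= stop then []
  else start :: range ~start:(start + step) ~step stop

let a = range 4
let b = range ~start:2 6
let c = range ~step:3 10
```

Here a is [0; 1; 2; 3], b is [2; 3; 4; 5], and c is [0; 3; 6; 9]. When we do supply an optional argument, we pass it with ~, just like a named argument.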

This last series of examples is taken from Mads Hartmann's OCaml Briefly, which has more detail on OCaml syntax. There are plenty of additional places to get more familiar with OCaml, but this completes the first section, which tries to touch on many of the key ideas (values, functions, currying, higher-order functions, types, lists, recursion, etc.) needed to explore cool functional programming ideas. Next up are some example questions designed to either review or introduce some variations on these topics.
