OCaml Basics
In this chapter, we'll cover some of the basic syntax in OCaml. The goal is that, by the end, you feel comfortable with OCaml syntax, simple OCaml expressions, and data types, and are able to solve simple problems.
We are going to start off with the quintessential Hello World program, as a quick means of doing an overview of the process of creating, writing, compiling, and executing an OCaml program.
Next we'll take a look at what an OCaml expression is. All programs in OCaml are made up of expressions.
After that, we'll look at some primitive types, like ints, strings, and floats. We'll also take a look at some more interesting types, like those that you can declare yourself.
And then we'll look at conditionals and various ways to control our programs.
Finally, there will be some examples demonstrating some simple problems that can be easily solved with the tools in this chapter. Along the way we'll cover a bunch of related or necessary topics that aren't worth naming here.
emacs hello.ml
Printf.printf "Hello world!"
To compile and execute our program, we type the following two commands. The first compiles our code into an executable (the -o flag names the output file, here hello). The second runs the executable hello, which prints Hello world!
ocamlc -o hello hello.ml
./hello
Hello world!
Awesome! You know how to program in OCaml! Done! Simple!
The previous program consisted of a single line/expression, which prints a string to the terminal. To have a program that consists of more than one expression, we need to separate the expressions. We can't simply separate printf statements with newlines. For instance, the following code fails to compile.
Printf.printf "Hello world!"
Printf.printf "Hello world!"
ocamlc -o hello2 hello2.ml
Error: This function is applied to too many arguments;
maybe you forgot a `;'
So we need to separate the expressions. In C, we would terminate each statement with a single semicolon (;). In OCaml, we separate top-level expressions with a double semicolon (;;). We use the double semicolon because in old parsers all tokens needed to be distinct, and the single semicolon, as we'll see later, is already used to separate elements in lists and to sequence expressions. Thus the following code will compile and run.
Printf.printf "Hello world!";;
Printf.printf "Hello world!";;
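As a preview of the single semicolon mentioned above, here is a minimal sketch of the same two prints written as one top-level binding. The name greeting is only for illustration, and the let syntax it relies on is covered further down. Inside let () = ..., a single semicolon sequences the two prints, so the file needs no ;; at all.

```ocaml
(* greeting is a hypothetical name used only for this sketch. *)
let greeting = "Hello world!\n"

(* The single semicolon runs the first print, discards its unit
   result, and then runs the second print. *)
let () =
  Printf.printf "%s" greeting;
  Printf.printf "%s" greeting
```

This style is common in compiled source files, while ;; is mostly needed when typing into the REPL.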
Rather than having to compile and run the program in the terminal after each change, we can use a REPL to test our code faster. In emacs (based on the first set up), we can start the REPL with the shortcut C-c C-s (Ctrl+c Ctrl+s). To evaluate individual lines of OCaml, we can press C-x C-e (C-c C-e also works on my set up for some reason *shrug*). To evaluate the entire file (or at least a buffer), we can use C-c C-b.
When we evaluate the first printf expression, our REPL will output something along the following lines.
- : unit = ()
Unit is a type in OCaml that essentially indicates that the function returned nothing useful. It plays the role that void plays in C, except that unit has exactly one value, written ().
Let's dive into more types in OCaml.
Data Types
Just as there are a bunch of different data types in C (int, char, long, long long, etc.), there are a bunch of different data types in OCaml. Let's cover some of the basics.
int: integers are what we expect--whole numbers. OCaml's native int is 31 bits wide on 32-bit platforms and 63 bits wide on 64-bit platforms. Let's try running the following through the REPL.
4;;
- : int = 4
So our REPL helpfully deduces the type of the expression (int) and gives us the value of it (4), if possible.
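To check the integer width on your own machine, here is a small sketch that uses Sys.int_size, max_int and min_int from the standard library. The name bits is just for illustration, and the printed numbers depend on the platform, so none are shown here.

```ocaml
(* Sys.int_size is the number of bits in a native int:
   31 on 32-bit platforms, 63 on 64-bit platforms. *)
let bits = Sys.int_size

let () =
  Printf.printf "int is %d bits wide\n" bits;
  Printf.printf "largest int: %d\n" max_int;
  (* Integer arithmetic wraps around silently on overflow,
     so this prints the same number as min_int. *)
  Printf.printf "max_int + 1 = %d\n" (max_int + 1)
```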
float: floating point values can be written in two ways: with a trailing decimal point (4.) or in exponent notation (4e0). Some people recommend the exponent form for whole values, arguing that it improves the readability of code.
4.;;
- : float = 4.
4e0;;
- : float = 4.
4e1;;
- : float = 40.
char: characters are distinct from strings. They are written with single quotes (') rather than the double quotes (") used for strings.
'a';;
- : char = 'a'
'\n';;
- : char = '\n'
string: double quotes this time.
"This is a string\n";;
- : string = "This is a string\n"
bool: there are two possible boolean values: true and false.
true;;
- : bool = true
unit: The unit type is used for expressions that return no meaningful value. OCaml uses this type to help type check such expressions. For instance, the printf statement that we saw earlier does not return a useful value, so OCaml uses the unit type to denote this.
();;
- : unit = ()
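As a small illustration, the sketch below binds the result of a print to a name (result is a made-up name, and let is covered further down). The standard library function print_string has type string -> unit, so the only thing it can hand back is ().

```ocaml
(* print_string is called for its side effect; its return value
   is always the single unit value (). *)
let result : unit = print_string "side effect only\n"

(* Every unit value is the same value, so this comparison holds. *)
let is_unit = (result = ())
```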
Building on the Basic Types
Tuples: OCaml allows us to group several values, possibly of different types, into a single value. For instance, if we want to group together the number 10, the string "hello", and the bool true, we can do so by putting the values in parentheses and separating them with commas.
(10, "hello", true);;
- : int * string * bool = (10, "hello", true)
OCaml has inferred the type of this tuple to be int * string * bool. Indeed, we can create a wide assortment of tuples. We can create tuples with other tuples (or any other type we'll cover) inside them. Consider the following tuple that contains other tuples:
(("hello", "world"), (10, true), 100);;
- : (string * string) * (int * bool) * int = (("hello", "world"), (10, true), 100)
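Once we have a tuple, we usually want its parts back. A minimal sketch, using let (covered further down) and made-up names: the standard functions fst and snd work on pairs, and larger tuples can be taken apart by writing a matching pattern on the left of the let.

```ocaml
let pair = ("hello", 10)

(* fst and snd only work on two-element tuples. *)
let first = fst pair     (* "hello" *)
let second = snd pair    (* 10 *)

(* A three-element tuple is destructured with a pattern instead. *)
let (n, s, b) = (10, "hello", true)
```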
list: perhaps the most important type in OCaml is the list type. Lists are arbitrarily long sequences of values of the same data type. For instance, we can have an int list, which is a list of any number of ints. The syntax is square brackets with values separated by semicolons.
[1;2;3;4;5;6];;
- : int list = [1; 2; 3; 4; 5; 6]
The empty list is simply square brackets with nothing inside ([]).
[];;
- : 'a list = []
There are two interesting parts of the REPL's output. The first interesting part is the 'a type. The 'a type is a type variable that acts as a wildcard, meaning that this list could be any type of list (int list, bool list, etc.). We'll see the 'a type again when we cover polymorphism, an important part of OCaml and, more generally, type theory.
The second interesting part is that all lists must have a type. A list cannot just be a list. A list must be an int list or bool list or string list, etc. This means that we can't mix types in our list. For instance we can't have a list with an int and a string:
[1; "error"];;
Error: This expression has type string but an expression was expected of type int
Lists can also be manipulated with special functions, which we'll cover in more depth later. One of the common operators is the cons operator, or double colon (::), which adds a value to the front of a list. Consider the following expressions to be equivalent (the parentheses indicate which cons is done first).
1::2::[] = 1::(2::[]) = 1::[2] = [1;2]
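The chain of equalities above is notation rather than something to type in directly. A minimal sketch that checks it in OCaml, with names a through d chosen only for illustration, could look like this:

```ocaml
let a = 1 :: 2 :: []
let b = 1 :: (2 :: [])
let c = 1 :: [2]
let d = [1; 2]

(* Structural equality (=) compares the lists element by element. *)
let all_equal = (a = b) && (b = c) && (c = d)

let () = Printf.printf "%b\n" all_equal   (* prints true *)
```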
We can have lists of any type, including other lists.
[[1];[1;2]];;
- : int list list = [[1]; [1; 2]]
array: The array type acts very similarly to the lists of Perl. Getting the length of an array is O(1), whereas getting the length of a list in OCaml is O(n). We will be using lists more often than we will be using arrays. The syntax for defining arrays is specially marked square brackets ([||]). Arrays are mutable, while lists are not, so arrays are viewed a bit suspiciously.
[|1|];;
- : int array = [|1|]
[[|1|];[|2;3|]];;
- : int array list = [[|1|]; [|2; 3|]]
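To make the difference between arrays and lists concrete, here is a short sketch using Array.length, List.length and the arr.(i) <- v update syntax from the standard library. The names arr, lst and longer are illustrative only.

```ocaml
let arr = [|1; 2; 3|]
let lst = [1; 2; 3]

(* Array.length is O(1); List.length walks the whole list, O(n). *)
let arr_len = Array.length arr
let lst_len = List.length lst

(* Arrays can be updated in place; arr is now [|10; 2; 3|]. *)
let () = arr.(0) <- 10

(* Lists cannot be updated. Consing builds a new list and
   leaves lst exactly as it was. *)
let longer = 0 :: lst
```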
option: Sometimes in OCaml we'll need to use the option type, which allows us to indicate whether a function returns a value or the lack of a value. We'll cover this a little more later on. Some indicates that there is a value. None indicates that there isn't. Options are also bound to specific types.
Some 4;;
- : int option = Some 4
None;;
- : 'a option = None
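As a rough idea of where options come in handy, here is a sketch of a hypothetical helper, safe_div, that returns None instead of failing when asked to divide by zero. It uses function definitions and if, both of which are covered later, so treat it as a preview rather than something to memorize.

```ocaml
(* Hypothetical helper: integer division that returns None
   instead of raising Division_by_zero. *)
let safe_div a b =
  if b = 0 then None else Some (a / b)

let ok = safe_div 10 2     (* Some 5 *)
let bad = safe_div 10 0    (* None *)
```

The type of safe_div is int -> int -> int option, so callers can see from the type alone that a result might be missing.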
ref: Since OCaml is typically used as a functional language, its values are immutable by default, meaning that they can't be changed. I covered the immutable aspect of OCaml here. However, references break that rule. We create them by using the ref keyword (it's actually a function, but whatever). We can dereference refs with the exclamation point (!) operator (again, also a function, but bear with me).
ref 1;;
- : int ref = {contents = 1}
!(ref 1) + 1;;
- : int = 2
In the second example, we have created a ref to the int 1, dereferenced it to get 1 and added it to another 1 to get 2. We can't add an int ref to an int. We'll cover functions such as adding further down.
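The examples above only create and read a ref. To actually change what a ref holds, the standard library provides the := operator. A minimal sketch, with counter as a made-up name:

```ocaml
let counter = ref 0

(* := stores a new value in the ref; ! reads the current one. *)
let () = counter := !counter + 1
let () = counter := !counter + 1

let current = !counter   (* 2 *)
```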
Records: Records are the equivalent of structs in C. We need to define them before we can use them. We'll cover the syntax for user defined types later, but here is an awesome reference if you want to check them out beforehand.
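For a quick taste before the full treatment, here is a sketch of a hypothetical record type called point. The type name and field names are invented for this example.

```ocaml
(* Hypothetical record type, roughly like a C struct with two fields. *)
type point = { x : int; y : int }

(* A record value lists every field with its value. *)
let p = { x = 3; y = 4 }

(* Fields are read with dot notation. *)
let sum = p.x + p.y   (* 7 *)
```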
Objects: We'll cover objects more extensively when we get to Object Oriented Programming (OOP).
Let
Having seen these very basic types, what can we do with them?
Well, one thing that we commonly do in other languages is to create variables. In OCaml, we can create static variables, ones that don't change. Another way to look at these is that we are simply giving another name to particular values. For instance, if we want to assign the variable x the value of 4, we can do it with the let syntax.
let x = 4;;
val x : int = 4
The REPL has deduced the type of x (int) and indicates that the value assigned to x, an int, is 4.
The following let is assigning a name to a list.
let lst = [4;4;4;4];;
val lst : int list = [4; 4; 4; 4]
Feel free to play around with assignments. There is an alternative syntax that tells the OCaml compiler exactly what type we want our variables to be. Consider:
let y : int = 4;;
val y : int = 4
let (z : int) = 4;;
val z : int = 4
However, if we indicate the incorrect type in our syntax, the compiler will yell at us. Consider:
let (wrong : float) = 4;;
Error: This expression has type int but an expression was expected of type float
The above error has to do with OCaml being strictly typed. Unlike in C, where the compiler will assume that the coder knows what they are doing and will truncate ints or let you assign ints to floats, OCaml does not allow this; to OCaml it is akin to assigning an int to a string. While this appears unintuitive at first, we'll come to understand strict typing, develop the logic to use it, and soon stop noticing it.
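If we want to reassign a value to a variable, we simply use another let statement with the same name. Strictly speaking, the new binding shadows the original one rather than modifying it: code that already used the old value keeps seeing it.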
Finally, let statements are how we define functions. Because functions are considered to be values and can be passed around, it makes sense that the same syntax be used to assign values to variables or functions.
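To see what a second let with the same name really does, consider this sketch. It uses the short function syntax introduced in the next section, and the names add_x and still_old are invented for the example.

```ocaml
let x = 4

(* add_x captures the x that exists right now (4). *)
let add_x n = n + x

(* This introduces a new x; it does not modify the old one. *)
let x = 10

(* add_x still uses the x it was defined with. *)
let still_old = add_x 1   (* 5, not 11 *)
```

So a repeated let gives a new meaning to the name from that point on, but never changes a value that earlier code already holds.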
Functions
OCaml comes with a bunch of predefined functions and operations on values. For instance, we can use OCaml as a calculator (meaning we can add values together). Let's take a look at how adding works in OCaml.
4 + 4;;
- : int = 8
So the plus (+) infix operator is a function/operation on two ints that returns an int. So let's try and define our own addition function. We can use the fun keyword.
fun a b -> a + b;;
- : int -> int -> int = <fun>
Lots of interesting new details are revealed about OCaml. The first is that we name the arguments to our function by simply listing them after the keyword fun. Here we've named the arguments to the function a and b. We define the operations (+) on these arguments after the -> syntax. Next, the REPL returns us the function type. It says that our function is a function that takes in an int (the first int), another int (the middle int), and outputs an int (the last int).
After defining this function, it seems difficult to call our function. That's because we've defined it anonymously (without a name, more here). We can use the let statement to attach a function with a name. Let's call it add.
let add = fun a b -> a + b;;
val add : int -> int -> int = <fun>
Now we can use this function. Unfortunately, we can't use our new function like we would use the plus (+) infix operator. We can't do 2 add 2. We have to list the arguments to our function after the function name. For instance:
add 2 2;;
- : int = 4
We can assign a name to the return value of our function using the let keyword.
let result = add 4 5;;
val result : int = 9
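The arrows in int -> int -> int hint at something we haven't used yet: we are allowed to give add only its first argument. The sketch below, with add_two as a made-up name, shows that doing so produces a new function of type int -> int that is still waiting for the second int.

```ocaml
let add = fun a b -> a + b

(* Supplying only the first argument yields a function int -> int. *)
let add_two = add 2

let four = add_two 2    (* same result as add 2 2 *)
let nine = add 4 5
```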
Just as we can tell the compiler what type a variable is, we can do the same with functions. For instance, with our add function we can do the following:
let add : int -> int -> int = fun a b -> a + b;;
val add : int -> int -> int = <fun>
Compare this way of defining a function to a way of defining anonymous functions in JavaScript, shown below. (You can program functionally in JavaScript!)
var sum = function (a, b) {return a + b};
While I prefer this method of naming an anonymous function and including the function type, there are a couple of more widely used methods of defining functions. The first way doesn't involve the fun keyword. We simply include the name of the function and immediately follow with the arguments. We also do not use the -> syntax.
let add a b = a + b;;
val add : int -> int -> int = <fun>
We can also assign types to the arguments.
let add (a : int) (b : int) = a + b;;
val add : int -> int -> int = <fun>
Being explicit about types can help readability and can check for errors. So while the compiler deduces the function type automatically, it helps to understand what the function does without relying on the compiler.
Since I mentioned that the add operator (+) is also a function, there are a couple of interesting things we can do with it. We can ask OCaml about it, such as getting its function type by doing the following:
(+);;
- : int -> int -> int = <fun>
This tells us that (+) is a function that takes two ints and outputs an int.
We can also pass in arguments to (+) by listing them after, like so:
(+) 1 2;;
- : int = 3
Strictly Typed
Functions in OCaml are strictly typed, which means that we can't pass values of any other type into these functions. For instance, when we asked about the function type of (+), it said that it was int -> int -> int, meaning that it only takes ints. That means we can't use (+) to add floats, because they are a different type. We also can't add a float to an int using (+). We get some major errors.
10 + 4e0;;
Error: This expression has type float but an expression was expected of type int
4e0 + 4e0;;
Error: This expression has type float but an expression was expected of type int
To add floats, we have to use a function that takes two floats.
(+.);;
- : float -> float -> float = <fun>
4e0 +. 3e0;;
- : float = 7.
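The other arithmetic operators follow the same pattern: the float versions are the int operators with a dot appended. A short sketch, with names chosen only for illustration:

```ocaml
let sum = 4e0 +. 3e0          (* 7. *)
let difference = 4e0 -. 3e0   (* 1. *)
let product = 4e0 *. 3e0      (* 12. *)
let quotient = 3e0 /. 4e0     (* 0.75 *)
```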
To add an int to a float, we have to convert the int to a float or the float to an int. int_of_float takes a float and converts it to an int.
int_of_float;;
- : float -> int = <fun>
4 + int_of_float 4e0;;
- : int = 8
The above is evaluated like the following (function application binds more tightly than infix operators, so int_of_float 4e0 is evaluated before the addition)
4 + (int_of_float 4e0);;
- : int = 8
So what did we do? Well because (+) takes two ints, and we have an int and a float, we need to convert the float to an int. We do this with a function that takes a float and outputs an int. In this way, we can use function types to guide us when we need to write them.
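The conversion can go in either direction. As a sketch, the standard library also has float_of_int, which lets us keep the result as a float and use (+.) instead. The names as_int, as_float and truncated are made up for this example.

```ocaml
(* Convert the float down to an int and use (+) ... *)
let as_int = 4 + int_of_float 4e0        (* 8 *)

(* ... or convert the int up to a float and use (+.). *)
let as_float = float_of_int 4 +. 4e0     (* 8. *)

(* int_of_float truncates toward zero, so the fraction is lost. *)
let truncated = int_of_float 3.9         (* 3 *)
```

Which direction to convert depends on whether losing the fractional part is acceptable for the problem at hand.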
Finally, we can use the let keyword to define variables inside functions and expressions. Just as we might have a local variable in C, we can define local variables using the form let name = value in expression. Let's see an example of this. We'll see more later.
let locally_scoped_variables =
  let x = 5 in
  let y = 10 in
  let z = 15 in
  x + y + z
;;
val locally_scoped_variables : int = 30
let example num =
  let local_function x =
    x + 5
  in
  num + local_function num
;;
val example : int -> int = <fun>
example 10;;
- : int = 25
Here is Cornell's Introduction to OCaml Syntax. It has a list of common operations on the basic types and has more information on let bindings.
Here are some useful functions that are in the standard library.
First class functions are a defining feature of many functional languages. I've written a bit on first class functions here.