Segfault Systems

Securing the Foundations of Scalable Systems


Adapting the OCaml Ecosystem for Multicore OCaml | Segfault Systems

Adapting the OCaml Ecosystem for Multicore OCaml

Sudha Parimala
August 25, 2021

Multicore OCaml is an extension of OCaml with support for Shared-Memory Parallelism and Concurrency. OCaml 5.0 aims to ship with support for Shared-Memory Parallelism. Upstreaming Multicore bits to OCaml trunk is already underway, and some major patches, such as the instrumented runtime, GC Colours, Safe points, etc., have been merged. This post aims to take readers through changes necessary for a smooth trasition to Multicore.

As OCaml 5.0 gets closer to reality, there’s increasing interest among library authors to port their libraries to Multicore. OCaml 5.0 will be accompanied by a robust set of libraries, such as domainslib to aid writing efficient parallel programs. Algebraic Effects is expected to land in later versions of OCaml.

Parallelizing OCaml code with Multicore OCaml gives an excellent overview of parallelizing existing OCaml codebases.

Multiprocess vs Multicore

Admittedly due to lack of Multicore support, it’s common practice by OCaml programmers to use Unix.fork primitive to speed up program executions. Firstly, Multicore programs that spawn new domains don’t support Unix.fork primitive, though the primitive remains unaffected in sequential programs. Unix.create_process is a worthy replacement of Unix.fork in Multicore programs.

In a Multiprocess program, each process has its own copy of global data, wheareas in a Multicore program, data are shared between the domains present. We need to be careful while porting Multiprocess programs to Multicore, since it might operate under the assumption that each process will modify only its own copy of data.

Building Your Package with Multicore OCaml

Getting the Multicore compiler is easier than ever! It can be obtained from opam-repository with:

$ opam update
$ opam switch create 4.12.0+domains

Compiler Variants:

Infrastructure for Continuous Monitoring

Opam Health Check

Once deployed, Opam Health Check monitors build and testsuite failures in Opam packages. One instance of Opam Health Check monitors the 4.12.0+domains branch. So far, this has been useful to identify bugs in the compiler and incompatibility with some packages. Opam Health Check can be accessed at: http://check.ocamllabs.io:8082/

Multicore CI

A select set of packages including Lwt, Irmin, Coq, Base, Core, etc., are monitored with more scrutinity. As a part of the OCurrent pipeline, Multicore CI builds all the packages and runs their testsuite for every commit to 4.12.0+domains. This process aims to catch bugs early and maintain compatibility with the wider ecosystem.

Global State and Thread Safety

OCaml allows mutable values, which are widely used in OCaml codebases for efficiency and at times correctness. Concurrent modification of mutable variables leads to non-deterministic results. Hence, care is needed to ensure all concurrent modifications of mutable variables do what is desired.

Fuzzing tools will be available soon to detect data-races. Refer to the Parafuzz talk for more details.

What Multicore OCaml Offers

Standad Library Thread Safety

Standard library isn’t thread-safe by default. Mutable interfaces such as Stack, Queue, etc., still don’t incur overheads on sequential code. They can be wrapped with a Mutex or replaced by their Concurrent counterpart for use in Multicore programs. The Multicore OCaml memory model guarantees memory safety even for programs with data races.

Observationally pure interfaces are preserved. Random, Hashtbl, and Format use mutable data internally, though they are not exposed to the user. Such modules are still safe to be used in Multicore programs, as their global state is replaced with domain-local state.

Systhreads in Multicore

Multicore implementation of systhreads is backwards compatible. Hopefully no surprises there!

Sequential programs with systhreads are expected work like they do in stock OCaml. As for Parallel programs with systhreads:

Runtime Changes

During the development of the Multicore garbage collector (GC), some breaking changes were introduced in the runtime. This would affect code explicitly accessing GC internals. Changes in the runtime are discussed below:

Lazy Values

Lazy values can be forced from only one domain at a time. Concurrent forcing of Lazy values will raise RacyLazy exception.

let f n =
	let r = ref 0 in
    for _ = 1 to n do
      incr r
    done;
    !r

let l = lazy (f 1_000_000_000)
let domains = Array.init 4
	(fun _ -> Domain.spawn (fun () -> Lazy.force l))

let v =  Array.map Domain.join domains

This results in

$ ./lazy_example.exe
Fatal error: exception CamlinternalLazy.RacyLazy

To circumvent this, we can useLazy.try_force which returns option valueNone when some other domain is forcing the given Lazy value.

let f n =
	let r = ref 0 in
    for _ = 1 to n do
      incr r
    done;
    !r

let l = lazy (f 1_000_000_000)

let rec try_force l =
  match Lazy.try_force l with
  | Some v -> v
  | None -> try_force l

let domains = Array.init 4
	(fun _ -> Domain.spawn (fun () -> try_force l))

let v =  Array.map Domain.join domains
let () = print_int v.(0)

Naked Pointers

Pointers outside the OCaml heap, also known as naked pointers, which were deprecated in OCaml 4.11, so they won’t be supported in OCaml 5.0. Pointers outside the OCaml heap are typically blocks allocated directly through malloc. The naked pointers checker tool is developed to detect naked pointers. To obtain the checker, run:

$ opam update
$ opam switch create 4.12.1+trunk
$ opam install ocaml-option-nnpchecker

Once the nnpchecker is installed, the checker runs during program execution and reports the address out of heap pointers, if present. frama-c contains naked pointers; the nnpchecker report is below. To find the pointer’s source, watchpoint can be set at the given address on Mozilla rr, which will show the location of that naked pointer.

$ rr frama-c
rr: Saving execution to trace directory `/home/kc/.local/share/rr/frama-c-5'.
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
$ rr replay
(rr) watch *(value*)0x55fc1e2754d8
Hardware watchpoint 1: *(value*)0x55fc1e2754d8
(rr) c
Continuing.

Hardware watchpoint 1: *(value*)0x55fc1e2754d8

Old value = 1
New value = 94541327240384
0x000055fc1dab48f8 in camlUnmarshal__entry () at src/libraries/datatype/unmarshal.ml:72
72      src/libraries/datatype/unmarshal.ml: No such file or directory.

It’s recommended to wrap the naked pointers with an Abstract_tag or Custom_tag. More ways to tackle naked pointers can be found here in the manual.

Domain State

Runtime global variables are stored in a dedicated structure known as Domain State. This is already present in the latest releases of stock OCaml. Some of the variables in Domain State are exposed in the compatibiltiy module. With Multicore, it’s likely that the compatibility module will be deprecated in OCaml 5.0. Also, the variables in Domain State are specific to a domain, hence they can no longer be treated as global variables.

We recommend evaluating the global variables, and if the code still requires them, they could be referenced from Caml_state, which refers to the calling domain’s Domain State.

GC Colours

Historically OCaml had the GC colours Black, White, Blue, and Grey. Since OCaml 4.12, the Grey colour has been replaced by Multicore’s mark stack. Multicore OCaml’s GC colours are MARKED, UNMARKED, and GARBAGE.

It’s clear that directly referring to historical GC colours will not work with Multicore. The GC colours’ API is not meant to be stable anyway, so our best bet is to replace it with a more stable one. In inevitable cases, the Multicore equivalent can be used.

Parallelizing Lwt

Lwt is a Monadic concurrency libaray widely used in the OCaml ecosystem. Lwt has proven to be efficient for I/O parallelism. With Multicore, we attempt to introduce CPU parallelism in Lwt. This is elaborated in this blog post and PR.


Hopefully, this helped you in a smooth transition to Multicore. For any specific incompatibilites, don’t hesitate to contact us via the issue tracker.