← back to articles

writing a build system in ocaml

2024-02-08 · 2 min read

i tried to replace make with a small ocaml program. here's the shape of the thing.

make is fine until it is not. the indentation-sensitive syntax, the implicit rules that fire when you least expect them, the global variable namespace — these are paper cuts, not showstoppers, but they accumulate. i wanted to see what a build system would look like if i wrote it in a language that took static analysis seriously.

the core

the build system models a dependency graph where each node is a task with inputs, outputs, and a command. the graph is constructed lazily — only the tasks needed to produce the requested targets are evaluated. ocaml’s module system makes it natural to express this as a functor parameterized over the rule type:

module type RULE = sig
  type t
  val name : t -> string
  val inputs : t -> string list
  val outputs : t -> string list
  val run : t -> (unit, string) result
end

module Make(R : RULE) = struct
  module G = Map.Make(String)

  type graph = {
    rules : R.t G.t;
    mutable built : G.t;
  }

  let build graph target =
    match G.find_opt target graph.rules with
    | None -> Error ("no rule for " ^ target)
    | Some rule ->
      let deps = List.iter (build graph) (R.inputs rule) in
      if not (G.mem target graph.built) then begin
        R.run rule |> Result.map (fun () ->
          graph.built <- G.add target rule graph.built)
      end else Ok ()
end

the RULE module type defines the contract for a build step. the Make functor produces a build engine that works with any concrete rule set.

what surprised me

ocaml’s polymorphic variants are a natural fit for expressing build errors. instead of defining a monolithic error type, i used polymorphic variants with pattern matching to let each rule define its own failure modes while keeping the engine agnostic to them.

the runtime performance surprised me in the other direction. building the dependency graph for a large project is a graph operation on thousands of nodes. ocaml handled this comfortably — the traversal was never the bottleneck, even on the first run with a cold cache.

what i would cut

the incremental rebuild logic was the hardest part to get right and the least valuable. file-change detection, dependency invalidation, parallel execution — each of these is a deep rabbit hole with well-established solutions in existing tools. a hobby build system does not need to solve them better than ninja.

by the end, the design converged to something close to just: a single-threaded runner that rebuilds everything from scratch unless told otherwise. simplicity wins when the alternative is a two-thousand-line change tracker that still misses edge cases.

conclusion

writing a build system taught me more about ocaml’s module system than any tutorial could. the final tool is not better than make for real projects, but it is mine, and i understand every line of it. that is a different kind of value.