Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

patch-prolog

A standalone Prolog compiler. plgc compiles an ISO-subset Prolog program to a single native binary with zero runtime dependencies — no Rust toolchain, no interpreter, no serialized clause database. Predicates become native code via LLVM.

plgc build rules.pl -o my-linter      # ~676K standalone binary
./my-linter --query "violation(Field, Reason)"
echo $?   # 0 = no solutions (clean), 1 = solutions found

The compiled binary contains no clause interpreter — control flow is generated per predicate as native code, and only primitive services (heap, trail, unification, builtins, query parsing, output) come from a small runtime statically linked in. The result runs anywhere with libc/libm and nothing else.

The toolchain

Three binaries share the engine:

BinaryWhat it is
plgcThe compiler — build, run, and check your Prolog.
plgrAn interactive REPL that drives the compiler (never interprets).
plglA Language Server (diagnostics, completion, hover, go-to-definition).

Where to go next

  • Installation — install the toolchain and compile your first program.
  • Compiler Usageplgc commands, the query wire-contract, and exit codes.
  • REPL Guide, Language Guide, and the Builtin & Stdlib Reference follow as those pages land.

Installation

patch-prolog is built from source. The toolchain is three binaries — plgc (compiler), plgr (REPL), and plgl (language server) — installed together.

Requirements

To…You need
build the toolchainRust (see rust-toolchain.toml) and just
use plgcclang ≥ 15 — it links the generated code. No Rust required.
run a compiled binarynothing — only system libc/libm

That last row is the point: the programs plgc produces are standalone. You hand someone the binary and it runs, with no Prolog system installed.

Build and install

just install

This builds the runtime and compiler, then installs all three binaries (plgc, plgr, plgl) with cargo install. To build in place without installing:

just build                 # produces target/release/plgc

Shell completions for plgc:

plgc completions zsh > ~/.zfunc/_plgc      # or: bash | fish | elvish | powershell

Your first program

Create deps.pl — a small build-dependency graph:

depends_on(app, auth).
depends_on(app, ui).
depends_on(auth, crypto).
depends_on(ui, render).
depends_on(render, crypto).

% needs/2 is the transitive closure of depends_on.
needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).

Compile it to a native binary and query it — needs/2 walks the graph transitively:

plgc build deps.pl -o deps
./deps --query "needs(app, X)" --format text
# X = auth
# X = ui
# X = crypto
# X = render
# X = crypto

Or skip the explicit build during development — plgc run compiles to a temporary binary and executes it in one step (it still compiles; it never interprets):

plgc run deps.pl --query "needs(app, X)"

To explore interactively, start the REPL:

plgr
plg> depends_on(app, auth).
  defined.  (1 in session)
plg> depends_on(app, ui).
  defined.  (2 in session)
plg> ?- depends_on(app, X).
  X = auth ;
  X = ui .

(At the ; prompt, press ; for the next solution or any other key to stop.)

From here, see Compiler Usage for the full plgc surface and the query wire-contract.

Compiler Usage

plgc has four commands:

plgc build        Compile .pl sources to a native executable
plgc run          Compile to a temp binary and run it immediately
plgc check        Parse and statically check sources without compiling
plgc completions  Generate shell completion scripts

Run plgc <command> --help for the exact flags.

plgc build

plgc build [OPTIONS] [INPUTS]...

Compiles one or more .pl files (concatenated in order) into a single native binary.

FlagEffect
-o, --output <PATH>Output path (default: the stem of the first input).
--debugBuild at -O0 with debug-friendly output and DWARF.
--keep-irKeep the generated .ll LLVM IR next to the output, for inspection.
--deny-undefinedTreat calls to undefined predicates as errors, not warnings (see below).
--target <TARGET>native (default), wasm32-wasi — a standalone .wasm CLI module (WASM Target) — or worker — a V8-isolate reactor for the edge (WASM Worker).
plgc build rules.pl -o my-linter
plgc build base.pl rules.pl -o app      # inputs concatenated in order
plgc build rules.pl --target wasm32-wasi -o rules.wasm   # needs a wasm-enabled plgc
plgc build rules.pl --target worker      # reactor .wasm + deploy glue (edge)

plgc run

plgc run [OPTIONS] --query <QUERY> [INPUTS]...

Compiles to a temporary binary and runs it immediately — the fast edit-test loop. It compiles every time; it never falls back to an interpreter, so what you test is what you ship.

FlagEffect
--query <GOAL>The goal to solve, e.g. "needs(app, X)". Required.
--limit <N>Maximum number of solutions to report.
--format <json|text>Output format. Default: text (human-readable).
--deny-undefinedAs for build.
plgc run deps.pl --query "needs(app, X)"
plgc run deps.pl --query "needs(app, X)" --limit 1 --format json

plgc check

plgc check [OPTIONS] [INPUTS]...

Parses and statically checks sources without producing a binary — fast feedback for editors and CI. Reports syntax errors and, by default, warns about calls to predicates that are defined nowhere. --deny-undefined turns those warnings into a non-zero exit.

Querying a compiled binary

A binary built by plgc answers queries directly — this is the same contract plgc run and the REPL use under the hood:

./my-binary --query "goal(X)" [--limit N] [--format json|text]
  • --format text prints one variable binding per line (X = bob), the form you read in a terminal.
  • --format json prints a single structured object — and is the binary’s default, because the typical deployment is a program calling the binary and parsing the result:
{"count":2,"exhausted":true,"solutions":[{"X":"bob"},{"X":"ann"}]}

exhausted is false when more solutions exist beyond --limit.

Exit codes

The exit code is the answer for shell pipelines — you often don’t need to read stdout at all:

CodeMeaning
0No solutions (the goal failed).
1One or more solutions found.
2The --query could not be parsed.
3A runtime error (e.g. an uncaught exception, the step limit).
if ./my-linter --query "violation(_, _)" >/dev/null; then
    echo "lint violations found"; exit 1
fi

Undefined predicates

Because patch-prolog compiles the whole program, it knows every defined predicate at build time. A direct call to a predicate that is defined nowhere can never succeed, so build, run, and check warn about it by default (and plgl shows a squiggle). This stays ISO-faithful: such a call still compiles and raises a catchable existence_error if reached.

Pass --deny-undefined to promote the warning to a hard error (no binary, non-zero exit) when you want correctness enforced over leniency.

Script mode

A .pl file with a plgc shebang behaves as its own compiled binary: plgc compiles it to a temp binary and execs it, passing along the arguments after the file. You invoke it with the same --query contract as any compiled binary.

#!/usr/bin/env plgc
main :- write(hello), nl.
chmod +x hello.pl
./hello.pl --query "main"
# hello

WASM Target

plgc can compile a Prolog program to a standalone wasm32-wasi module instead of a native binary. The module preserves the exact --query wire contract, so anything that drives a compiled binary drives the .wasm identically — only the runner changes (a WASI engine instead of the OS).

This is Tier 1: a standalone WASI command that runs on wasmtime, wasmer, Wasmtime-based platforms (Spin, Fermyon), and any WASI runtime. The deployed .wasm needs only a wasm engine — no Rust, no clang, no plgc.

Requirements

The build side needs a wasm-capable plgc plus the Rust toolchain’s LLVM tools (no wasi-sdk required):

rustup target add wasm32-wasip1          # std + bundled wasi-libc
rustup component add llvm-tools-preview   # llc + wasm-ld

Install a wasm-capable plgc (this embeds the wasm runtime archive):

just install-wasm
# equivalently:
#   just build-runtime-wasm
#   cargo install --path crates/compiler --features wasm --force

The default cargo install / just install is unchanged — wasm support is opt-in at build time and adds nothing to a native-only plgc.

To run a module you need a WASI engine with the tail-call proposal — recent wasmtime (≥ ~0.40), recent wasmer, or any current WASI runtime (brew install wasmtime, or see wasmtime.dev). The constant-stack guarantee relies on wasm return_call; an engine without tail-call support rejects the module at instantiation with a clear tail-call error.

Usage

plgc build --target wasm32-wasi deps.pl -o deps.wasm
wasmtime run deps.wasm --query "needs(app, X)" --format json

--query, --limit, --format, the JSON shape, and the exit codes (0 none · 1 solutions · 2 parse error · 3 runtime error) are all identical to a native build. If -o is omitted, the output defaults to the input stem with a .wasm extension.

How it works

The same LLVM IR plgc emits for native is retargeted to wasm32-wasi:

  • llc -mattr=+tail-call lowers the engine’s musttail calls to wasm return_call / return_call_indirect, so Prolog recursion and backtracking run in constant stack exactly as they do natively. If the tail-call feature is ever missing, llc fails at build time — it never silently emits a stack-growing call.
  • wasm-ld links the program object, the wasm32-wasip1 runtime archive, and the target’s self-contained wasi-libc into a WASI command module.

The whole toolchain cost lands on the build side. The artifact you ship is as self-contained as a native binary — it answers queries with only a wasm engine present.

Tuning

The metacall depth bound (PLG_METACALL_DEPTH) and step limit (PLG_MAX_STEPS) work the same under WASI as natively — but WASI does not inherit the host environment, so pass them explicitly to the engine:

wasmtime run --env PLG_MAX_STEPS=100000000 prog.wasm --query "..."

A wasm engine’s default stack is smaller than a native one (~1 MB vs ~8 MB), so for programs that lean on deeply nested runtime-walked metacalls you may want a lower PLG_METACALL_DEPTH. Ordinary recursion and call/1 tail recursion are constant-stack and unaffected (a 1,000,000-deep call/1 runs under a 1 MB wasm stack).

Wasm builds optimize at LLVM -O2 (--debug drops to -O0); the -O3 the native path uses buys little for this IR.

Edge / serverless (Tier 2)

Running inside V8 isolates (Cloudflare Workers and similar) needs a different build than Tier 1 — wasm32-unknown-unknown with no WASI and a host-call entry instead of a CLI. That is Tier 2, available via --target worker: see the WASM Worker guide for usage, the buffer ABI, tuning, and a Cloudflare deployment tutorial. Tier 1 (this page) covers WASI runtimes; Tier 2 covers the edge.

WASM Worker (edge / V8 isolates)

plgc can compile a Prolog program to a reactor wasm module that runs inside a V8 isolate — Cloudflare Workers, workerd, or Node — instead of a process. This is Tier 2: there is no main, no argv, no stdio. The module exports a small buffer ABI and a JS host calls it over linear memory; a query becomes a function call inside a warm, globally-replicated isolate.

This is the scale-to-zero / global edge shape. The compiled artifact is the same no-VM, no-interpreter engine as a native build — small enough to clear the single-digit-MB module limits edge isolates impose, and stateless by design, so it fits the isolate model by construction. For WASI runtimes (wasmtime, Spin, …), use Tier 1 instead.

The answers are byte-identical to a native build, including error text — the reactor runs the same engine over the same JSON wire shape, only the transport differs (a memory buffer instead of stdout).

Requirements

Build side — a wasm-capable plgc plus the Rust LLVM tools (no wasi-sdk):

rustup target add wasm32-unknown-unknown   # the reactor target
rustup component add llvm-tools-preview     # llc + wasm-ld
just install-wasm                           # plgc with both wasm archives embedded

just install-wasm builds and embeds both wasm runtime archives (Tier 1 wasm32-wasip1 and Tier 2 wasm32-unknown-unknown) behind the one wasm feature, so a single install covers both targets. The default cargo install / just install is unchanged — wasm support is opt-in at build time.

Run side — anything with a V8 isolate and wasm tail calls:

  • Cloudflare Workers (deploy with wrangler), or local workerd (npm i -g workerd) for the dev loop, or
  • Node ≥ 20 for scripting/testing the module directly.

The constant-stack guarantee relies on wasm return_call; V8 ships tail calls and the module is rejected at instantiation by any engine that lacks them — never a silent stack overflow.

Usage

plgc build --target worker deps.pl

This emits deps.worker.wasm plus four overrideable scaffolding files next to it:

FileRole
deps.worker.wasmthe reactor module (the real artifact, always regenerated)
reactor.mjsthe buffer-ABI marshalling (runQuery / assertExports)
worker.jsa Cloudflare/workerd fetch handler that calls reactor.mjs
wrangler.tomlCloudflare deploy config
config.capnplocal workerd serve config

The glue files are written only if absent, so a rebuild refreshes the .wasm but never clobbers your edits — delete one to regenerate it. --target wasm32-unknown-unknown is accepted as a synonym for worker. If -o is omitted the output defaults to the input stem with a .worker.wasm extension (distinct from Tier 1’s .wasm, so building both targets doesn’t overwrite).

Run it locally

just wasm-worker-serve examples/deps.pl
# compiles + serves on workerd, then:
curl 'http://localhost:8080/?query=needs(app, X)'
# {"count":5,"exhausted":true,"output":"","solutions":[{"X":"auth"},…]}

The handler reads the goal from ?query= or, failing that, the POST body:

curl -X POST --data 'depends_on(app, D)' http://localhost:8080/

Response shape

The JSON is the v1 wire shape with one Tier-2 addition — an output field carrying anything the program wrote with write/1/writeln/1/nl/0 (an isolate has no stdout, so output is captured losslessly rather than lost):

{"count":1,"exhausted":true,"output":"","solutions":[{"X":"auth"}]}

The field is always present (empty when nothing was written), so a host can rely on its shape. Strip it to get bytes identical to a native --query --format json. Errors render exactly as native, as {"error":"…"}.

The buffer ABI

If you replace the glue, the contract the module exports is:

plg_init()                                   build the Machine once per isolate
plg_rt_alloc(len) -> ptr                      host writes the query bytes here
plg_rt_run_query(ptr, len, limit, steps, depth) -> (len<<32)|ptr   JSON result
plg_rt_free(ptr, len)                         host frees the query / result buffer

reactor.mjs is the single source of this dance and is imported by both the deployed worker.js and the test harness, so the two can’t drift. Free every buffer by its requested length (the allocator is length-keyed), and read the result view after plg_rt_run_query returns — solving can grow linear memory and detach an earlier view.

Concurrency contract

One in-flight query per isolate. The module keeps a single program Machine; a V8 isolate is single-threaded, but a Worker can interleave async tasks, so the host must not start a second query before the first returns. The emitted worker.js satisfies this automatically: its runQuery is fully synchronous (the only await is reading the POST body, before the ABI call), so concurrent fetch invocations serialize on the event loop.

Tuning

plg_rt_run_query’s last three arguments bound a query before the platform’s own CPU/wall limit cuts it off — they mirror the native knobs:

ArgNative equivalent0 means
limit--limitunbounded solutions
stepsPLG_MAX_STEPSmodule default (10,000)
depthPLG_METACALL_DEPTHmodule default (1,000)

The emitted worker.js calls runQuery(reactor(), query) with the module defaults. To raise the step ceiling for heavier programs, pass options — runQuery(reactor(), query, { stepLimit: 100_000_000n }) (the step limit is an i64, hence the BigInt). Depth matters more here than natively: a wasm isolate’s stack (~1 MB) is far smaller than a native one (~8 MB). Ordinary recursion and call/1 tail recursion are constant-stack regardless — a 1,000,000-deep call/1 returns in a V8 isolate via return_call.

Deploy to Cloudflare

The emitted wrangler.toml makes the module a deployable Worker. From the directory holding the build output:

plgc build --target worker deps.pl     # emits deps.worker.wasm + glue
npx wrangler deploy                     # uploads the Worker + wasm module

wrangler bundles worker.js and reactor.mjs and uploads the .wasm as a compiled module (the CompiledWasm rule in wrangler.toml). Then query the edge:

curl 'https://deps.<your-subdomain>.workers.dev/?query=needs(app, X)'

The isolate cold-starts once (atom table + registry built at plg_init), then every request is a warm in-memory call — no process fork, no per-request parse of the program. Edit the generated name/compatibility_date in wrangler.toml and the routing/limits in worker.js to taste; both are scaffolding, not regenerated.

Footprint

A reactor module is a few MB (the deps example is ~1.7 MB raw, well within the Workers budget and compressing further). Large fact tables inflate the module’s .rodata; a program whose data pushes past the isolate size cap belongs on Tier 1 / containers instead. The toolchain cost is entirely build-side — the deployed .wasm runs with only a wasm engine present.

How it works

The reactor reuses the same LLVM IR as native and Tier 1, retargeted to wasm32-unknown-unknown:

  • The module emits no main — instead an exported plg_init builds the Machine and hands it to the runtime; the host drives the buffer ABI exports.
  • llc -mattr=+tail-call lowers the engine’s musttail chains to return_call / return_call_indirect, so recursion and backtracking run in constant stack inside the isolate. A missing tail-call feature is a loud build/load error, never a silent overflow.
  • wasm-ld --no-entry links the program object against the wasm32-unknown-unknown runtime archive with no crt and no libc — the reactor touches no WASI, so the resulting module imports nothing and instantiates in a bare isolate.

Language Guide

patch-prolog implements an ISO subset of Prolog. A program is a set of facts and rules; you pose queries and the engine searches for proofs, binding variables along the way. This guide teaches the language as this engine implements it — including the places where it makes a deliberate, documented choice. For the complete operator table see the Operators reference; for every builtin and stdlib predicate, the Builtin & Stdlib Reference.

One thing to keep in mind throughout: this is a compiler. Your whole program is known at build time, so there is no assert/retract — the knowledge base is immutable. You change a program by editing its source and recompiling (the REPL’s :edit automates that loop).

Terms

Everything is a term. There are only a few kinds:

KindExamples
Atomtom, [], 'a quoted atom', +
Number42, -7, 3.14
VariableX, Dep, _, _Rest
Compounddepends_on(app, auth), point(1, 2), -(3)

A variable starts with an uppercase letter or _; everything else starting lowercase (or quoted) is an atom. This is the single most common beginner trip-up: Dep is a variable, depends_on is a predicate name.

A list is sugar for nested compounds: [a, b, c] is '.'(a, '.'(b, '.'(c, []))), and [] is the empty-list atom. The [H|T] notation splits a list into its head and tail.

Operators are sugar too: 2 + 3 is the compound +(2, 3), H = [X|T] is =(H, '.'(X, T)). Unification and resolution always see the compound form.

Facts, rules, and clauses

A fact asserts something true:

depends_on(app, auth).
depends_on(auth, crypto).

A rule says a head is true if its body is:

needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).

Read :- as “if” and , as “and.” A predicate is identified by name and arity (needs/2), and its clauses are tried top to bottom. Multiple clauses are alternatives — here, needs is a direct dependency or a dependency reached through another.

A directive runs at load/build time:

:- dynamic(extra/1).

A query asks the engine to prove a goal:

./deps --query "needs(app, X)"

The engine searches for variable bindings that make the goal true, yielding solutions one at a time. With depends_on/2 and needs/2 above, needs(app, X) yields X = auth then X = crypto. If a goal can be proved more than one way, backtracking explores each alternative; if it can’t be proved at all, the query fails (no solutions).

Unification

=/2 makes two terms equal by binding variables:

?- point(X, 2) = point(1, Y).
% X = 1, Y = 2

No occurs check (by design). X = f(X) succeeds, producing a cyclic term — matching ISO’s default. Use unify_with_occurs_check/2 when you need the safe version that fails instead.

Related: \=/2 (not unifiable), and ==/2 / \==/2, which test structural identity without bindingX == Y is false for two distinct unbound variables, where X = Y would succeed.

Backtracking

When a goal has several solutions, the engine produces them lazily:

likes(sam, pizza).
likes(sam, sushi).
likes(ann, pizza).
?- likes(sam, Food).
Food = pizza ;
Food = sushi .

Composed goals backtrack jointly — likes(P, pizza), likes(P, sushi) searches for a P satisfying both.

Control constructs

FormMeaning
A, BConjunction — prove A, then B.
(A ; B)Disjunction — A, or B on backtracking.
(Cond -> Then ; Else)If-then-else — if Cond succeeds, commit and prove Then, else Else.
\+ GoalNegation as failure — succeeds iff Goal cannot be proved.
once(Goal)The first solution of Goal only.
call(Goal) / call(G, Extra...)Call a goal built at runtime.

The cut ! commits to the choices made so far in the current clause, discarding remaining alternatives. patch-prolog follows ISO: the cut is transparent through ,, ;, and ->. So with m(1). m(2). m(3).:

cut_demo(X) :- (m(X), X > 1, ! ; X = fallback).

?- cut_demo(X). yields exactly X = 2 — the cut prunes both the rest of m/1 and the ; X = fallback branch. (The cut is opaque inside \+, once, the condition of ->, and catch/3.)

Arithmetic

Arithmetic is explicit: is/2 evaluates its right side, and the comparison operators evaluate both sides.

?- X is 2 + 3 * 4.        % X = 14
?- 6 =:= 2 * 3.           % true   (=:= compares values; = would unify terms)

Division has several forms, and the defaults follow ISO — worth memorizing:

OpResultExample
/always a float (ISO 9.1.4)10 / 33.333…
//integer, truncates toward zero-7 // 2-3
divinteger, floors toward −∞-7 div 2-4
modremainder, sign of divisor-7 mod 32
remremainder, sign of dividend-7 rem 3-1

Other evaluable functions include - abs/1 min/2 max/2 sign/1, the bitwise operators (/\ \/ xor << >> \), and powers (** float, ^ integer-or-float). Use =:= =\= < > =< >= to compare.

Arithmetic is safe, not silent. Integer overflow raises evaluation_error(int_overflow) — it never wraps around. A NaN/infinite float result, or division by zero, is likewise a runtime error.

Lists

Lists use [H|T] to split head from tail, and the stdlib (compiled into every binary) provides the staples:

?- append([1,2], [3,4], L).      % L = [1,2,3,4]
?- member(X, [a,b,c]).           % X = a ; X = b ; X = c
?- length([a,b,c], N).           % N = 3

Also available: reverse/2, last/2, nth0/3, nth1/3. To collect every solution of a goal into a list, use findall/3; to enumerate a range of integers, between/3:

?- findall(D, needs(app, D), Ds).        % Ds = [auth, crypto]
?- between(1, 3, X).                      % X = 1 ; X = 2 ; X = 3

Inspecting and building terms

PredicatePurpose
functor(T, Name, Arity)Term’s functor and arity, or build one.
arg(N, T, A)The Nth argument of a compound.
T =.. List“Univ” — decompose/build a term as `[Functor
copy_term(T, Copy)A fresh copy with renamed variables.

Type tests classify a term: var/1, nonvar/1, atom/1, number/1, integer/1, float/1, compound/1, is_list/1.

Standard order of terms

Terms have a total order — Variables < Numbers < Atoms < Compounds (and a float sorts just before an equal-valued integer). Compare with @<, @>, @=<, @>= or compare/3, and sort with sort/2 (dedupes) or msort/2 (keeps duplicates).

Exceptions

throw/1 raises a term; catch(Goal, Catcher, Recovery) runs Recovery if Goal throws something that unifies with Catcher. Errors use the ISO taxonomy — error(Formal, Context) where Formal is type_error/2, existence_error/2, evaluation_error/1, instantiation_error, and so on.

Calling a predicate that is defined nowhere raises existence_error(procedure, Name/Arity) — catchable, per ISO. Because the compiler sees the whole program, it also warns about such calls at build time (see Compiler Usage and --deny-undefined).

What this subset leaves out

These are deliberate omissions, not gaps to be filled:

  • No assert/retract. The knowledge base is immutable — compiled at build time. :- dynamic(F/A) only makes an undefined predicate fail silently instead of raising existence_error; it does not enable runtime clause mutation.
  • No op/3. The operator table is fixed (see Operators).
  • No postfix operators, no module system, no DCG, no bagof/setof.

Safety guarantees

Every compiled program is bounded: a runaway computation hits a step limit and stops with resource_error(steps) — deliberately uncatchable, so a buggy clause can’t trap its own timeout. Combined with checked arithmetic and no runtime file I/O, a compiled binary is safe to run on untrusted input.

Operators

patch-prolog supports the standard prefix and infix operators of ISO Prolog. Postfix is not supported — see Postfix is unsupported, by design below.

op/3 (user-defined operators) is also not supported. The operator table below is the complete fixed set the engine recognizes.

Canonical form is always available

Every operator term is sugar for a compound. Anything you can write with an operator, you can write in functor notation:

Operator formCanonical form
2 + 3+(2, 3)
- a-(a)
\+ Goal\+(Goal)
H = [X|T]=(H, '.'(X, T))
X is 6 /\ 3is(X, /\(6, 3))

The arithmetic evaluator, unification, and clause resolution see the compound form — operator syntax is purely a reader-level convenience. If you ever hit a parse-precedence surprise, the canonical form will always parse unambiguously.

Prefix operators

OpTypePrecMeaning
-fy200Arithmetic negation. Folds for literal numbers: -3 reads as the integer -3, not the compound -(3).
+fy200Identity / explicit positive sign. Folds for literal numbers: +3 reads as 3. For non-literals, builds +(X).
\fy200Bitwise complement (evaluable in is/2). No folding — \3 reads as the compound \(3).
\+fy900Negation as failure (control construct).

! (cut) is a 0-arity atom, not a prefix operator — it does not take an argument.

The clause and directive markers :- and ?- are recognized by the parser but are not user-redefinable operators:

  • Head :- Body. — clause separator (infix at the top level of a program).
  • :- Directive. — directive prefix (e.g. :- dynamic(foo/1).).
  • ?- Goal. — query prefix (optional, accepted by the query reader).

Infix operators

Listed from loosest binding to tightest. Associativity uses the standard fixity codes (xfx non-associative, xfy right-associative, yfx left-associative).

OpTypePrecMeaning
;xfy1100Disjunction (also the if-then-else outer in (C -> T ; E)).
->xfy1050If-then. Only meaningful inside a parenthesized goal expression.
,xfy1000Conjunction.
=xfx700Unification.
\=xfx700Not-unifiable.
==xfx700Term identity (no unification).
\==xfx700Term non-identity.
isxfx700Arithmetic evaluation.
<xfx700Arithmetic less-than.
>xfx700Arithmetic greater-than.
=<xfx700Arithmetic less-or-equal. (Note: =<, not <=.)
>=xfx700Arithmetic greater-or-equal.
=:=xfx700Arithmetic equality.
=\=xfx700Arithmetic inequality.
@<xfx700Standard term ordering: less.
@>xfx700Standard term ordering: greater.
@=<xfx700Standard term ordering: less-or-equal.
@>=xfx700Standard term ordering: greater-or-equal.
=..xfx700Univ: term ↔ list decomposition.
+yfx500Addition.
-yfx500Subtraction.
/\yfx500Bitwise AND.
\/yfx500Bitwise OR.
xoryfx500Bitwise XOR.
*yfx400Multiplication.
/yfx400Float division (ISO 9.1.4 — always returns a float, even for integer operands).
//yfx400Integer division (truncates toward zero).
modyfx400ISO modulo (sign follows divisor).
remyfx400ISO remainder (sign follows dividend).
divyfx400Floor division (rounds toward −∞; differs from // when signs disagree).
<<yfx400Shift left (integer).
>>yfx400Shift right (integer).
**xfx200Float power (always returns a float).
^xfy200Integer power for Int ^ Int with non-negative exponent; float otherwise.
:xfy200Module qualifier (parser-only; no module system, but the term parses and round-trips).

Postfix is unsupported, by design

patch-prolog recognizes no postfix operators and none are planned.

Three reasons:

  1. Postfix is pure syntactic sugar — zero expressive power. A term written with a hypothetical postfix yf operator (X yf) is exactly the compound yf(X). The form unification and resolution see is identical to what functor notation already gives you.
  2. Standard Prolog ships no built-in postfix operators. ISO 13211-1 defines the postfix types (xf, yf) so a program could declare one via op/3, but the standard operator table itself contains zero postfix entries.
  3. op/3 is not supported. Without op/3 there is no path to a user-defined operator of any fixity, so the choice is moot anyway.

Teaching materials sometimes present postfix alongside prefix and infix for symmetry. Treat that as a description of the language family; this engine is a deliberate subset.

Known quirks

  • Literal folding. -3 and +3 read as the integer literals -3 and 3, not the compounds -(3) and +(3). To force the compound form (e.g. when pattern-matching with functor/3), use a non-numeric operand: - a, + X. \3 is not folded — it parses as \(3).
  • ! is an atom, not an operator. ! X is a syntax error; if you want cut followed by X, write !, X.
  • -> outside parens. The if-then operator is only meaningful inside a parenthesized goal expression. ( Cond -> Then ; Else ) is the idiom; bare Cond -> Then at clause-body level may not parse the way you expect.
  • / is always float. Per ISO 9.1.4, 10 / 3 evaluates to 3.3333..., not 3. Use // (truncating) or div (floor) for integer division.
  • =< vs <=. The Prolog convention is =<. Writing <= is a parse error.

See also

  • Language Guide — how these operators are used (terms, arithmetic, control, unification).
  • Builtin & Stdlib Reference — the predicates the comparison, arithmetic, and term-order operators name.
  • ISO/IEC 13211-1:1995, §6.3.4 (operator notation) and §8.7 (arithmetic).

Builtin & Stdlib Reference

Every predicate the engine provides. The builtins are compiled into the language itself; the standard library is a small set of list predicates compiled into every binary. The descriptions here mirror the engine’s builtin table (plg-shared::BUILTINS) verbatim — a test fails the build if this page and the table disagree.

Evaluable arithmetic functions (+, *, mod, abs, …) are not predicates; see the Language Guide and the Operators reference for those and for operator precedence.

Type checks

PredicateDescription
var/1Type check: succeeds if argument is an unbound variable.
nonvar/1Type check: succeeds if argument is bound.
atom/1Type check: succeeds if argument is an atom.
number/1Type check: succeeds if argument is an integer or float.
integer/1Type check: succeeds if argument is an integer.
float/1Type check: succeeds if argument is a float.
compound/1Type check: succeeds if argument is a compound term.
is_list/1Type check: succeeds if argument is a proper list.

Unification & term comparison

PredicateDescription
=/2Unification: X = Y succeeds if X and Y can be made identical.
\=/2Not-unifiable: succeeds when = would fail.
==/2Term identity: structural equality without unification.
\==/2Term non-identity.
unify_with_occurs_check/2Unification with occurs check: rejects X = f(X)-style cycles.
compare/3compare(Order, T1, T2) — bind Order to <, =, or > per standard term ordering.
@</2Standard term ordering: less.
@>/2Standard term ordering: greater.
@=</2Standard term ordering: less-or-equal.
@>=/2Standard term ordering: greater-or-equal.

Arithmetic

PredicateDescription
is/2Arithmetic evaluation: X is Expr binds X to the value of Expr.
=:=/2Arithmetic equality.
=\=/2Arithmetic inequality.
</2Arithmetic less-than.
>/2Arithmetic greater-than.
=</2Arithmetic less-or-equal (note: =<, not <=).
>=/2Arithmetic greater-or-equal.

Control

PredicateDescription
,/2(A, B) — conjunction: prove A, then B.
;/2(A ; B) — disjunction: prove A, or B on backtracking. (C -> T ; E) reads as if-then-else.
->/2(C -> T) — if-then: if C succeeds (committing to its first solution), prove T; otherwise fail.
\+/1Negation as failure: succeeds when its argument fails.
once/1once(Goal) — succeed at most once for Goal.
call/1Meta-call: execute its argument as a goal. Variadic — extra args are appended.
true/0Always succeeds.
fail/0Always fails.
false/0Always fails (alias for fail).
!/0Cut: commit to current choices; remove choice points back to the parent clause.

Finding solutions & enumeration

PredicateDescription
findall/3findall(Template, Goal, List) — collect all solutions of Goal.
between/3between(Low, High, X) — enumerate or test integers in [Low, High].

Exceptions

PredicateDescription
catch/3catch(Goal, Catcher, Recovery) — run Goal; on thrown error matching Catcher, run Recovery.
throw/1Raise an error term that propagates to the nearest matching catch/3.

Term construction & inspection

PredicateDescription
functor/3functor(Term, Name, Arity) — inspect or construct a term’s functor.
arg/3arg(N, Term, Arg) — extract the N-th argument of Term.
=../2Univ: T =.. L decomposes T into a list of its functor and args.
copy_term/2copy_term(T, C) — bind C to a copy of T with fresh variables.

Atoms & numbers

PredicateDescription
atom_length/2atom_length(A, L) — bind L to the length of atom A.
atom_concat/3atom_concat(A, B, C) — concatenate A and B into C, or, with C bound and A/B unbound, nondeterministically split C into every prefix/suffix pair.
atom_chars/2atom_chars(A, Chars) — convert between an atom and a list of single-char atoms.
number_chars/2number_chars(N, Chars) — convert between a number and a list of single-char atoms.
number_codes/2number_codes(N, Codes) — convert between a number and a list of character codes.

Arithmetic relations

PredicateDescription
succ/2succ(X, S) — Peano successor relation; S = X + 1, both non-negative.
plus/3plus(X, Y, Z) — addition relation; any one argument may be unbound.

Sorting

PredicateDescription
msort/2msort(L, Sorted) — sort without removing duplicates.
sort/2sort(L, Sorted) — sort and remove duplicates.

I/O

PredicateDescription
write/1Write a term to stdout (no newline).
writeq/1Write a term to stdout, quoting atoms so it reads back (no newline).
writeln/1Write a term to stdout followed by a newline.
nl/0Write a newline to stdout.

Standard library (lists)

Defined in Prolog (stdlib.pl) and compiled into every binary.

PredicateDescription
member/2member(X, List) — succeeds once for each element of List.
append/3append(A, B, C) — C is A concatenated with B (relational; works in reverse).
length/2length(List, N) — N is the number of elements in List.
last/2last(List, X) — X is the last element of List.
reverse/2reverse(List, Rev) — Rev is List in reverse order.
nth0/3nth0(N, List, X) — X is the element at zero-based index N.
nth1/3nth1(N, List, X) — X is the element at one-based index N.

Semantics & ISO Conformance

patch-prolog implements an ISO subset. This page is the precise statement of how it behaves where the standard leaves room for choices, the few deliberate divergences, and the safety guarantees layered on top. The Language Guide teaches these with examples; this is the reference, with ISO section numbers.

Unification (ISO 8.3.2)

  • =/2 does not perform the occurs check — X = f(X) succeeds, creating a cyclic term, which is the ISO default. Solution display handles cycles safely.
  • unify_with_occurs_check/2 is the safe variant: it fails on X = f(X)-style cycles instead of building them.

Float equality

  • Unification treats floats by bit pattern, so NaN unifies with NaN.
  • This differs from arithmetic comparison, where any comparison involving NaN is false (=:=, <, etc.).

Arithmetic

  • mod uses ISO floored semantics — the result takes the sign of the divisor (-7 mod 3 is 2), not the truncated remainder. (rem takes the sign of the dividend.)
  • Integer overflow is a runtime error (evaluation_error(int_overflow)), never a silent wraparound.
  • A NaN or infinite float result from any operation is a runtime error.
  • Division by zero (integer or float) is a runtime error.

Standard order of terms (ISO 8.4.2)

Variables < Numbers < Atoms < Compounds
  • A float sorts just before an integer of equal value (1.0 @< 1).
  • NaN sorts after all other floats (a deterministic total order).

Built-in error behavior

  • number_chars/2 and number_codes/2 raise a syntax error on a non-numeric string (not silent failure), and reject NaN/infinity.
  • T =.. [F] with F unbound raises an instantiation error (ISO 8.5.3); an empty list is an error, not a failure.
  • functor/3 with a negative arity is an error.
  • atom_chars/2 is for atoms only; number_chars/2 handles numbers.
  • When is/2 (or an arithmetic comparison) meets a non-evaluable functor, the type_error(evaluable, Culprit) culprit is a predicate indicator Name/Arity (ISO 8.6) — so a bare atom foo reports foo/0, exactly like the compound case foo(1) reports foo/1. (The original v1 interpreter reported the bare atom for the arity-0 case.)

Exceptions

  • catch/3 and throw/1 (ISO 7.8.9, 7.8.10) use the ISO error taxonomy: error(Formal, Context), where Formal is one of instantiation_error, type_error/2, existence_error/2, domain_error/2, evaluation_error/1, permission_error/3, representation_error/1, resource_error/1, or syntax_error.
  • throw/1 of an unbound variable raises instantiation_error.
  • catch/3 is opaque to cut: a ! inside the goal cannot escape the catch frame.

Cut (ISO 7.8.4)

! is transparent through ,, ;, and ->: a cut inside a disjunction branch cuts the whole clause, including the other branch and the predicate’s remaining clauses. With m(1). m(2). m(3).:

t(X) :- (m(X), X > 1, ! ; X = fallback).

?- t(X). yields exactly X = 2. Cut remains opaque inside \+/1, once/1, the condition of ->, and catch/3.

If you used the original patch-prolog interpreter, note this is a change: it treated the cut as local to the disjunction branch (and also produced X = fallback). patch-prolog follows ISO here, matching SWI and GNU Prolog.

Dynamic and undefined predicates (ISO 7.7.3)

  • :- dynamic(F/A). declares a predicate whose clauses may be absent. An undefined dynamic predicate fails silently; an undefined non-dynamic predicate raises existence_error(procedure, F/A).
  • A direct call to a predicate defined nowhere is still a well-formed program: it compiles and raises that (catchable) existence_error if reached. Because patch-prolog is a whole-program compiler, such a call can never succeed, so plgc build/run/check warn about it and plgl shows a squiggle. --deny-undefined promotes the warning to a hard error — opt-in strictness above the ISO-compliant default, not a divergence from it. See Compiler Usage.

Safety guarantees (beyond ISO)

  • A runaway computation hits a step limit and stops with resource_error(steps). This error is deliberately uncatchable, so a buggy clause cannot trap its own timeout. This is a safety guarantee, not an ISO requirement.

Not implemented

These are deliberate omissions of the subset:

  • assert/1, retract/1 — the knowledge base is immutable (compiled at build time); :- dynamic only enables silent-fail, not runtime mutation.
  • op/3 — the operator table is fixed; see Operators.
  • Module system.
  • Definite clause grammars (DCG).

REPL Guide

plgr is an interactive Prolog loop — define clauses, run queries, walk solutions — built on the same compiler as everything else. It does not interpret: every program change is compiled to a fresh native binary, and queries run against that binary. What you test in the REPL is exactly what plgc build would ship.

Because it drives the compiler, plgr needs plgc on your PATH (or set $PLGC to its location). Start it with no arguments:

plgr
plgr — patch-prolog REPL.  :help for commands, :quit to exit.
plg> ▮

Two kinds of input

The prompt is plg> — deliberately neutral, because it accepts two different things:

  • A clause, rule, or directive — anything ending in . that isn’t a query. This edits your program and recompiles.
  • A query?- Goal. This runs against the current program and shows solutions. It does not change anything.

You type the ?- yourself; that’s what tells the REPL “run this” rather than “remember this.”

plg> depends_on(app, auth).
  defined.  (1 in session)
plg> ?- depends_on(app, X).
  X = auth .

(Plus :-prefixed commands, covered below.)

Defining your program

Type facts, rules, and directives as you would write them in a file:

plg> depends_on(app, ui).
  defined.  (2 in session)
plg> needs(X, Y) :- depends_on(X, Y).
  defined.  (3 in session)
plg> :- dynamic(extra/1).
  defined.  (4 in session)

Each accepted entry recompiles the program and confirms with the running clause count. A rule spanning several lines continues under the | prompt until a line ends in .:

plg> needs(X, Y) :-
|      depends_on(X, Z),
|      needs(Z, Y).
  defined.  (5 in session)

Multiple clauses for the same predicate accumulate as alternatives — the knowledge base is immutable, so there is no in-place “redefine.” To change or remove a clause, edit the whole session with :edit.

Capitalization matters. A name starting with a capital letter is a variable, so Needs(app, X) :- ... is not a clause — the REPL says so and suggests needs. Predicate and atom names start lowercase.

Running queries

A ?- Goal. query runs against the current program and reveals solutions one at a time, SWI-style:

plg> ?- depends_on(app, X).
  X = auth ;

The trailing ; means “there may be more.” Press ; or space to see the next solution; press any other key (Enter, ., …) to stop. The last solution ends with .:

  X = auth ;
  X = ui .

A goal with no solutions prints false.; a ground goal that succeeds prints true.. Solutions are fetched in batches and the REPL transparently fetches more as you page past one, up to a cap — a query enumerating thousands of answers stays responsive.

Editing and history

The input line is a vim-style editor (the vim-line crate).

  • It starts in insert mode — just type. Press Esc for normal mode (motions h l w b 0 $, edits x dd C …); a dim -- NORMAL -- marker appears bottom-right so you always know the mode. i/a/etc. return to insert.
  • History: Up/Down arrows (any mode) or k/j (normal mode) walk previously submitted lines. Your in-progress line is stashed and restored if you walk back down past the newest entry. History persists across sessions in $HOME/.local/share/plgr_history.
  • Completion: Tab completes the word under the cursor against the builtins, the stdlib, and the predicates you’ve defined this session.

Commands

Commands are single-line and start with :. Most have a short alias.

CommandAliasEffect
:load FILE:lConsult a .pl file — its clauses are added to the session in order.
:list:lsShow the session buffer, one numbered clause per entry.
:edit:eOpen the whole session in $EDITOR; on save it reloads and recompiles.
:save FILEWrite the current session buffer to FILE.
:resetClear the session (start over).
:help:hShow the command summary.
:quit:qExit. (Ctrl-D or Ctrl-C also quit.)

:edit is the way to change or delete clauses: it hands the buffer to your editor as a .pl file, then atomically reloads it. A syntactically broken edit is rejected and your previous session is left intact.

How it works

Defining a clause recompiles the session buffer to a fresh temporary native binary; a ?- query re-invokes that current binary. So the cost model is:

  • Editing the program pays a (small) compile each time.
  • Querying is free of compilation — it just runs the binary you already built.

This is the whole point: the REPL gives an interactive feel by driving the compiler, never by interpreting clauses. A divergent or runaway query is bounded by a timeout and killed without taking the REPL down with it.

Configuration

VariablePurpose
PLGCPath to the plgc compiler, if not on PATH.
VISUAL / EDITOREditor for :edit (tries $VISUAL, then $EDITOR, then vi).
PLG_REPL_TIMEOUTSeconds before a query subprocess is killed (default 10).

History lives at $HOME/.local/share/plgr_history.

LSP & Editor Guide

plgl is a Language Server for the patch-prolog language. It gives any LSP-capable editor real-time diagnostics, completion, hover docs, and go-to-definition over stdio. Unlike plgc, the server links only the parser — no clang, no runtime — so it’s small and starts instantly.

What it provides

  • Diagnostics — syntax/parse errors as you type, plus warnings for calls to predicates defined nowhere, with did-you-mean suggestions (e.g. a typo’d parnet/2 suggests parent/2). These are warnings in the editor; to make them hard build failures, use plgc --deny-undefined (see Compiler Usage).
  • Completion — builtins, the stdlib, and the predicates defined in the current buffer.
  • Hover — a one-line doc for builtins, and the clause head for user-defined predicates.
  • Go-to-definition — jumps to a predicate’s first clause.

Completion, hover, and the undefined-predicate check all draw on the same sources as the compiler (the shared builtin table, the stdlib, the frontend parser), so the editor agrees with what plgc accepts.

Installing plgl

The toolchain isn’t on crates.io yet, so build from source. The server needs only the parser, so you can install just it:

git clone https://git.navicore.tech/navicore/patch-prolog
cd patch-prolog
cargo install --path crates/lsp

Or install the whole toolchain at once:

just install        # plgc, plgr, plgl

Verify with plgl --version. The server speaks LSP over stdio; you don’t run it directly — your editor launches it.

Neovim

The companion plugin patch-prolog.nvim wires up the LSP along with filetype detection (.pl, .plg, .proprolog), syntax highlighting, and indentation. It needs Neovim 0.10+ and plgl on your PATH.

With lazy.nvim:

{
  "navicore/patch-prolog.nvim",
  ft = "prolog",
  opts = {},
}

Or configure it directly:

require("plgl").setup({
  -- cmd = { "plgl" },              -- override the binary path/args if needed
  on_attach = function(client, bufnr)
    local map = vim.keymap.set
    map("n", "gd", vim.lsp.buf.definition, { buffer = bufnr })
    map("n", "K",  vim.lsp.buf.hover,      { buffer = bufnr })
  end,
})

Neovim’s built-in detection maps .pl to Perl; the plugin overrides that to prolog.

Any other LSP client

plgl is a standard stdio language server — any client can drive it with three things:

  • Command: plgl (no arguments; communicates over stdin/stdout).
  • Language / filetype: prolog.
  • Root directory: the project root — .git is the usual marker.

For example, a bare Neovim client without the plugin:

vim.lsp.start({
  name = "plgl",
  cmd = { "plgl" },
  root_dir = vim.fs.root(0, { ".git", "Cargo.toml" }),
})

The same three parameters translate to VS Code, Emacs eglot/lsp-mode, Helix (languages.toml), and others.

Known limitations

  • Hover and go-to-definition work on identifier names. Operator-named builtins (=.., @<, \+) aren’t hoverable — the cursor-word extraction only spans identifier characters.
  • Diagnostics are computed per file from the open buffer; cross-file project resolution is not modeled (a predicate defined in another file the buffer doesn’t include reads as undefined).

Examples

The examples/ directory has runnable programs. Two are walked through here: a dependency graph that shows the language basics, and a schema linter that shows the headline use case — compiling a rule set into a standalone checker.

Build any example and query it:

plgc build examples/deps.pl -o deps
./deps --query "needs(app, X)" --format text

deps.pl — facts, rules, and recursion

depends_on(app, auth).
depends_on(app, ui).
depends_on(auth, crypto).
depends_on(ui, render).
depends_on(render, crypto).

% needs/2 — the transitive closure of depends_on.
needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).

% two components that share a direct dependency.
shares_dep(A, B) :- depends_on(A, D), depends_on(B, D), A \= B.

A handful of depends_on/2 facts describing a build graph, then rules that derive new relations. needs/2 is recursive — what a component depends on, transitively (a direct dependency, or a dependency of one). And shares_dep/2 uses \=/2 to exclude a component matching itself.

Querying derives answers from the rules — and backtracking yields every solution:

./deps --query "depends_on(app, X)" --format text   # direct dependencies
# X = auth
# X = ui

./deps --query "needs(app, X)" --format text         # transitive
# X = auth
# X = ui
# X = crypto
# X = render
# X = crypto

./deps --query "shares_dep(auth, render)" --format text
# true.

needs(app, X) reaches crypto twice — once through auth, once through ui → render — because there are two paths to it; backtracking finds both. shares_dep(auth, render) holds because both depend on crypto.

Use findall/3 to collect every solution into a list:

./deps --query "findall(D, needs(app, D), Ds)" --format text
# D = _0
# Ds = [auth, ui, crypto, render, crypto]

(D is the template — left unbound; Ds is the collected result.)

linting.pl — compiling a checker into a binary

This is the use case patch-prolog is built for: write the rules, compile them to a single native binary, and run it anywhere with no Prolog system installed. Here the “data” (a schema’s fields) is baked in at build time and the rules flag violations.

field(user, id, integer).
field(user, name, string).
field(user, email, string).
field(user, password, string).
field(user, ssn, string).

allowed_type(string).  allowed_type(integer).
allowed_type(boolean). allowed_type(array).

sensitive(ssn).
sensitive(password).

violation(Field, sensitive_field) :-
    field(user, Field, _),
    sensitive(Field).

violation(Field, unknown_type) :-
    field(user, Field, Type),
    \+ allowed_type(Type).
plgc build examples/linting.pl -o linting
./linting --query "violation(Field, Reason)" --format text
# Field = password
# Reason = sensitive_field
# Field = ssn
# Reason = sensitive_field

The real power is the exit code: a CI step doesn’t need to parse output at all — a non-zero exit means violations were found.

if ./linting --query "violation(_, _)" >/dev/null; then
    echo "schema violations found"; exit 1
fi

And for tooling that does want the data, the default JSON format is ready to parse:

./linting --query "violation(F, R)" --format json
# {"count":2,"exhausted":true,"solutions":[{"F":"password","R":"sensitive_field"},
#                                          {"F":"ssn","R":"sensitive_field"}]}

That linting binary is ~700K, depends only on system libc/libm, and needs no Prolog runtime to run — hand it to anyone.

Where to next

Architecture

patch-prolog compiles an ISO-subset Prolog program to a standalone native binary. The compiled binary contains no clause interpreter: predicates are native functions generated as LLVM IR; only primitive services (heap, trail, unification, builtins, query parsing, output) come from a runtime library statically linked into the binary.

Pipeline

rules.pl ──parse──▶ AST ──analyze──▶ codegen ──▶ rules.ll (LLVM IR text)
                                                    │
        libplg_runtime.a (embedded in plgc) ──┐     │
                                              ▼     ▼
                          clang -O3 -g rules.ll -lplg_runtime -lm
                                              │
                                              ▼
                                   rules  (standalone binary)
  • The compiler emits LLVM IR as text — no llvm-sys/inkwell binding, no LLVM version lock-in beyond “clang ≥ 15” (opaque pointers).
  • libplg_runtime.a is built from crates/runtime and embedded into the plgc binary via include_bytes! (set up by crates/compiler/build.rs, which also enforces exact version match between compiler and runtime, and bakes in a content hash of the archive). At link time it is materialized at a content-addressed cache path ($XDG_CACHE_HOME/plgc/runtime-<hash>/, else ~/.cache/plgc/…) that every run of the same build reuses; stale entries are age-swept. HOME-less environments fall back to a private per-link extraction that is removed after linking.
  • -Wl,--gc-sections (Linux) / -Wl,-dead_strip (macOS) strips runtime code the program can’t reach, keeping binaries small.
  • Users of plgc need clang. Users of compiled binaries need nothing.

This is the architecture proven by patch-seq: a compiled binary contains no clause interpreter. The rejected alternative — embedding a serialized clause database inside a shipped interpreter — would put the whole interpreter (and a Rust runtime) into every “compiled” program, and is why this engine generates native code per predicate instead.

Crates

CrateArtifactRole
plg-sharedrlibAtomId + well-known atoms, Term, StringInterner, FirstArgKey, operator table. Linked into BOTH compiler and runtime — zero dependencies, by rule.
plg-frontendrlibTokenizer + operator-precedence parser + ISO error types (ported from v1). Compiler-side only.
plg-runtimestaticlib + rlibThe machine substrate compiled code calls into: heap/trail/choice points, generic unify, ~60 builtins, the minimal goal-only --query parser, JSON/text output, process entry. Ships inside every compiled binary.
plg-compilerbin plgc + rlibCLI, codegen (IR text emission), clang driver, runtime embedding.
plg-lspbin plglLanguage server (diagnostics, completion, hover, goto-definition). A frontend consumer — never links the runtime.
plg-replbin plgrInteractive REPL that drives the compiler; never interprets.

Dependency rule: nothing heavy (clap, serde, …) may enter plg-runtime or plg-shared; every byte there lands in every user binary. (The compiler-side crates — plg-frontend, plg-compiler, plg-lsp, plg-repl — are dev tooling and carry no such constraint.) This is a strong default, not an absolute: a dependency that demonstrably pays for its bytes against the footprint gate, or is scoped to compiler-side crates, can be considered.

Execution model (summary)

  • Each predicate compiles to one LLVM function in continuation-passing style: it receives the Machine pointer, its arguments as tagged 64-bit words, and a success continuation. Solutions are delivered by musttail-calling the continuation; failure is a plain return.
  • Alternatives (untried clauses, disjunction branches) live on a runtime-managed choice-point stack holding retry function pointers plus heap/trail marks. Backtracking = rewind marks + tail-call retry.
  • Cut truncates the choice-point stack to the barrier recorded at predicate entry (stopping at catch frames).
  • All transfers are tail calls and continuation frames are heap-allocated, so Prolog recursion depth never grows the C stack; determinate last-goal recursion is a true jump.
  • Backtracking resets the heap top to the choice point’s mark — memory reclamation without GC.

Runtime --query support

The compiler bakes two global tables into every binary:

  • the atom table (all atom names, in id order), and
  • the predicate registry {functor_id, arity, fn_ptr}.

At startup the runtime rebuilds the name→id map from the atom table, so a runtime-parsed query interns into the same id space (new atoms get fresh ids that correctly unify with nothing in the program). The registry maps a parsed goal to its compiled entry point — this is also how call/1 and findall/3 re-enter compiled code. Predicates declared :- dynamic with no clauses are registered to an always-fail stub (silent-fail linter contract); unknown predicates raise existence_error(procedure, F/A).

Wire contract (compiled binaries)

Preserved exactly from v1 so existing harnesses keep working:

  • --query "goal(X)", --limit N, --format json|text
  • exit 0 no solutions · 1 solutions found · 2 query parse error · 3 runtime error
  • JSON: {"solutions":[{"X": ...}], "count": N, "exhausted": bool}

Build system

justfile is the source of truth; CI (.forgejo/workflows/ci-linux.yml) only calls just ci. Recipe ordering matters: build runs build-runtime before build-compiler so the canonical target/release/libplg_runtime.a is fresh when build.rs embeds it.

Compilation Model

How Prolog’s nondeterministic control becomes native code. This sits between the Architecture overview and the Runtime ABI symbol-level contract: it explains the shape of the generated code — the function signatures and exact symbols are in the Runtime ABI.

The shape: CPS + a choice-point trampoline

Each predicate compiles to one LLVM function with a uniform signature, i32 (ptr %m, i64 %env) (returns 0 = fail/exhausted, 1 = stop). Forward execution is continuation-passing: a goal installs the “rest of the conjunction” as a continuation and musttails into the callee; a solution is delivered by musttailing that continuation.

Because every function shares the one signature, musttail is guaranteed tail-call optimization under the ordinary C calling convention — so recursion and backtracking never grow the C stack.

This is the simplest design that is genuinely compiled. The alternatives:

  • WAM-instruction emission (GNU Prolog’s path) — fastest and most faithful, but needs the full WAM register/environment design up front. A named future escape hatch, not the v1 target.
  • A pure choice-point trampoline (success also routed through the stack) — simple, but bounces once per goal, forfeiting the musttail win on determinate iteration.
  • Interpretation with embedded data — rejected permanently; that was the original patch-prolog’s mistake (a compiler in costume over an interpreter).

Choice points and backtracking

Alternatives — untried clauses, disjunction branches — become entries on a runtime choice-point stack, each snapshotting the trail, heap top, and variable counter, plus a retry function. Failure is ret i32 0: it unwinds the constant-depth tail-call chain to the driver loop (solve.rs), which pops a choice point, rewinds the trail (undo bindings) and the heap top (backtrack-deallocation — memory reclaimed with no GC), and musttails the retry. The driver is the trampoline; depth never grows the C stack.

Clause groups push lazily: a predicate’s entry tries clause 1 and pushes one choice point whose retry is clause 2, which pushes clause 3, and so on. First-argument indexing compiles to a switch on the first argument’s tag/value (the same keys the v1 interpreter used); a single keyed candidate pushes no choice point.

Cut

The barrier is the choice-point stack height recorded at predicate entry. ! compiles to a runtime call that truncates the stack back to that height, stopping at catch frames (cut is opaque to catch). If-then-else and once/1 are local barriers around the condition; \+ G is (G -> fail ; true).

catch / throw

There is no native stack unwinding to fight: continuations are heap frames and transfers are tail calls, so throw/1 is pure data. The runtime walks the choice-point stack for the topmost catch frame whose catcher unifies with the thrown ball, rewinds the trail and heap, and tail-calls the recovery continuation. The step-limit ball is uncatchable — it skips every catch frame and exits with code 3.

What is compiled vs. called

Compiled inline, in the clause code itself:

  • clause dispatch (the first-argument-indexing switch),
  • conjunction sequencing, disjunction branching, and the if-then-else / once / \+ commit machinery (choice points + commit-height slots),
  • cut,
  • atom and immediate-integer constants (their tagged words are baked into the IR).

Runtime calls — the pragmatic scope line; inlining any of these is a named escape hatch if profiles ever demand it (symbols in the Runtime ABI):

  • generic unification, including head unification — one shared bind/deref/trail implementation so compiled and runtime paths can’t diverge,
  • is/2 and arithmetic comparison (the v1 evaluator, with its exact error strings),
  • the deterministic builtin library, plus findall/3, catch/3, throw/1, the metacall path, and between/3.

Worked example

ancestor(X, Z) :- parent(X, Z).
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
  • ancestor/2’s entry pushes one choice point (retry = clause 2’s body) and musttails into parent/2 with the caller’s continuation — clause 1’s body goal is in tail position.
  • Clause 2’s retry allocates a fresh Y, builds a continuation frame for ancestor(Y, Z), and musttails parent(X, Y) with it.
  • That continuation musttails ancestor/2 again — the recursive call is a true jump, so a million-deep ancestor chain runs in constant C stack.
  • When everything is exhausted the outermost call returns 0; the driver reports the solution count and exit code.

Guarantees and guards

riskguard
musttail emission bug → stack growthcodegen post-check (every musttail is followed by a bare ret) + a deep-recursion integration test
compiled vs. generic unify divergenceshared primitives + differential tests against the v1 interpreter oracle
atom-id divergence compile ↔ runtimea single emitted atom table + round-trip unit tests
runaway heap in determinate queriesthe uncatchable step limit (copying GC is a future escape hatch)

Runtime ABI

The C-ABI surface between plgc-generated LLVM IR and libplg_runtime.a. Rust side: crates/runtime/src/abi.rs + entry.rs; IR side: the RUNTIME_DECLS block in crates/compiler/src/codegen/program.rs. Those two files are the contract; this document explains it. The build.rs version-sync guarantees an embedded runtime always matches the compiler that emits calls into it.

The uniform function signature

Every compiled predicate, clause, chain (“try next clause”), continuation, and retry function — and every runtime-provided continuation — shares one C-ABI prototype:

i32 (ptr %machine, i64 %env)

Returns 0 = fail/exhausted (the driver loop backtracks), 1 = stop (solution limit reached / enumeration finished early). The uniform prototype is what makes musttail transfers guaranteed-TCO under the C calling convention (same trick protobuf’s parser uses), and it lets Rust functions (the solve driver, the capture continuation, ,/2 at query level) participate at chain boundaries via plain function pointers.

Function pointers cross the boundary as i64 (ptrtoint/inttoptr in IR, transmute in Rust); frames and environments are heap indices, never raw pointers — the heap is a growable Vec<u64> and indices survive reallocation and backtracking truncation.

Control flow contract

  • Forward execution: generated code installs a continuation with plg_rt_set_k and musttails into the callee’s entry function. Solution delivery is a musttail into the installed k.
  • Failure: ret i32 0 unwinds the (constant-depth) tail-call chain to the runtime driver loop (solve.rs), which pops a choice point, rewinds trail+heap to its marks, and calls its retry function. The driver is the trampoline; the C stack never grows with recursion or backtracking depth.
  • Choice points: plg_rt_push_cp(m, retry_fn, env) snapshots trail/heap marks at push time. Predicate entries push one CP per remaining clause group lazily (entry pushes t1; t1 pushes t2; …).
  • Frames (predicate: [args…, k_fn, k_env]; clause body: [k_fn, k_env, vars…]) are allocated with plg_rt_frame_alloc before any CP that must see them — heap rewind frees everything allocated after the CP, so ordering is the correctness rule.

Generated global data

@plg_atom_strs    = [N x ptr]              ; atom names, id order
@plg_registry     = [K x { i32, i32, ptr }] ; functor, arity, entry fn

plg_rt_init re-interns the atom names in order, reproducing the compiler’s exact id space (both sides pre-seed the same well-known atoms); query atoms unseen at compile time get fresh ids ≥ N and correctly unify with nothing. The registry is sorted by (functor, arity) for binary search; :- dynamic predicates with no clauses point at @plg_rt_pred_fail (silent-fail contract).

Term words (tag in low 3 bits)

REF=0 (heap idx, unbound = self-ref) · ATOM=1 (id) · INT=2 (immediate i61; integers beyond the i61 range are boxed as TAG_BIG, below) · STR=3 (idx → [functor:u32|arity:u32][args…]) · LST=4 (idx → [head][tail]) · FLT=5 (idx → f64 bits, to_bits equality) · TAG_BIG=6 (idx → boxed i64, for integers beyond the i61 immediate).

Atoms and immediate integers never cross the ABI as calls: codegen emits their tagged words as compile-time constants ((id<<3)|1, (n<<3)|2 — mirrored in cell.rs and term_emit.rs, covered by tests on both sides).

The plg_rt_* surface

symbolsignature (IR)purpose
plg_rt_initptr (ptr, i32, ptr, i32)build Machine from atom table + registry
plg_rt_maini32 (ptr, i32, ptr)argv parse, query parse, solve, output, exit code
plg_rt_stepi32 (ptr)step counter; 0 ⇒ uncatchable resource_error set
plg_rt_new_vari64 (ptr)fresh unbound REF cell
plg_rt_frame_alloci64 (ptr, i32)raw continuation-frame cells
plg_rt_frame_set/getvoid/i64 (ptr, i64, i32[, i64])frame slot access
plg_rt_areg_get/seti64/void (ptr, i32[, i64])argument registers
plg_rt_breg_setvoid (ptr, i32, i64)build registers for put_struct
plg_rt_put_structi64 (ptr, i32, i32)compound from breg[0..arity]
plg_rt_put_listi64 (ptr, i64, i64)cons cell
plg_rt_put_floati64 (ptr, i64)boxed f64 (bits)
plg_rt_put_bigi64 (ptr, i64)box an i64 beyond the i61 immediate (TAG_BIG)
plg_rt_unifyi32 (ptr, i64, i64)generic trail-recording unify
plg_rt_set_k / plg_rt_k_fn / plg_rt_k_envcontinuation register
plg_rt_push_cpvoid (ptr, i64, i64)choice point (retry fn + env)
plg_rt_cp_topi64 (ptr)choice-point height (cut barriers, commit slots)
plg_rt_cutvoid (ptr, i64)truncate cps to height (stops at CATCH frames — catch is opaque to cut)
plg_rt_derefi64 (ptr, i64)first-arg indexing dispatch
plg_rt_str_keyi64 (ptr, i64)STR → packed functor cell, else u64::MAX
plg_rt_pred_faili32 (ptr, i64)always-fail body (dynamic stubs)
plg_rt_existence_errori32 (ptr, i32, i32)sets v1-format error, returns 0

Inline comparison and arithmetic builtins

symbolsignature (IR)purpose
plg_rt_b_isi32 (ptr, i64, i64)is/2 (v1 eval semantics, exact error strings)
plg_rt_b_arith_cmpi32 (ptr, i32, i64, i64)op 0..5 = < > =< >= =:= =\=
plg_rt_b_neqi32 (ptr, i64, i64)\= (trail-rewinding attempt)
plg_rt_b_term_cmpi32 (ptr, i32, i64, i64)op 0..5 = == \== @< @> @=< @>=
plg_rt_b_comparei32 (ptr, i64, i64, i64)compare/3

Op-code tables live in codegen lower.rs and runtime control.rs and must stay in sync. Control constructs (;, ->, \+, once) have NO runtime symbols for compiled code — they compile to choice points and commit slots (body.rs); the runtime’s control.rs mirrors the same shapes only for goals built at runtime (queries, metacalls), walking goal terms, never clauses.

Errors, metacall, and control builtins

Structured errors. Machine::error carries a relocatable ball (copyterm::TermBuf) plus its pre-rendered v1-format message. Balls survive heap rewinding; solve::drive unwinds them to the nearest catch frame (a CpKind::Catch choice point whose env holds [catcher, recovery, k_fn, k_env]). plg_rt_cut stops at catch frames (catch is opaque to cut). The step-limit ball remains uncatchable.

Control builtins (the installed k is the continuation; the runtime walks goal TERMS only — never clauses):

symbolsignature (IR)purpose
plg_rt_metacalli32 (ptr, i64)dispatch a runtime goal term (var goals, call/N)
plg_rt_b_catch_3i32 (ptr, i64, i64, i64)push catch frame, run goal
plg_rt_b_throw_1i32 (ptr, i64)snapshot ball, raise
plg_rt_b_findall_3i32 (ptr, i64, i64, i64)bounded sub-search via the shared driver
plg_rt_pred_between_3i32 (ptr, i64)nondet builtin, uniform predicate signature (A-registers)

C-stack note: plg_rt_metacall / plg_rt_b_catch_3 / plg_rt_b_findall_3 hold one Rust frame while their goal runs, so nesting depth of those constructs (not Prolog recursion) consumes C stack — bounded by the step limit; full CPS compilation of metacalls is a named escape hatch.

Deterministic builtins plg_rt_b_<name>_<arity> — one symbol per entry in codegen’s DET_BUILTINS table (lower.rs), declarations generated from the same table. The runtime’s query-side dispatch (control.rs det_builtin) mirrors it; the differential corpus guards the pair. Vocabulary = exactly v1’s: type checks, functor/arg/univ/copy_term, atom/number conversions, msort/sort, succ/plus, unify_with_occurs_check, write/writeln/nl.

Stdlib. crates/compiler/stdlib.pl (v1’s file, verbatim) is parsed before user sources in every plgc build — member/append/length/last/ reverse/nth0/nth1 are ordinary compiled predicates in every binary, exactly like v1’s embedding.

Environment

PLG_MAX_STEPS overrides the default 10,000 step ceiling (documented extension over v1, which hardcoded it; the CLI contract is unchanged).