patch-prolog
A standalone Prolog compiler. plgc compiles an ISO-subset Prolog
program to a single native binary with zero runtime dependencies — no Rust
toolchain, no interpreter, no serialized clause database. Predicates become
native code via LLVM.
plgc build rules.pl -o my-linter # ~676K standalone binary
./my-linter --query "violation(Field, Reason)"
echo $? # 0 = no solutions (clean), 1 = solutions found
The compiled binary contains no clause interpreter — control flow is generated per predicate as native code, and only primitive services (heap, trail, unification, builtins, query parsing, output) come from a small runtime statically linked in. The result runs anywhere with libc/libm and nothing else.
The toolchain
Three binaries share the engine:
| Binary | What it is |
|---|---|
plgc | The compiler — build, run, and check your Prolog. |
plgr | An interactive REPL that drives the compiler (never interprets). |
plgl | A Language Server (diagnostics, completion, hover, go-to-definition). |
Where to go next
- Installation — install the toolchain and compile your first program.
- Compiler Usage —
plgccommands, the query wire-contract, and exit codes. - REPL Guide, Language Guide, and the Builtin & Stdlib Reference follow as those pages land.
Installation
patch-prolog is built from source. The toolchain is three binaries —
plgc (compiler), plgr (REPL), and plgl (language server) — installed
together.
Requirements
| To… | You need |
|---|---|
| build the toolchain | Rust (see rust-toolchain.toml) and just |
use plgc | clang ≥ 15 — it links the generated code. No Rust required. |
| run a compiled binary | nothing — only system libc/libm |
That last row is the point: the programs plgc produces are standalone.
You hand someone the binary and it runs, with no Prolog system installed.
Build and install
just install
This builds the runtime and compiler, then installs all three binaries
(plgc, plgr, plgl) with cargo install. To build in place without
installing:
just build # produces target/release/plgc
Shell completions for plgc:
plgc completions zsh > ~/.zfunc/_plgc # or: bash | fish | elvish | powershell
Your first program
Create deps.pl — a small build-dependency graph:
depends_on(app, auth).
depends_on(app, ui).
depends_on(auth, crypto).
depends_on(ui, render).
depends_on(render, crypto).
% needs/2 is the transitive closure of depends_on.
needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).
Compile it to a native binary and query it — needs/2 walks the graph
transitively:
plgc build deps.pl -o deps
./deps --query "needs(app, X)" --format text
# X = auth
# X = ui
# X = crypto
# X = render
# X = crypto
Or skip the explicit build during development — plgc run compiles to a
temporary binary and executes it in one step (it still compiles; it never
interprets):
plgc run deps.pl --query "needs(app, X)"
To explore interactively, start the REPL:
plgr
plg> depends_on(app, auth).
defined. (1 in session)
plg> depends_on(app, ui).
defined. (2 in session)
plg> ?- depends_on(app, X).
X = auth ;
X = ui .
(At the ; prompt, press ; for the next solution or any other key to
stop.)
From here, see Compiler Usage for the full plgc
surface and the query wire-contract.
Compiler Usage
plgc has four commands:
plgc build Compile .pl sources to a native executable
plgc run Compile to a temp binary and run it immediately
plgc check Parse and statically check sources without compiling
plgc completions Generate shell completion scripts
Run plgc <command> --help for the exact flags.
plgc build
plgc build [OPTIONS] [INPUTS]...
Compiles one or more .pl files (concatenated in order) into a single
native binary.
| Flag | Effect |
|---|---|
-o, --output <PATH> | Output path (default: the stem of the first input). |
--debug | Build at -O0 with debug-friendly output and DWARF. |
--keep-ir | Keep the generated .ll LLVM IR next to the output, for inspection. |
--deny-undefined | Treat calls to undefined predicates as errors, not warnings (see below). |
--target <TARGET> | native (default), wasm32-wasi — a standalone .wasm CLI module (WASM Target) — or worker — a V8-isolate reactor for the edge (WASM Worker). |
plgc build rules.pl -o my-linter
plgc build base.pl rules.pl -o app # inputs concatenated in order
plgc build rules.pl --target wasm32-wasi -o rules.wasm # needs a wasm-enabled plgc
plgc build rules.pl --target worker # reactor .wasm + deploy glue (edge)
plgc run
plgc run [OPTIONS] --query <QUERY> [INPUTS]...
Compiles to a temporary binary and runs it immediately — the fast edit-test loop. It compiles every time; it never falls back to an interpreter, so what you test is what you ship.
| Flag | Effect |
|---|---|
--query <GOAL> | The goal to solve, e.g. "needs(app, X)". Required. |
--limit <N> | Maximum number of solutions to report. |
--format <json|text> | Output format. Default: text (human-readable). |
--deny-undefined | As for build. |
plgc run deps.pl --query "needs(app, X)"
plgc run deps.pl --query "needs(app, X)" --limit 1 --format json
plgc check
plgc check [OPTIONS] [INPUTS]...
Parses and statically checks sources without producing a binary — fast
feedback for editors and CI. Reports syntax errors and, by default,
warns about calls to predicates that are defined nowhere. --deny-undefined
turns those warnings into a non-zero exit.
Querying a compiled binary
A binary built by plgc answers queries directly — this is the same
contract plgc run and the REPL use under the hood:
./my-binary --query "goal(X)" [--limit N] [--format json|text]
--format textprints one variable binding per line (X = bob), the form you read in a terminal.--format jsonprints a single structured object — and is the binary’s default, because the typical deployment is a program calling the binary and parsing the result:
{"count":2,"exhausted":true,"solutions":[{"X":"bob"},{"X":"ann"}]}
exhausted is false when more solutions exist beyond --limit.
Exit codes
The exit code is the answer for shell pipelines — you often don’t need to read stdout at all:
| Code | Meaning |
|---|---|
0 | No solutions (the goal failed). |
1 | One or more solutions found. |
2 | The --query could not be parsed. |
3 | A runtime error (e.g. an uncaught exception, the step limit). |
if ./my-linter --query "violation(_, _)" >/dev/null; then
echo "lint violations found"; exit 1
fi
Undefined predicates
Because patch-prolog compiles the whole program, it knows every defined
predicate at build time. A direct call to a predicate that is defined
nowhere can never succeed, so build, run, and check warn about it
by default (and plgl shows a squiggle). This stays ISO-faithful: such a
call still compiles and raises a catchable existence_error if reached.
Pass --deny-undefined to promote the warning to a hard error (no binary,
non-zero exit) when you want correctness enforced over leniency.
Script mode
A .pl file with a plgc shebang behaves as its own compiled binary:
plgc compiles it to a temp binary and execs it, passing along the
arguments after the file. You invoke it with the same --query contract as
any compiled binary.
#!/usr/bin/env plgc
main :- write(hello), nl.
chmod +x hello.pl
./hello.pl --query "main"
# hello
WASM Target
plgc can compile a Prolog program to a standalone wasm32-wasi module
instead of a native binary. The module preserves the exact --query wire
contract, so anything that drives a compiled binary drives the .wasm
identically — only the runner changes (a WASI engine instead of the OS).
This is Tier 1: a standalone WASI command that runs on wasmtime,
wasmer, Wasmtime-based platforms (Spin, Fermyon), and any WASI runtime. The
deployed .wasm needs only a wasm engine — no Rust, no clang, no plgc.
Requirements
The build side needs a wasm-capable plgc plus the Rust toolchain’s LLVM
tools (no wasi-sdk required):
rustup target add wasm32-wasip1 # std + bundled wasi-libc
rustup component add llvm-tools-preview # llc + wasm-ld
Install a wasm-capable plgc (this embeds the wasm runtime archive):
just install-wasm
# equivalently:
# just build-runtime-wasm
# cargo install --path crates/compiler --features wasm --force
The default cargo install / just install is unchanged — wasm support is
opt-in at build time and adds nothing to a native-only plgc.
To run a module you need a WASI engine with the tail-call proposal —
recent wasmtime (≥ ~0.40), recent wasmer, or any current WASI runtime
(brew install wasmtime, or see wasmtime.dev). The constant-stack guarantee
relies on wasm return_call; an engine without tail-call support rejects the
module at instantiation with a clear tail-call error.
Usage
plgc build --target wasm32-wasi deps.pl -o deps.wasm
wasmtime run deps.wasm --query "needs(app, X)" --format json
--query, --limit, --format, the JSON shape, and the exit codes
(0 none · 1 solutions · 2 parse error · 3 runtime error) are all
identical to a native build. If -o is omitted, the output defaults to the
input stem with a .wasm extension.
How it works
The same LLVM IR plgc emits for native is retargeted to wasm32-wasi:
llc -mattr=+tail-calllowers the engine’smusttailcalls to wasmreturn_call/return_call_indirect, so Prolog recursion and backtracking run in constant stack exactly as they do natively. If the tail-call feature is ever missing,llcfails at build time — it never silently emits a stack-growing call.wasm-ldlinks the program object, thewasm32-wasip1runtime archive, and the target’s self-contained wasi-libc into a WASI command module.
The whole toolchain cost lands on the build side. The artifact you ship is as self-contained as a native binary — it answers queries with only a wasm engine present.
Tuning
The metacall depth bound (PLG_METACALL_DEPTH) and step limit
(PLG_MAX_STEPS) work the same under WASI as natively — but WASI does not
inherit the host environment, so pass them explicitly to the engine:
wasmtime run --env PLG_MAX_STEPS=100000000 prog.wasm --query "..."
A wasm engine’s default stack is smaller than a native one (~1 MB vs ~8 MB), so
for programs that lean on deeply nested runtime-walked metacalls you may want
a lower PLG_METACALL_DEPTH. Ordinary recursion and call/1 tail recursion are
constant-stack and unaffected (a 1,000,000-deep call/1 runs under a 1 MB wasm
stack).
Wasm builds optimize at LLVM -O2 (--debug drops to -O0); the -O3 the
native path uses buys little for this IR.
Edge / serverless (Tier 2)
Running inside V8 isolates (Cloudflare Workers and similar) needs a different
build than Tier 1 — wasm32-unknown-unknown with no WASI and a host-call entry
instead of a CLI. That is Tier 2, available via --target worker: see the
WASM Worker guide for usage, the buffer ABI, tuning, and a
Cloudflare deployment tutorial. Tier 1 (this page) covers WASI runtimes;
Tier 2 covers the edge.
WASM Worker (edge / V8 isolates)
plgc can compile a Prolog program to a reactor wasm module that runs
inside a V8 isolate — Cloudflare Workers, workerd, or Node — instead of a
process. This is Tier 2: there is no main, no argv, no stdio. The module
exports a small buffer ABI and a JS host calls it over linear memory; a query
becomes a function call inside a warm, globally-replicated isolate.
This is the scale-to-zero / global edge shape. The compiled artifact is the same no-VM, no-interpreter engine as a native build — small enough to clear the single-digit-MB module limits edge isolates impose, and stateless by design, so it fits the isolate model by construction. For WASI runtimes (wasmtime, Spin, …), use Tier 1 instead.
The answers are byte-identical to a native build, including error text — the reactor runs the same engine over the same JSON wire shape, only the transport differs (a memory buffer instead of stdout).
Requirements
Build side — a wasm-capable plgc plus the Rust LLVM tools (no wasi-sdk):
rustup target add wasm32-unknown-unknown # the reactor target
rustup component add llvm-tools-preview # llc + wasm-ld
just install-wasm # plgc with both wasm archives embedded
just install-wasm builds and embeds both wasm runtime archives (Tier 1
wasm32-wasip1 and Tier 2 wasm32-unknown-unknown) behind the one wasm
feature, so a single install covers both targets. The default cargo install /
just install is unchanged — wasm support is opt-in at build time.
Run side — anything with a V8 isolate and wasm tail calls:
- Cloudflare Workers (deploy with
wrangler), or localworkerd(npm i -g workerd) for the dev loop, or - Node ≥ 20 for scripting/testing the module directly.
The constant-stack guarantee relies on wasm return_call; V8 ships tail calls
and the module is rejected at instantiation by any engine that lacks them — never
a silent stack overflow.
Usage
plgc build --target worker deps.pl
This emits deps.worker.wasm plus four overrideable scaffolding files next
to it:
| File | Role |
|---|---|
deps.worker.wasm | the reactor module (the real artifact, always regenerated) |
reactor.mjs | the buffer-ABI marshalling (runQuery / assertExports) |
worker.js | a Cloudflare/workerd fetch handler that calls reactor.mjs |
wrangler.toml | Cloudflare deploy config |
config.capnp | local workerd serve config |
The glue files are written only if absent, so a rebuild refreshes the
.wasm but never clobbers your edits — delete one to regenerate it. --target wasm32-unknown-unknown is accepted as a synonym for worker. If -o is
omitted the output defaults to the input stem with a .worker.wasm extension
(distinct from Tier 1’s .wasm, so building both targets doesn’t overwrite).
Run it locally
just wasm-worker-serve examples/deps.pl
# compiles + serves on workerd, then:
curl 'http://localhost:8080/?query=needs(app, X)'
# {"count":5,"exhausted":true,"output":"","solutions":[{"X":"auth"},…]}
The handler reads the goal from ?query= or, failing that, the POST body:
curl -X POST --data 'depends_on(app, D)' http://localhost:8080/
Response shape
The JSON is the v1 wire shape with one Tier-2 addition — an output field
carrying anything the program wrote with write/1/writeln/1/nl/0 (an
isolate has no stdout, so output is captured losslessly rather than lost):
{"count":1,"exhausted":true,"output":"","solutions":[{"X":"auth"}]}
The field is always present (empty when nothing was written), so a host can rely
on its shape. Strip it to get bytes identical to a native --query --format json. Errors render exactly as native, as {"error":"…"}.
The buffer ABI
If you replace the glue, the contract the module exports is:
plg_init() build the Machine once per isolate
plg_rt_alloc(len) -> ptr host writes the query bytes here
plg_rt_run_query(ptr, len, limit, steps, depth) -> (len<<32)|ptr JSON result
plg_rt_free(ptr, len) host frees the query / result buffer
reactor.mjs is the single source of this dance and is imported by both the
deployed worker.js and the test harness, so the two can’t drift. Free every
buffer by its requested length (the allocator is length-keyed), and read the
result view after plg_rt_run_query returns — solving can grow linear memory
and detach an earlier view.
Concurrency contract
One in-flight query per isolate. The module keeps a single program Machine;
a V8 isolate is single-threaded, but a Worker can interleave async tasks, so the
host must not start a second query before the first returns. The emitted
worker.js satisfies this automatically: its runQuery is fully synchronous
(the only await is reading the POST body, before the ABI call), so
concurrent fetch invocations serialize on the event loop.
Tuning
plg_rt_run_query’s last three arguments bound a query before the platform’s
own CPU/wall limit cuts it off — they mirror the native knobs:
| Arg | Native equivalent | 0 means |
|---|---|---|
limit | --limit | unbounded solutions |
steps | PLG_MAX_STEPS | module default (10,000) |
depth | PLG_METACALL_DEPTH | module default (1,000) |
The emitted worker.js calls runQuery(reactor(), query) with the module
defaults. To raise the step ceiling for heavier programs, pass options —
runQuery(reactor(), query, { stepLimit: 100_000_000n }) (the step limit is an
i64, hence the BigInt). Depth matters more here than natively: a wasm
isolate’s stack (~1 MB) is far smaller than a native one (~8 MB). Ordinary
recursion and call/1 tail recursion are constant-stack regardless — a
1,000,000-deep call/1 returns in a V8 isolate via return_call.
Deploy to Cloudflare
The emitted wrangler.toml makes the module a deployable Worker. From the
directory holding the build output:
plgc build --target worker deps.pl # emits deps.worker.wasm + glue
npx wrangler deploy # uploads the Worker + wasm module
wrangler bundles worker.js and reactor.mjs and uploads the .wasm as a
compiled module (the CompiledWasm rule in wrangler.toml). Then query the
edge:
curl 'https://deps.<your-subdomain>.workers.dev/?query=needs(app, X)'
The isolate cold-starts once (atom table + registry built at plg_init), then
every request is a warm in-memory call — no process fork, no per-request parse
of the program. Edit the generated name/compatibility_date in
wrangler.toml and the routing/limits in worker.js to taste; both are
scaffolding, not regenerated.
Footprint
A reactor module is a few MB (the deps example is ~1.7 MB raw, well within the
Workers budget and compressing further). Large fact tables inflate the module’s
.rodata; a program whose data pushes past the isolate size cap belongs on Tier
1 / containers instead. The toolchain cost is entirely build-side — the deployed
.wasm runs with only a wasm engine present.
How it works
The reactor reuses the same LLVM IR as native and Tier 1, retargeted to
wasm32-unknown-unknown:
- The module emits no
main— instead an exportedplg_initbuilds the Machine and hands it to the runtime; the host drives the buffer ABI exports. llc -mattr=+tail-calllowers the engine’smusttailchains toreturn_call/return_call_indirect, so recursion and backtracking run in constant stack inside the isolate. A missing tail-call feature is a loud build/load error, never a silent overflow.wasm-ld --no-entrylinks the program object against thewasm32-unknown-unknownruntime archive with no crt and no libc — the reactor touches no WASI, so the resulting module imports nothing and instantiates in a bare isolate.
Language Guide
patch-prolog implements an ISO subset of Prolog. A program is a set of facts and rules; you pose queries and the engine searches for proofs, binding variables along the way. This guide teaches the language as this engine implements it — including the places where it makes a deliberate, documented choice. For the complete operator table see the Operators reference; for every builtin and stdlib predicate, the Builtin & Stdlib Reference.
One thing to keep in mind throughout: this is a compiler. Your whole
program is known at build time, so there is no assert/retract — the
knowledge base is immutable. You change a program by editing its source and
recompiling (the REPL’s :edit automates that loop).
Terms
Everything is a term. There are only a few kinds:
| Kind | Examples |
|---|---|
| Atom | tom, [], 'a quoted atom', + |
| Number | 42, -7, 3.14 |
| Variable | X, Dep, _, _Rest |
| Compound | depends_on(app, auth), point(1, 2), -(3) |
A variable starts with an uppercase letter or _; everything else
starting lowercase (or quoted) is an atom. This is the single most common
beginner trip-up: Dep is a variable, depends_on is a predicate name.
A list is sugar for nested compounds: [a, b, c] is '.'(a, '.'(b, '.'(c, []))), and [] is the empty-list atom. The [H|T] notation splits
a list into its head and tail.
Operators are sugar too: 2 + 3 is the compound +(2, 3), H = [X|T] is
=(H, '.'(X, T)). Unification and resolution always see the compound form.
Facts, rules, and clauses
A fact asserts something true:
depends_on(app, auth).
depends_on(auth, crypto).
A rule says a head is true if its body is:
needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).
Read :- as “if” and , as “and.” A predicate is identified by name and
arity (needs/2), and its clauses are tried top to bottom. Multiple
clauses are alternatives — here, needs is a direct dependency or a
dependency reached through another.
A directive runs at load/build time:
:- dynamic(extra/1).
Queries and the search
A query asks the engine to prove a goal:
./deps --query "needs(app, X)"
The engine searches for variable bindings that make the goal true, yielding
solutions one at a time. With depends_on/2 and needs/2 above,
needs(app, X) yields X = auth then X = crypto. If a goal can be proved
more than one way, backtracking explores each alternative; if it can’t
be proved at all, the query fails (no solutions).
Unification
=/2 makes two terms equal by binding variables:
?- point(X, 2) = point(1, Y).
% X = 1, Y = 2
No occurs check (by design).
X = f(X)succeeds, producing a cyclic term — matching ISO’s default. Useunify_with_occurs_check/2when you need the safe version that fails instead.
Related: \=/2 (not unifiable), and ==/2 / \==/2, which test
structural identity without binding — X == Y is false for two distinct
unbound variables, where X = Y would succeed.
Backtracking
When a goal has several solutions, the engine produces them lazily:
likes(sam, pizza).
likes(sam, sushi).
likes(ann, pizza).
?- likes(sam, Food).
Food = pizza ;
Food = sushi .
Composed goals backtrack jointly — likes(P, pizza), likes(P, sushi)
searches for a P satisfying both.
Control constructs
| Form | Meaning |
|---|---|
A, B | Conjunction — prove A, then B. |
(A ; B) | Disjunction — A, or B on backtracking. |
(Cond -> Then ; Else) | If-then-else — if Cond succeeds, commit and prove Then, else Else. |
\+ Goal | Negation as failure — succeeds iff Goal cannot be proved. |
once(Goal) | The first solution of Goal only. |
call(Goal) / call(G, Extra...) | Call a goal built at runtime. |
The cut ! commits to the choices made so far in the current clause,
discarding remaining alternatives. patch-prolog follows ISO: the cut is
transparent through ,, ;, and ->. So with m(1). m(2). m(3).:
cut_demo(X) :- (m(X), X > 1, ! ; X = fallback).
?- cut_demo(X). yields exactly X = 2 — the cut prunes both the rest of
m/1 and the ; X = fallback branch. (The cut is opaque inside
\+, once, the condition of ->, and catch/3.)
Arithmetic
Arithmetic is explicit: is/2 evaluates its right side, and the comparison
operators evaluate both sides.
?- X is 2 + 3 * 4. % X = 14
?- 6 =:= 2 * 3. % true (=:= compares values; = would unify terms)
Division has several forms, and the defaults follow ISO — worth memorizing:
| Op | Result | Example |
|---|---|---|
/ | always a float (ISO 9.1.4) | 10 / 3 → 3.333… |
// | integer, truncates toward zero | -7 // 2 → -3 |
div | integer, floors toward −∞ | -7 div 2 → -4 |
mod | remainder, sign of divisor | -7 mod 3 → 2 |
rem | remainder, sign of dividend | -7 rem 3 → -1 |
Other evaluable functions include - abs/1 min/2 max/2 sign/1, the
bitwise operators (/\ \/ xor << >> \), and powers (** float, ^
integer-or-float). Use =:= =\= < > =< >= to compare.
Arithmetic is safe, not silent. Integer overflow raises
evaluation_error(int_overflow)— it never wraps around. ANaN/infinite float result, or division by zero, is likewise a runtime error.
Lists
Lists use [H|T] to split head from tail, and the stdlib (compiled into
every binary) provides the staples:
?- append([1,2], [3,4], L). % L = [1,2,3,4]
?- member(X, [a,b,c]). % X = a ; X = b ; X = c
?- length([a,b,c], N). % N = 3
Also available: reverse/2, last/2, nth0/3, nth1/3. To collect
every solution of a goal into a list, use findall/3; to enumerate a
range of integers, between/3:
?- findall(D, needs(app, D), Ds). % Ds = [auth, crypto]
?- between(1, 3, X). % X = 1 ; X = 2 ; X = 3
Inspecting and building terms
| Predicate | Purpose |
|---|---|
functor(T, Name, Arity) | Term’s functor and arity, or build one. |
arg(N, T, A) | The Nth argument of a compound. |
T =.. List | “Univ” — decompose/build a term as `[Functor |
copy_term(T, Copy) | A fresh copy with renamed variables. |
Type tests classify a term: var/1, nonvar/1, atom/1, number/1,
integer/1, float/1, compound/1, is_list/1.
Standard order of terms
Terms have a total order — Variables < Numbers < Atoms < Compounds (and
a float sorts just before an equal-valued integer). Compare with @<,
@>, @=<, @>= or compare/3, and sort with sort/2 (dedupes) or
msort/2 (keeps duplicates).
Exceptions
throw/1 raises a term; catch(Goal, Catcher, Recovery) runs Recovery if
Goal throws something that unifies with Catcher. Errors use the ISO
taxonomy — error(Formal, Context) where Formal is type_error/2,
existence_error/2, evaluation_error/1, instantiation_error, and so on.
Calling a predicate that is defined nowhere raises
existence_error(procedure, Name/Arity) — catchable, per ISO. Because the
compiler sees the whole program, it also warns about such calls at build
time (see Compiler Usage and --deny-undefined).
What this subset leaves out
These are deliberate omissions, not gaps to be filled:
- No
assert/retract. The knowledge base is immutable — compiled at build time.:- dynamic(F/A)only makes an undefined predicate fail silently instead of raisingexistence_error; it does not enable runtime clause mutation. - No
op/3. The operator table is fixed (see Operators). - No postfix operators, no module system, no DCG, no
bagof/setof.
Safety guarantees
Every compiled program is bounded: a runaway computation hits a step
limit and stops with resource_error(steps) — deliberately uncatchable,
so a buggy clause can’t trap its own timeout. Combined with checked
arithmetic and no runtime file I/O, a compiled binary is safe to run on
untrusted input.
Operators
patch-prolog supports the standard prefix and infix operators of ISO Prolog. Postfix is not supported — see Postfix is unsupported, by design below.
op/3 (user-defined operators) is also not supported. The operator table below
is the complete fixed set the engine recognizes.
Canonical form is always available
Every operator term is sugar for a compound. Anything you can write with an operator, you can write in functor notation:
| Operator form | Canonical form |
|---|---|
2 + 3 | +(2, 3) |
- a | -(a) |
\+ Goal | \+(Goal) |
H = [X|T] | =(H, '.'(X, T)) |
X is 6 /\ 3 | is(X, /\(6, 3)) |
The arithmetic evaluator, unification, and clause resolution see the compound form — operator syntax is purely a reader-level convenience. If you ever hit a parse-precedence surprise, the canonical form will always parse unambiguously.
Prefix operators
| Op | Type | Prec | Meaning |
|---|---|---|---|
- | fy | 200 | Arithmetic negation. Folds for literal numbers: -3 reads as the integer -3, not the compound -(3). |
+ | fy | 200 | Identity / explicit positive sign. Folds for literal numbers: +3 reads as 3. For non-literals, builds +(X). |
\ | fy | 200 | Bitwise complement (evaluable in is/2). No folding — \3 reads as the compound \(3). |
\+ | fy | 900 | Negation as failure (control construct). |
! (cut) is a 0-arity atom, not a prefix operator — it does not take an argument.
The clause and directive markers :- and ?- are recognized by the parser but
are not user-redefinable operators:
Head :- Body.— clause separator (infix at the top level of a program).:- Directive.— directive prefix (e.g.:- dynamic(foo/1).).?- Goal.— query prefix (optional, accepted by the query reader).
Infix operators
Listed from loosest binding to tightest. Associativity uses the standard
fixity codes (xfx non-associative, xfy right-associative, yfx
left-associative).
| Op | Type | Prec | Meaning |
|---|---|---|---|
; | xfy | 1100 | Disjunction (also the if-then-else outer in (C -> T ; E)). |
-> | xfy | 1050 | If-then. Only meaningful inside a parenthesized goal expression. |
, | xfy | 1000 | Conjunction. |
= | xfx | 700 | Unification. |
\= | xfx | 700 | Not-unifiable. |
== | xfx | 700 | Term identity (no unification). |
\== | xfx | 700 | Term non-identity. |
is | xfx | 700 | Arithmetic evaluation. |
< | xfx | 700 | Arithmetic less-than. |
> | xfx | 700 | Arithmetic greater-than. |
=< | xfx | 700 | Arithmetic less-or-equal. (Note: =<, not <=.) |
>= | xfx | 700 | Arithmetic greater-or-equal. |
=:= | xfx | 700 | Arithmetic equality. |
=\= | xfx | 700 | Arithmetic inequality. |
@< | xfx | 700 | Standard term ordering: less. |
@> | xfx | 700 | Standard term ordering: greater. |
@=< | xfx | 700 | Standard term ordering: less-or-equal. |
@>= | xfx | 700 | Standard term ordering: greater-or-equal. |
=.. | xfx | 700 | Univ: term ↔ list decomposition. |
+ | yfx | 500 | Addition. |
- | yfx | 500 | Subtraction. |
/\ | yfx | 500 | Bitwise AND. |
\/ | yfx | 500 | Bitwise OR. |
xor | yfx | 500 | Bitwise XOR. |
* | yfx | 400 | Multiplication. |
/ | yfx | 400 | Float division (ISO 9.1.4 — always returns a float, even for integer operands). |
// | yfx | 400 | Integer division (truncates toward zero). |
mod | yfx | 400 | ISO modulo (sign follows divisor). |
rem | yfx | 400 | ISO remainder (sign follows dividend). |
div | yfx | 400 | Floor division (rounds toward −∞; differs from // when signs disagree). |
<< | yfx | 400 | Shift left (integer). |
>> | yfx | 400 | Shift right (integer). |
** | xfx | 200 | Float power (always returns a float). |
^ | xfy | 200 | Integer power for Int ^ Int with non-negative exponent; float otherwise. |
: | xfy | 200 | Module qualifier (parser-only; no module system, but the term parses and round-trips). |
Postfix is unsupported, by design
patch-prolog recognizes no postfix operators and none are planned.
Three reasons:
- Postfix is pure syntactic sugar — zero expressive power. A term written
with a hypothetical postfix
yfoperator (X yf) is exactly the compoundyf(X). The form unification and resolution see is identical to what functor notation already gives you. - Standard Prolog ships no built-in postfix operators. ISO 13211-1
defines the postfix types (
xf,yf) so a program could declare one viaop/3, but the standard operator table itself contains zero postfix entries. op/3is not supported. Withoutop/3there is no path to a user-defined operator of any fixity, so the choice is moot anyway.
Teaching materials sometimes present postfix alongside prefix and infix for symmetry. Treat that as a description of the language family; this engine is a deliberate subset.
Known quirks
- Literal folding.
-3and+3read as the integer literals-3and3, not the compounds-(3)and+(3). To force the compound form (e.g. when pattern-matching withfunctor/3), use a non-numeric operand:- a,+ X.\3is not folded — it parses as\(3). !is an atom, not an operator.! Xis a syntax error; if you want cut followed byX, write!, X.->outside parens. The if-then operator is only meaningful inside a parenthesized goal expression.( Cond -> Then ; Else )is the idiom; bareCond -> Thenat clause-body level may not parse the way you expect./is always float. Per ISO 9.1.4,10 / 3evaluates to3.3333..., not3. Use//(truncating) ordiv(floor) for integer division.=<vs<=. The Prolog convention is=<. Writing<=is a parse error.
See also
- Language Guide — how these operators are used (terms, arithmetic, control, unification).
- Builtin & Stdlib Reference — the predicates the comparison, arithmetic, and term-order operators name.
- ISO/IEC 13211-1:1995, §6.3.4 (operator notation) and §8.7 (arithmetic).
Builtin & Stdlib Reference
Every predicate the engine provides. The builtins are compiled into the
language itself; the standard library is a small set of list predicates
compiled into every binary. The descriptions here mirror the engine’s
builtin table (plg-shared::BUILTINS) verbatim — a test fails the build if
this page and the table disagree.
Evaluable arithmetic functions (+, *, mod, abs, …) are not
predicates; see the Language Guide and the
Operators reference for those and for operator precedence.
Type checks
| Predicate | Description |
|---|---|
var/1 | Type check: succeeds if argument is an unbound variable. |
nonvar/1 | Type check: succeeds if argument is bound. |
atom/1 | Type check: succeeds if argument is an atom. |
number/1 | Type check: succeeds if argument is an integer or float. |
integer/1 | Type check: succeeds if argument is an integer. |
float/1 | Type check: succeeds if argument is a float. |
compound/1 | Type check: succeeds if argument is a compound term. |
is_list/1 | Type check: succeeds if argument is a proper list. |
Unification & term comparison
| Predicate | Description |
|---|---|
=/2 | Unification: X = Y succeeds if X and Y can be made identical. |
\=/2 | Not-unifiable: succeeds when = would fail. |
==/2 | Term identity: structural equality without unification. |
\==/2 | Term non-identity. |
unify_with_occurs_check/2 | Unification with occurs check: rejects X = f(X)-style cycles. |
compare/3 | compare(Order, T1, T2) — bind Order to <, =, or > per standard term ordering. |
@</2 | Standard term ordering: less. |
@>/2 | Standard term ordering: greater. |
@=</2 | Standard term ordering: less-or-equal. |
@>=/2 | Standard term ordering: greater-or-equal. |
Arithmetic
| Predicate | Description |
|---|---|
is/2 | Arithmetic evaluation: X is Expr binds X to the value of Expr. |
=:=/2 | Arithmetic equality. |
=\=/2 | Arithmetic inequality. |
</2 | Arithmetic less-than. |
>/2 | Arithmetic greater-than. |
=</2 | Arithmetic less-or-equal (note: =<, not <=). |
>=/2 | Arithmetic greater-or-equal. |
Control
| Predicate | Description |
|---|---|
,/2 | (A, B) — conjunction: prove A, then B. |
;/2 | (A ; B) — disjunction: prove A, or B on backtracking. (C -> T ; E) reads as if-then-else. |
->/2 | (C -> T) — if-then: if C succeeds (committing to its first solution), prove T; otherwise fail. |
\+/1 | Negation as failure: succeeds when its argument fails. |
once/1 | once(Goal) — succeed at most once for Goal. |
call/1 | Meta-call: execute its argument as a goal. Variadic — extra args are appended. |
true/0 | Always succeeds. |
fail/0 | Always fails. |
false/0 | Always fails (alias for fail). |
!/0 | Cut: commit to current choices; remove choice points back to the parent clause. |
Finding solutions & enumeration
| Predicate | Description |
|---|---|
findall/3 | findall(Template, Goal, List) — collect all solutions of Goal. |
between/3 | between(Low, High, X) — enumerate or test integers in [Low, High]. |
Exceptions
| Predicate | Description |
|---|---|
catch/3 | catch(Goal, Catcher, Recovery) — run Goal; on thrown error matching Catcher, run Recovery. |
throw/1 | Raise an error term that propagates to the nearest matching catch/3. |
Term construction & inspection
| Predicate | Description |
|---|---|
functor/3 | functor(Term, Name, Arity) — inspect or construct a term’s functor. |
arg/3 | arg(N, Term, Arg) — extract the N-th argument of Term. |
=../2 | Univ: T =.. L decomposes T into a list of its functor and args. |
copy_term/2 | copy_term(T, C) — bind C to a copy of T with fresh variables. |
Atoms & numbers
| Predicate | Description |
|---|---|
atom_length/2 | atom_length(A, L) — bind L to the length of atom A. |
atom_concat/3 | atom_concat(A, B, C) — concatenate A and B into C, or, with C bound and A/B unbound, nondeterministically split C into every prefix/suffix pair. |
atom_chars/2 | atom_chars(A, Chars) — convert between an atom and a list of single-char atoms. |
number_chars/2 | number_chars(N, Chars) — convert between a number and a list of single-char atoms. |
number_codes/2 | number_codes(N, Codes) — convert between a number and a list of character codes. |
Arithmetic relations
| Predicate | Description |
|---|---|
succ/2 | succ(X, S) — Peano successor relation; S = X + 1, both non-negative. |
plus/3 | plus(X, Y, Z) — addition relation; any one argument may be unbound. |
Sorting
| Predicate | Description |
|---|---|
msort/2 | msort(L, Sorted) — sort without removing duplicates. |
sort/2 | sort(L, Sorted) — sort and remove duplicates. |
I/O
| Predicate | Description |
|---|---|
write/1 | Write a term to stdout (no newline). |
writeq/1 | Write a term to stdout, quoting atoms so it reads back (no newline). |
writeln/1 | Write a term to stdout followed by a newline. |
nl/0 | Write a newline to stdout. |
Standard library (lists)
Defined in Prolog (stdlib.pl) and compiled into every binary.
| Predicate | Description |
|---|---|
member/2 | member(X, List) — succeeds once for each element of List. |
append/3 | append(A, B, C) — C is A concatenated with B (relational; works in reverse). |
length/2 | length(List, N) — N is the number of elements in List. |
last/2 | last(List, X) — X is the last element of List. |
reverse/2 | reverse(List, Rev) — Rev is List in reverse order. |
nth0/3 | nth0(N, List, X) — X is the element at zero-based index N. |
nth1/3 | nth1(N, List, X) — X is the element at one-based index N. |
Semantics & ISO Conformance
patch-prolog implements an ISO subset. This page is the precise statement of how it behaves where the standard leaves room for choices, the few deliberate divergences, and the safety guarantees layered on top. The Language Guide teaches these with examples; this is the reference, with ISO section numbers.
Unification (ISO 8.3.2)
=/2does not perform the occurs check —X = f(X)succeeds, creating a cyclic term, which is the ISO default. Solution display handles cycles safely.unify_with_occurs_check/2is the safe variant: it fails onX = f(X)-style cycles instead of building them.
Float equality
- Unification treats floats by bit pattern, so
NaNunifies withNaN. - This differs from arithmetic comparison, where any comparison involving
NaNis false (=:=,<, etc.).
Arithmetic
moduses ISO floored semantics — the result takes the sign of the divisor (-7 mod 3is2), not the truncated remainder. (remtakes the sign of the dividend.)- Integer overflow is a runtime error (
evaluation_error(int_overflow)), never a silent wraparound. - A
NaNor infinite float result from any operation is a runtime error. - Division by zero (integer or float) is a runtime error.
Standard order of terms (ISO 8.4.2)
Variables < Numbers < Atoms < Compounds
- A float sorts just before an integer of equal value (
1.0 @< 1). NaNsorts after all other floats (a deterministic total order).
Built-in error behavior
number_chars/2andnumber_codes/2raise a syntax error on a non-numeric string (not silent failure), and rejectNaN/infinity.T =.. [F]withFunbound raises an instantiation error (ISO 8.5.3); an empty list is an error, not a failure.functor/3with a negative arity is an error.atom_chars/2is for atoms only;number_chars/2handles numbers.- When
is/2(or an arithmetic comparison) meets a non-evaluable functor, thetype_error(evaluable, Culprit)culprit is a predicate indicatorName/Arity(ISO 8.6) — so a bare atomfooreportsfoo/0, exactly like the compound casefoo(1)reportsfoo/1. (The original v1 interpreter reported the bare atom for the arity-0 case.)
Exceptions
catch/3andthrow/1(ISO 7.8.9, 7.8.10) use the ISO error taxonomy:error(Formal, Context), whereFormalis one ofinstantiation_error,type_error/2,existence_error/2,domain_error/2,evaluation_error/1,permission_error/3,representation_error/1,resource_error/1, orsyntax_error.throw/1of an unbound variable raisesinstantiation_error.catch/3is opaque to cut: a!inside the goal cannot escape the catch frame.
Cut (ISO 7.8.4)
! is transparent through ,, ;, and ->: a cut inside a
disjunction branch cuts the whole clause, including the other branch and
the predicate’s remaining clauses. With m(1). m(2). m(3).:
t(X) :- (m(X), X > 1, ! ; X = fallback).
?- t(X). yields exactly X = 2. Cut remains opaque inside \+/1,
once/1, the condition of ->, and catch/3.
If you used the original patch-prolog interpreter, note this is a change: it treated the cut as local to the disjunction branch (and also produced
X = fallback). patch-prolog follows ISO here, matching SWI and GNU Prolog.
Dynamic and undefined predicates (ISO 7.7.3)
:- dynamic(F/A).declares a predicate whose clauses may be absent. An undefined dynamic predicate fails silently; an undefined non-dynamic predicate raisesexistence_error(procedure, F/A).- A direct call to a predicate defined nowhere is still a well-formed
program: it compiles and raises that (catchable)
existence_errorif reached. Because patch-prolog is a whole-program compiler, such a call can never succeed, soplgc build/run/checkwarn about it andplglshows a squiggle.--deny-undefinedpromotes the warning to a hard error — opt-in strictness above the ISO-compliant default, not a divergence from it. See Compiler Usage.
Safety guarantees (beyond ISO)
- A runaway computation hits a step limit and stops with
resource_error(steps). This error is deliberately uncatchable, so a buggy clause cannot trap its own timeout. This is a safety guarantee, not an ISO requirement.
Not implemented
These are deliberate omissions of the subset:
assert/1,retract/1— the knowledge base is immutable (compiled at build time);:- dynamiconly enables silent-fail, not runtime mutation.op/3— the operator table is fixed; see Operators.- Module system.
- Definite clause grammars (DCG).
REPL Guide
plgr is an interactive Prolog loop — define clauses, run queries, walk
solutions — built on the same compiler as everything else. It does not
interpret: every program change is compiled to a fresh native binary, and
queries run against that binary. What you test in the REPL is exactly what
plgc build would ship.
Because it drives the compiler, plgr needs plgc on your PATH (or set
$PLGC to its location). Start it with no arguments:
plgr
plgr — patch-prolog REPL. :help for commands, :quit to exit.
plg> ▮
Two kinds of input
The prompt is plg> — deliberately neutral, because it accepts two
different things:
- A clause, rule, or directive — anything ending in
.that isn’t a query. This edits your program and recompiles. - A query —
?- Goal.This runs against the current program and shows solutions. It does not change anything.
You type the ?- yourself; that’s what tells the REPL “run this” rather
than “remember this.”
plg> depends_on(app, auth).
defined. (1 in session)
plg> ?- depends_on(app, X).
X = auth .
(Plus :-prefixed commands, covered below.)
Defining your program
Type facts, rules, and directives as you would write them in a file:
plg> depends_on(app, ui).
defined. (2 in session)
plg> needs(X, Y) :- depends_on(X, Y).
defined. (3 in session)
plg> :- dynamic(extra/1).
defined. (4 in session)
Each accepted entry recompiles the program and confirms with the running
clause count. A rule spanning several lines continues under the |
prompt until a line ends in .:
plg> needs(X, Y) :-
| depends_on(X, Z),
| needs(Z, Y).
defined. (5 in session)
Multiple clauses for the same predicate accumulate as alternatives — the
knowledge base is immutable, so there is no in-place “redefine.” To change
or remove a clause, edit the whole session with :edit.
Capitalization matters. A name starting with a capital letter is a variable, so
Needs(app, X) :- ...is not a clause — the REPL says so and suggestsneeds. Predicate and atom names start lowercase.
Running queries
A ?- Goal. query runs against the current program and reveals solutions
one at a time, SWI-style:
plg> ?- depends_on(app, X).
X = auth ;
The trailing ; means “there may be more.” Press ; or space to
see the next solution; press any other key (Enter, ., …) to stop. The
last solution ends with .:
X = auth ;
X = ui .
A goal with no solutions prints false.; a ground goal that succeeds
prints true.. Solutions are fetched in batches and the REPL transparently
fetches more as you page past one, up to a cap — a query enumerating
thousands of answers stays responsive.
Editing and history
The input line is a vim-style editor (the vim-line crate).
- It starts in insert mode — just type. Press Esc for normal
mode (motions
h l w b 0 $, editsx dd C…); a dim-- NORMAL --marker appears bottom-right so you always know the mode.i/a/etc. return to insert. - History: Up/Down arrows (any mode) or
k/j(normal mode) walk previously submitted lines. Your in-progress line is stashed and restored if you walk back down past the newest entry. History persists across sessions in$HOME/.local/share/plgr_history. - Completion: Tab completes the word under the cursor against the builtins, the stdlib, and the predicates you’ve defined this session.
Commands
Commands are single-line and start with :. Most have a short alias.
| Command | Alias | Effect |
|---|---|---|
:load FILE | :l | Consult a .pl file — its clauses are added to the session in order. |
:list | :ls | Show the session buffer, one numbered clause per entry. |
:edit | :e | Open the whole session in $EDITOR; on save it reloads and recompiles. |
:save FILE | Write the current session buffer to FILE. | |
:reset | Clear the session (start over). | |
:help | :h | Show the command summary. |
:quit | :q | Exit. (Ctrl-D or Ctrl-C also quit.) |
:edit is the way to change or delete clauses: it hands the buffer to
your editor as a .pl file, then atomically reloads it. A syntactically
broken edit is rejected and your previous session is left intact.
How it works
Defining a clause recompiles the session buffer to a fresh temporary native
binary; a ?- query re-invokes that current binary. So the cost model is:
- Editing the program pays a (small) compile each time.
- Querying is free of compilation — it just runs the binary you already built.
This is the whole point: the REPL gives an interactive feel by driving the compiler, never by interpreting clauses. A divergent or runaway query is bounded by a timeout and killed without taking the REPL down with it.
Configuration
| Variable | Purpose |
|---|---|
PLGC | Path to the plgc compiler, if not on PATH. |
VISUAL / EDITOR | Editor for :edit (tries $VISUAL, then $EDITOR, then vi). |
PLG_REPL_TIMEOUT | Seconds before a query subprocess is killed (default 10). |
History lives at $HOME/.local/share/plgr_history.
LSP & Editor Guide
plgl is a Language Server for the patch-prolog language. It gives any
LSP-capable editor real-time diagnostics, completion, hover docs, and
go-to-definition over stdio. Unlike plgc, the server links only the
parser — no clang, no runtime — so it’s small and starts instantly.
What it provides
- Diagnostics — syntax/parse errors as you type, plus warnings for
calls to predicates defined nowhere, with did-you-mean suggestions
(e.g. a typo’d
parnet/2suggestsparent/2). These are warnings in the editor; to make them hard build failures, useplgc --deny-undefined(see Compiler Usage). - Completion — builtins, the stdlib, and the predicates defined in the current buffer.
- Hover — a one-line doc for builtins, and the clause head for user-defined predicates.
- Go-to-definition — jumps to a predicate’s first clause.
Completion, hover, and the undefined-predicate check all draw on the same
sources as the compiler (the shared builtin table, the stdlib, the
frontend parser), so the editor agrees with what plgc accepts.
Installing plgl
The toolchain isn’t on crates.io yet, so build from source. The server needs only the parser, so you can install just it:
git clone https://git.navicore.tech/navicore/patch-prolog
cd patch-prolog
cargo install --path crates/lsp
Or install the whole toolchain at once:
just install # plgc, plgr, plgl
Verify with plgl --version. The server speaks LSP over stdio; you don’t
run it directly — your editor launches it.
Neovim
The companion plugin
patch-prolog.nvim
wires up the LSP along with filetype detection (.pl, .plg, .pro →
prolog), syntax highlighting, and indentation. It needs Neovim 0.10+ and
plgl on your PATH.
With lazy.nvim:
{
"navicore/patch-prolog.nvim",
ft = "prolog",
opts = {},
}
Or configure it directly:
require("plgl").setup({
-- cmd = { "plgl" }, -- override the binary path/args if needed
on_attach = function(client, bufnr)
local map = vim.keymap.set
map("n", "gd", vim.lsp.buf.definition, { buffer = bufnr })
map("n", "K", vim.lsp.buf.hover, { buffer = bufnr })
end,
})
Neovim’s built-in detection maps
.plto Perl; the plugin overrides that toprolog.
Any other LSP client
plgl is a standard stdio language server — any client can drive it with
three things:
- Command:
plgl(no arguments; communicates over stdin/stdout). - Language / filetype:
prolog. - Root directory: the project root —
.gitis the usual marker.
For example, a bare Neovim client without the plugin:
vim.lsp.start({
name = "plgl",
cmd = { "plgl" },
root_dir = vim.fs.root(0, { ".git", "Cargo.toml" }),
})
The same three parameters translate to VS Code, Emacs eglot/lsp-mode,
Helix (languages.toml), and others.
Known limitations
- Hover and go-to-definition work on identifier names. Operator-named
builtins (
=..,@<,\+) aren’t hoverable — the cursor-word extraction only spans identifier characters. - Diagnostics are computed per file from the open buffer; cross-file project resolution is not modeled (a predicate defined in another file the buffer doesn’t include reads as undefined).
Examples
The examples/ directory has runnable programs. Two are walked through
here: a dependency graph that shows the language basics, and a schema linter
that shows the headline use case — compiling a rule set into a standalone
checker.
Build any example and query it:
plgc build examples/deps.pl -o deps
./deps --query "needs(app, X)" --format text
deps.pl — facts, rules, and recursion
depends_on(app, auth).
depends_on(app, ui).
depends_on(auth, crypto).
depends_on(ui, render).
depends_on(render, crypto).
% needs/2 — the transitive closure of depends_on.
needs(X, Y) :- depends_on(X, Y).
needs(X, Y) :- depends_on(X, Z), needs(Z, Y).
% two components that share a direct dependency.
shares_dep(A, B) :- depends_on(A, D), depends_on(B, D), A \= B.
A handful of depends_on/2 facts describing a build graph, then rules
that derive new relations. needs/2 is recursive — what a component
depends on, transitively (a direct dependency, or a dependency of one). And
shares_dep/2 uses \=/2 to exclude a component matching itself.
Querying derives answers from the rules — and backtracking yields every solution:
./deps --query "depends_on(app, X)" --format text # direct dependencies
# X = auth
# X = ui
./deps --query "needs(app, X)" --format text # transitive
# X = auth
# X = ui
# X = crypto
# X = render
# X = crypto
./deps --query "shares_dep(auth, render)" --format text
# true.
needs(app, X) reaches crypto twice — once through auth, once through
ui → render — because there are two paths to it; backtracking finds both.
shares_dep(auth, render) holds because both depend on crypto.
Use findall/3 to collect every solution into a list:
./deps --query "findall(D, needs(app, D), Ds)" --format text
# D = _0
# Ds = [auth, ui, crypto, render, crypto]
(D is the template — left unbound; Ds is the collected result.)
linting.pl — compiling a checker into a binary
This is the use case patch-prolog is built for: write the rules, compile them to a single native binary, and run it anywhere with no Prolog system installed. Here the “data” (a schema’s fields) is baked in at build time and the rules flag violations.
field(user, id, integer).
field(user, name, string).
field(user, email, string).
field(user, password, string).
field(user, ssn, string).
allowed_type(string). allowed_type(integer).
allowed_type(boolean). allowed_type(array).
sensitive(ssn).
sensitive(password).
violation(Field, sensitive_field) :-
field(user, Field, _),
sensitive(Field).
violation(Field, unknown_type) :-
field(user, Field, Type),
\+ allowed_type(Type).
plgc build examples/linting.pl -o linting
./linting --query "violation(Field, Reason)" --format text
# Field = password
# Reason = sensitive_field
# Field = ssn
# Reason = sensitive_field
The real power is the exit code: a CI step doesn’t need to parse output at all — a non-zero exit means violations were found.
if ./linting --query "violation(_, _)" >/dev/null; then
echo "schema violations found"; exit 1
fi
And for tooling that does want the data, the default JSON format is ready to parse:
./linting --query "violation(F, R)" --format json
# {"count":2,"exhausted":true,"solutions":[{"F":"password","R":"sensitive_field"},
# {"F":"ssn","R":"sensitive_field"}]}
That linting binary is ~700K, depends only on system libc/libm, and needs
no Prolog runtime to run — hand it to anyone.
Where to next
- Language Guide — the concepts these programs use.
- Compiler Usage — every
plgcflag and the query wire-contract. - REPL Guide — explore a program interactively before compiling it.
Architecture
patch-prolog compiles an ISO-subset Prolog program to a standalone native binary. The compiled binary contains no clause interpreter: predicates are native functions generated as LLVM IR; only primitive services (heap, trail, unification, builtins, query parsing, output) come from a runtime library statically linked into the binary.
Pipeline
rules.pl ──parse──▶ AST ──analyze──▶ codegen ──▶ rules.ll (LLVM IR text)
│
libplg_runtime.a (embedded in plgc) ──┐ │
▼ ▼
clang -O3 -g rules.ll -lplg_runtime -lm
│
▼
rules (standalone binary)
- The compiler emits LLVM IR as text — no llvm-sys/inkwell binding, no LLVM version lock-in beyond “clang ≥ 15” (opaque pointers).
libplg_runtime.ais built fromcrates/runtimeand embedded into theplgcbinary viainclude_bytes!(set up bycrates/compiler/build.rs, which also enforces exact version match between compiler and runtime, and bakes in a content hash of the archive). At link time it is materialized at a content-addressed cache path ($XDG_CACHE_HOME/plgc/runtime-<hash>/, else~/.cache/plgc/…) that every run of the same build reuses; stale entries are age-swept. HOME-less environments fall back to a private per-link extraction that is removed after linking.-Wl,--gc-sections(Linux) /-Wl,-dead_strip(macOS) strips runtime code the program can’t reach, keeping binaries small.- Users of
plgcneed clang. Users of compiled binaries need nothing.
This is the architecture proven by patch-seq: a compiled binary contains no clause interpreter. The rejected alternative — embedding a serialized clause database inside a shipped interpreter — would put the whole interpreter (and a Rust runtime) into every “compiled” program, and is why this engine generates native code per predicate instead.
Crates
| Crate | Artifact | Role |
|---|---|---|
plg-shared | rlib | AtomId + well-known atoms, Term, StringInterner, FirstArgKey, operator table. Linked into BOTH compiler and runtime — zero dependencies, by rule. |
plg-frontend | rlib | Tokenizer + operator-precedence parser + ISO error types (ported from v1). Compiler-side only. |
plg-runtime | staticlib + rlib | The machine substrate compiled code calls into: heap/trail/choice points, generic unify, ~60 builtins, the minimal goal-only --query parser, JSON/text output, process entry. Ships inside every compiled binary. |
plg-compiler | bin plgc + rlib | CLI, codegen (IR text emission), clang driver, runtime embedding. |
plg-lsp | bin plgl | Language server (diagnostics, completion, hover, goto-definition). A frontend consumer — never links the runtime. |
plg-repl | bin plgr | Interactive REPL that drives the compiler; never interprets. |
Dependency rule: nothing heavy (clap, serde, …) may enter plg-runtime
or plg-shared; every byte there lands in every user binary. (The
compiler-side crates — plg-frontend, plg-compiler, plg-lsp,
plg-repl — are dev tooling and carry no such constraint.) This is a
strong default, not an absolute: a dependency that demonstrably pays for
its bytes against the footprint gate, or is scoped to compiler-side crates,
can be considered.
Execution model (summary)
- Each predicate compiles to one LLVM function in continuation-passing
style: it receives the Machine pointer, its arguments as tagged 64-bit
words, and a success continuation. Solutions are delivered by
musttail-calling the continuation; failure is a plain return. - Alternatives (untried clauses, disjunction branches) live on a runtime-managed choice-point stack holding retry function pointers plus heap/trail marks. Backtracking = rewind marks + tail-call retry.
- Cut truncates the choice-point stack to the barrier recorded at predicate entry (stopping at catch frames).
- All transfers are tail calls and continuation frames are heap-allocated, so Prolog recursion depth never grows the C stack; determinate last-goal recursion is a true jump.
- Backtracking resets the heap top to the choice point’s mark — memory reclamation without GC.
Runtime --query support
The compiler bakes two global tables into every binary:
- the atom table (all atom names, in id order), and
- the predicate registry
{functor_id, arity, fn_ptr}.
At startup the runtime rebuilds the name→id map from the atom table, so
a runtime-parsed query interns into the same id space (new atoms get
fresh ids that correctly unify with nothing in the program). The
registry maps a parsed goal to its compiled entry point — this is also
how call/1 and findall/3 re-enter compiled code. Predicates declared
:- dynamic with no clauses are registered to an always-fail stub
(silent-fail linter contract); unknown predicates raise
existence_error(procedure, F/A).
Wire contract (compiled binaries)
Preserved exactly from v1 so existing harnesses keep working:
--query "goal(X)",--limit N,--format json|text- exit
0no solutions ·1solutions found ·2query parse error ·3runtime error - JSON:
{"solutions":[{"X": ...}], "count": N, "exhausted": bool}
Build system
justfile is the source of truth; CI (.forgejo/workflows/ci-linux.yml)
only calls just ci. Recipe ordering matters: build runs
build-runtime before build-compiler so the canonical
target/release/libplg_runtime.a is fresh when build.rs embeds it.
Compilation Model
How Prolog’s nondeterministic control becomes native code. This sits between the Architecture overview and the Runtime ABI symbol-level contract: it explains the shape of the generated code — the function signatures and exact symbols are in the Runtime ABI.
The shape: CPS + a choice-point trampoline
Each predicate compiles to one LLVM function with a uniform signature,
i32 (ptr %m, i64 %env) (returns 0 = fail/exhausted, 1 = stop). Forward
execution is continuation-passing: a goal installs the “rest of the
conjunction” as a continuation and musttails into the callee; a solution is
delivered by musttailing that continuation.
Because every function shares the one signature, musttail is guaranteed
tail-call optimization under the ordinary C calling convention — so recursion
and backtracking never grow the C stack.
This is the simplest design that is genuinely compiled. The alternatives:
- WAM-instruction emission (GNU Prolog’s path) — fastest and most faithful, but needs the full WAM register/environment design up front. A named future escape hatch, not the v1 target.
- A pure choice-point trampoline (success also routed through the stack)
— simple, but bounces once per goal, forfeiting the
musttailwin on determinate iteration. - Interpretation with embedded data — rejected permanently; that was the original patch-prolog’s mistake (a compiler in costume over an interpreter).
Choice points and backtracking
Alternatives — untried clauses, disjunction branches — become entries on a
runtime choice-point stack, each snapshotting the trail, heap top, and
variable counter, plus a retry function. Failure is ret i32 0: it unwinds
the constant-depth tail-call chain to the driver loop (solve.rs), which
pops a choice point, rewinds the trail (undo bindings) and the heap top
(backtrack-deallocation — memory reclaimed with no GC), and musttails
the retry. The driver is the trampoline; depth never grows the C stack.
Clause groups push lazily: a predicate’s entry tries clause 1 and pushes one
choice point whose retry is clause 2, which pushes clause 3, and so on.
First-argument indexing compiles to a switch on the first argument’s
tag/value (the same keys the v1 interpreter used); a single keyed candidate
pushes no choice point.
Cut
The barrier is the choice-point stack height recorded at predicate entry.
! compiles to a runtime call that truncates the stack back to that height,
stopping at catch frames (cut is opaque to catch). If-then-else and
once/1 are local barriers around the condition; \+ G is (G -> fail ; true).
catch / throw
There is no native stack unwinding to fight: continuations are heap frames
and transfers are tail calls, so throw/1 is pure data. The runtime walks
the choice-point stack for the topmost catch frame whose catcher unifies with
the thrown ball, rewinds the trail and heap, and tail-calls the recovery
continuation. The step-limit ball is uncatchable — it skips every catch
frame and exits with code 3.
What is compiled vs. called
Compiled inline, in the clause code itself:
- clause dispatch (the first-argument-indexing
switch), - conjunction sequencing, disjunction branching, and the if-then-else /
once/\+commit machinery (choice points + commit-height slots), - cut,
- atom and immediate-integer constants (their tagged words are baked into the IR).
Runtime calls — the pragmatic scope line; inlining any of these is a named escape hatch if profiles ever demand it (symbols in the Runtime ABI):
- generic unification, including head unification — one shared bind/deref/trail implementation so compiled and runtime paths can’t diverge,
is/2and arithmetic comparison (the v1 evaluator, with its exact error strings),- the deterministic builtin library, plus
findall/3,catch/3,throw/1, the metacall path, andbetween/3.
Worked example
ancestor(X, Z) :- parent(X, Z).
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
ancestor/2’s entry pushes one choice point (retry = clause 2’s body) andmusttails intoparent/2with the caller’s continuation — clause 1’s body goal is in tail position.- Clause 2’s retry allocates a fresh
Y, builds a continuation frame forancestor(Y, Z), andmusttailsparent(X, Y)with it. - That continuation
musttailsancestor/2again — the recursive call is a true jump, so a million-deepancestorchain runs in constant C stack. - When everything is exhausted the outermost call returns
0; the driver reports the solution count and exit code.
Guarantees and guards
| risk | guard |
|---|---|
musttail emission bug → stack growth | codegen post-check (every musttail is followed by a bare ret) + a deep-recursion integration test |
| compiled vs. generic unify divergence | shared primitives + differential tests against the v1 interpreter oracle |
| atom-id divergence compile ↔ runtime | a single emitted atom table + round-trip unit tests |
| runaway heap in determinate queries | the uncatchable step limit (copying GC is a future escape hatch) |
Runtime ABI
The C-ABI surface between plgc-generated LLVM IR and libplg_runtime.a.
Rust side: crates/runtime/src/abi.rs + entry.rs; IR side: the
RUNTIME_DECLS block in crates/compiler/src/codegen/program.rs. Those two
files are the contract; this document explains it. The build.rs
version-sync guarantees an embedded runtime always matches the compiler that
emits calls into it.
The uniform function signature
Every compiled predicate, clause, chain (“try next clause”), continuation, and retry function — and every runtime-provided continuation — shares one C-ABI prototype:
i32 (ptr %machine, i64 %env)
Returns 0 = fail/exhausted (the driver loop backtracks), 1 = stop
(solution limit reached / enumeration finished early). The uniform prototype
is what makes musttail transfers guaranteed-TCO under the C calling
convention (same trick protobuf’s parser uses), and it lets Rust functions
(the solve driver, the capture continuation, ,/2 at query level)
participate at chain boundaries via plain function pointers.
Function pointers cross the boundary as i64 (ptrtoint/inttoptr in IR,
transmute in Rust); frames and environments are heap indices, never
raw pointers — the heap is a growable Vec<u64> and indices survive
reallocation and backtracking truncation.
Control flow contract
- Forward execution: generated code installs a continuation with
plg_rt_set_kandmusttails into the callee’s entry function. Solution delivery is amusttailinto the installed k. - Failure:
ret i32 0unwinds the (constant-depth) tail-call chain to the runtime driver loop (solve.rs), which pops a choice point, rewinds trail+heap to its marks, and calls its retry function. The driver is the trampoline; the C stack never grows with recursion or backtracking depth. - Choice points:
plg_rt_push_cp(m, retry_fn, env)snapshots trail/heap marks at push time. Predicate entries push one CP per remaining clause group lazily (entry pushes t1; t1 pushes t2; …). - Frames (predicate:
[args…, k_fn, k_env]; clause body:[k_fn, k_env, vars…]) are allocated withplg_rt_frame_allocbefore any CP that must see them — heap rewind frees everything allocated after the CP, so ordering is the correctness rule.
Generated global data
@plg_atom_strs = [N x ptr] ; atom names, id order
@plg_registry = [K x { i32, i32, ptr }] ; functor, arity, entry fn
plg_rt_init re-interns the atom names in order, reproducing the compiler’s
exact id space (both sides pre-seed the same well-known atoms); query atoms
unseen at compile time get fresh ids ≥ N and correctly unify with nothing.
The registry is sorted by (functor, arity) for binary search; :- dynamic
predicates with no clauses point at @plg_rt_pred_fail (silent-fail
contract).
Term words (tag in low 3 bits)
REF=0 (heap idx, unbound = self-ref) · ATOM=1 (id) · INT=2 (immediate
i61; integers beyond the i61 range are boxed as TAG_BIG, below) · STR=3
(idx → [functor:u32|arity:u32][args…]) · LST=4 (idx → [head][tail]) ·
FLT=5 (idx → f64 bits, to_bits equality) · TAG_BIG=6 (idx → boxed i64,
for integers beyond the i61 immediate).
Atoms and immediate integers never cross the ABI as calls: codegen emits
their tagged words as compile-time constants ((id<<3)|1, (n<<3)|2 —
mirrored in cell.rs and term_emit.rs, covered by tests on both sides).
The plg_rt_* surface
| symbol | signature (IR) | purpose |
|---|---|---|
plg_rt_init | ptr (ptr, i32, ptr, i32) | build Machine from atom table + registry |
plg_rt_main | i32 (ptr, i32, ptr) | argv parse, query parse, solve, output, exit code |
plg_rt_step | i32 (ptr) | step counter; 0 ⇒ uncatchable resource_error set |
plg_rt_new_var | i64 (ptr) | fresh unbound REF cell |
plg_rt_frame_alloc | i64 (ptr, i32) | raw continuation-frame cells |
plg_rt_frame_set/get | void/i64 (ptr, i64, i32[, i64]) | frame slot access |
plg_rt_areg_get/set | i64/void (ptr, i32[, i64]) | argument registers |
plg_rt_breg_set | void (ptr, i32, i64) | build registers for put_struct |
plg_rt_put_struct | i64 (ptr, i32, i32) | compound from breg[0..arity] |
plg_rt_put_list | i64 (ptr, i64, i64) | cons cell |
plg_rt_put_float | i64 (ptr, i64) | boxed f64 (bits) |
plg_rt_put_big | i64 (ptr, i64) | box an i64 beyond the i61 immediate (TAG_BIG) |
plg_rt_unify | i32 (ptr, i64, i64) | generic trail-recording unify |
plg_rt_set_k / plg_rt_k_fn / plg_rt_k_env | — | continuation register |
plg_rt_push_cp | void (ptr, i64, i64) | choice point (retry fn + env) |
plg_rt_cp_top | i64 (ptr) | choice-point height (cut barriers, commit slots) |
plg_rt_cut | void (ptr, i64) | truncate cps to height (stops at CATCH frames — catch is opaque to cut) |
plg_rt_deref | i64 (ptr, i64) | first-arg indexing dispatch |
plg_rt_str_key | i64 (ptr, i64) | STR → packed functor cell, else u64::MAX |
plg_rt_pred_fail | i32 (ptr, i64) | always-fail body (dynamic stubs) |
plg_rt_existence_error | i32 (ptr, i32, i32) | sets v1-format error, returns 0 |
Inline comparison and arithmetic builtins
| symbol | signature (IR) | purpose |
|---|---|---|
plg_rt_b_is | i32 (ptr, i64, i64) | is/2 (v1 eval semantics, exact error strings) |
plg_rt_b_arith_cmp | i32 (ptr, i32, i64, i64) | op 0..5 = < > =< >= =:= =\= |
plg_rt_b_neq | i32 (ptr, i64, i64) | \= (trail-rewinding attempt) |
plg_rt_b_term_cmp | i32 (ptr, i32, i64, i64) | op 0..5 = == \== @< @> @=< @>= |
plg_rt_b_compare | i32 (ptr, i64, i64, i64) | compare/3 |
Op-code tables live in codegen lower.rs and runtime control.rs and must
stay in sync. Control constructs (;, ->, \+, once) have NO runtime
symbols for compiled code — they compile to choice points and commit slots
(body.rs); the runtime’s control.rs mirrors the same shapes only for goals
built at runtime (queries, metacalls), walking goal terms, never clauses.
Errors, metacall, and control builtins
Structured errors. Machine::error carries a relocatable ball
(copyterm::TermBuf) plus its pre-rendered v1-format message. Balls survive
heap rewinding; solve::drive unwinds them to the nearest catch frame (a
CpKind::Catch choice point whose env holds [catcher, recovery, k_fn, k_env]). plg_rt_cut stops at catch frames (catch is opaque to cut). The
step-limit ball remains uncatchable.
Control builtins (the installed k is the continuation; the runtime walks goal TERMS only — never clauses):
| symbol | signature (IR) | purpose |
|---|---|---|
plg_rt_metacall | i32 (ptr, i64) | dispatch a runtime goal term (var goals, call/N) |
plg_rt_b_catch_3 | i32 (ptr, i64, i64, i64) | push catch frame, run goal |
plg_rt_b_throw_1 | i32 (ptr, i64) | snapshot ball, raise |
plg_rt_b_findall_3 | i32 (ptr, i64, i64, i64) | bounded sub-search via the shared driver |
plg_rt_pred_between_3 | i32 (ptr, i64) | nondet builtin, uniform predicate signature (A-registers) |
C-stack note: plg_rt_metacall / plg_rt_b_catch_3 / plg_rt_b_findall_3
hold one Rust frame while their goal runs, so nesting depth of those
constructs (not Prolog recursion) consumes C stack — bounded by the step
limit; full CPS compilation of metacalls is a named escape hatch.
Deterministic builtins plg_rt_b_<name>_<arity> — one symbol per entry
in codegen’s DET_BUILTINS table (lower.rs), declarations generated from the
same table. The runtime’s query-side dispatch (control.rs det_builtin)
mirrors it; the differential corpus guards the pair. Vocabulary = exactly
v1’s: type checks, functor/arg/univ/copy_term, atom/number conversions,
msort/sort, succ/plus, unify_with_occurs_check, write/writeln/nl.
Stdlib. crates/compiler/stdlib.pl (v1’s file, verbatim) is parsed
before user sources in every plgc build — member/append/length/last/
reverse/nth0/nth1 are ordinary compiled predicates in every binary, exactly
like v1’s embedding.
Environment
PLG_MAX_STEPS overrides the default 10,000 step ceiling (documented
extension over v1, which hardcoded it; the CLI contract is unchanged).