Website Intelligence Lab — Repo Digest
The foundational engineering repository of the Intelligence Platform layer for Inexis Digital. It is documented as a platform component; the platform-level view (with executive, container, component, and runtime diagrams) is in architecture/intelligence-platform.md.
What it is
An experimentation and evaluation platform — a “scientific instrument” whose purpose is not
hosting websites but enabling repeatable, measurable improvement of website assessment, migration,
and optimisation capabilities across website technologies. Websites are subjects of study; WordPress is
the first backend; measurement is the point. (Source: README.md, docs/philosophy.md.)
Why it exists
Inexis Digital’s value depends on engines (Website Assessment, Migration, Proposal, AI Search) getting
measurably better over time. You cannot improve what you cannot reproduce and compare. The lab exists to
provide the reproducible harness, the corpus of subject businesses, and the immutable record of every
experiment so that any engine version can be objectively evaluated against prior versions — answering
questions like “Did Migration Engine v9.4 beat v9.3, and can we prove exactly why?” It is deliberately
not the engines themselves (they live in separate repos and run against the lab) and not a
production host. (Source: CLAUDE.md “What this repository is / is NOT”.)
At a glance
| Field | Value |
|---|---|
| Slug | website-intelligence-lab |
| System | intelligence-platform |
| Architecture layer | Intelligence Platform (layer 2) |
| Owner org | Inexis Digital (inexisdigital.com.au) |
| Lifecycle | active |
| Maturity | early-development (infra operational; core loop not yet built) |
| Repo type | Git repo, MADR-style ADRs, phase-gated |
| Phase status | 1 ✅ · 2 ✅ (VPS-validated) · 2.5 🔄 in progress · 3–6 ⏳ pending |
| Last reviewed | 2026-07-05 |
Business capability provided
Reproducible measurement & evaluation of website-intelligence capabilities. It gives Inexis Digital an objective, provenanced basis to evolve its service engines, backed by a curated corpus of real and synthetic businesses and an immutable experiment history.
Technical responsibilities
Per CLAUDE.md, the lab owns exactly three things and explicitly disclaims the rest:
- The infrastructure that hosts generated and test websites.
- The corpus of businesses those websites belong to.
- The immutable record of every experiment run against them.
It does not own: the engines (separate repos), production hosting, or knowledge content (referenced, never copied).
Core concepts
The domain model (memorised verbatim in CLAUDE.md, detailed in docs/domain-model.md):
| Concept | Meaning |
|---|---|
| Capability | Declarative service taxonomy — what Inexis knows how to do. Owns no execution. |
| Business | The subject: `origin` is real or synthetic. |
| Digital Asset | Two orthogonal axes: category (observed / reference / generated) × role (input / target / output / fixture). |
| Case | The unit of work & evaluation (supersedes Project). Selects input (observed) + optional target (reference); references Capabilities; accumulates Runs. |
| Run | One immutable, fully-provenanced experiment execution. Re-execution mints a new run-id. |
| Reference asset | Curated, versioned best-practice exemplar (Astra, Kadence, GeneratePress). Owned by no business. |
| Benchmark | Versioned dataset of Cases, tiered by observed-input quality (Gold/Good/Average/Poor). |
The single most important split: authored inputs (corpus/) and generated outputs (runs/)
never share a folder.
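The two Digital Asset axes and the immutability of a Run can be sketched with plain dataclasses. This is a minimal illustration, not the lab's schema: the field names (`asset_id`, `case_id`, `engine`, `engine_version`) and the `rerun` helper are assumptions made for the example and are not taken from docs/domain-model.md or the contract schemas.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Category(Enum):
    OBSERVED = "observed"
    REFERENCE = "reference"
    GENERATED = "generated"


class Role(Enum):
    INPUT = "input"
    TARGET = "target"
    OUTPUT = "output"
    FIXTURE = "fixture"


@dataclass(frozen=True)
class DigitalAsset:
    asset_id: str        # hypothetical identifier
    category: Category   # axis 1: where the asset comes from
    role: Role           # axis 2: how a Case uses it


@dataclass(frozen=True)
class Run:
    run_id: str
    case_id: str         # hypothetical field names throughout
    engine: str
    engine_version: str


def rerun(previous: Run, new_run_id: str) -> Run:
    """Re-execution never mutates a Run; it returns a new record with a new run-id."""
    if new_run_id == previous.run_id:
        raise ValueError("re-execution must mint a new run-id")
    return replace(previous, run_id=new_run_id)
```

Because both classes are frozen, assigning to a field after construction raises `dataclasses.FrozenInstanceError`, which is one simple way to make "immutable" enforceable in code rather than by convention.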
Key workflows
- Author a subject: define a Business (`business.yml`), register its Observed assets (real presence, referenced + snapshotted, never rehosted; `capture_pending` until the crawler exists).
- Define a Case: reference Capabilities, select an Observed input and optional Reference target.
- Execute a Run: an external engine runs against the lab, producing immutable Generated assets in `runs/<business>/<case>/<run>/` with complete provenance (engine+version, model, prompt/knowledge versions, config, input lineage).
- Evaluate: score engine versions against Benchmarks tiered by input quality; diff runs to explain differences.
- Reference knowledge: external intelligence is pinned by immutable ref in `knowledge/`, never copied.
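The Evaluate step combines two ideas: comparing the provenance of two runs, and aggregating scores per benchmark tier. The sketch below illustrates both under stated assumptions. The provenance keys, the score scale and the function names are hypothetical; the lab's run-manifest schema and scoring method are not shown in this digest.

```python
from collections import defaultdict
from statistics import mean


def diff_provenance(a: dict, b: dict) -> dict:
    """Return {field: (a_value, b_value)} for every provenance field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


def mean_score_by_tier(results):
    """results: iterable of (tier, score) pairs; returns the mean score per tier."""
    buckets = defaultdict(list)
    for tier, score in results:
        buckets[tier].append(score)
    return {tier: mean(scores) for tier, scores in buckets.items()}


# Hypothetical provenance records for two runs of the same Case.
run_a = {"engine": "migration", "engine_version": "9.3", "prompt_version": "p4"}
run_b = {"engine": "migration", "engine_version": "9.4", "prompt_version": "p5"}

changed = diff_provenance(run_a, run_b)
```

Here `changed` lists only `engine_version` and `prompt_version`, so a score difference between the two runs can be attributed to those two fields rather than to the engine version alone.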
Technologies used
- Host/infra: Docker + Docker Compose (project `wil`), Ubuntu 24.04.
- Edge/TLS: Caddy `2.11.4` + Cloudflare DNS plugin `v0.2.4` (wildcard TLS via DNS-01), Let’s Encrypt/ACME.
- Backend: WordPress `6.9-php8.3-apache` Multisite (subdomain mode), WP-CLI, MariaDB `11.4.12`, phpMyAdmin.
- DNS: Cloudflare (`wp-lab.inexisdigital.com.au`).
- Contracts: YAML schemas in `docs/contracts/`; scripts favour POSIX `sh` + a Makefile entrypoint.
- Builds are pinned/deterministic (ADR-0010).
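A pinning rule of this kind can be checked mechanically. The following is a rough sketch of one possible check, assuming that "pinned" means an explicit tag other than `latest` or a `sha256` digest; ADR-0010 may define it more strictly. The image names in the usage lines assume the official `wordpress` and `mariadb` images, which the digest does not state.

```python
import re

# A digest reference such as "name@sha256:<64 hex chars>" is always pinned.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image: str) -> bool:
    """True if the image reference carries a digest or an explicit non-latest tag."""
    if _DIGEST.search(image):
        return True
    # Look for the tag only in the last path segment, so a registry port
    # such as "localhost:5000/app" is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


pinned = [is_pinned("wordpress:6.9-php8.3-apache"), is_pinned("mariadb:11.4.12")]
```

A tag is still mutable on the registry side, so a stricter policy would accept digests only.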
Major modules / components
The repo is organised into domains separated by change-rate, owner, and responsibility:
| Domain | Path | Responsibility | Change-rate |
|---|---|---|---|
| Infrastructure | `infrastructure/` | The host: Docker, Caddy, MariaDB, WP Multisite | Rarely |
| Platforms | `platforms/` | Technology adapters + adapter contract (WordPress = reference) | Per new tech |
| Capabilities | `capabilities/` | Service taxonomy registry (declarative) | Occasionally |
| Reference | `reference/` | Curated, versioned best-practice targets | Occasionally |
| Corpus | `corpus/` | Businesses, Observed assets, Cases, benchmarks (INPUTS) | Continuously |
| Runs | `runs/` | Immutable, provenanced experiment records (OUTPUTS) | Constantly |
| Knowledge | `knowledge/` | Pinned references to external knowledge repos | Independently |
| Services | `services/` | Future composable compute (crawler, screenshot…) | Additively |
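The separation of authored inputs (`corpus/`) from generated outputs (`runs/`) can be expressed as a small path helper. This is an illustrative sketch only: the function names and the segment validation are assumptions, and the example run-id is a placeholder because the run-id scheme (ADR-0007) is still Proposed.

```python
from pathlib import PurePosixPath

RUNS_ROOT = PurePosixPath("runs")


def run_dir(business: str, case: str, run_id: str) -> PurePosixPath:
    """Build runs/<business>/<case>/<run>/ and reject unsafe path segments."""
    for segment in (business, case, run_id):
        if not segment or "/" in segment or segment in (".", ".."):
            raise ValueError(f"invalid path segment: {segment!r}")
    return RUNS_ROOT / business / case / run_id


def is_generated_output(path: PurePosixPath) -> bool:
    """True if the path lives under runs/, i.e. it is an output, not an input."""
    return path.parts[:1] == RUNS_ROOT.parts


example = run_dir("acme", "migrate-home", "run-0001")
```

Rejecting `..` and embedded slashes keeps a run from being written outside `runs/`, for example into `corpus/`.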
Capabilities
- `reproducible-experiment-harness`: immutable, provenanced run execution & comparison
- `corpus-management`: real/synthetic businesses, observed assets, cases, benchmarks
- `evaluation-benchmarking`: quality-tiered benchmark datasets for engine evaluation
- `reference-catalog`: shared versioned best-practice targets
- `capability-taxonomy`: declarative registry of Inexis services
- `knowledge-referencing`: pinned external-knowledge boundary
- `platform-adapter-contract`: technology-agnostic provision/teardown/snapshot interface
Upstream dependencies
| Dependency | Type | Version | Notes |
|---|---|---|---|
| External engines (Assessment, Migration, Proposal, AI Search) | internal-repo (separate) | n/a | Run against the lab; produce runs |
| External knowledge repositories | internal-repo (by reference) | pinned SHA/hash | Never copied in (ADR-0006) |
| Docker / Docker Compose | platform | — | Host substrate |
| Caddy + Cloudflare DNS plugin | external-package | 2.11.4 / v0.2.4 | Pinned together (ADR-0010) |
| WordPress + MariaDB + WP-CLI | external-package | 6.9 / 11.4.12 | First backend |
| Cloudflare | external-service | — | DNS + DNS-01 TLS |
| Shared Skills | internal-repo | n/a | Used to engineer/document this repo (ecosystem layer 1) |
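Pinning external knowledge by immutable ref implies that mutable refs, such as branch names or abbreviated SHAs, should be rejected. The sketch below shows one way to flag them, assuming a pin is a full 40-character Git commit SHA; content hashes or SHA-256 object IDs would need a different pattern. The pin names and the `mutable_pins` helper are hypothetical.

```python
import re


def mutable_pins(pins: dict[str, str]) -> list[str]:
    """Return the names whose ref is not a full 40-character hex commit SHA."""
    return sorted(
        name
        for name, ref in pins.items()
        if re.fullmatch(r"[0-9a-f]{40}", ref) is None
    )


# Hypothetical pin table: one full SHA, one branch name, one short SHA.
pins = {
    "seo-playbook": "a" * 40,
    "migration-notes": "main",
    "style-guide": "abc1234",
}
offenders = mutable_pins(pins)
```

Only `seo-playbook` passes; the branch name and the abbreviated SHA are reported because either could resolve to different content later.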
Downstream consumers
- The engines — consume the lab as their execution & evaluation harness (they write immutable runs).
- Inexis Digital service delivery — capabilities → customer work (assessment, migration, SEO, etc.).
- Intelligence products: relationship to the `intelproducts` repo (intelligence packs) is documented at the platform level; see Intelligence Platform. (Cross-repo linkage recorded there to avoid over-claiming here.)
- Higher ecosystem layers (Applications & Agents, Ventures) consume the platform’s outputs (planned).
Major interfaces & integration points
- Consumes: external engines (execution), external knowledge (pinned refs), Cloudflare/Docker/WordPress.
- Exposes:
  - Adapter contract (`platforms/README.md`): provision/teardown/snapshot, technology-agnostic.
  - Contract schemas (`docs/contracts/`): `business`, `digital-asset`, `case`, `run-manifest`, `reference`, `benchmark`, `capability`; these are the seams between domains.
  - Immutable run manifests: the provenanced record other systems evaluate.
  - Capabilities registry (`capabilities/registry.yml`): the shared service vocabulary.
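A technology-agnostic adapter contract with provision, teardown and snapshot operations could look roughly like the following structural interface. This is a sketch under assumptions, not the contract in `platforms/README.md`: the method signatures, return values and the `InMemoryAdapter` toy class are invented for illustration.

```python
from typing import Protocol


class PlatformAdapter(Protocol):
    """Hypothetical shape of the adapter contract."""

    def provision(self, site_id: str) -> str: ...
    def snapshot(self, site_id: str) -> str: ...
    def teardown(self, site_id: str) -> None: ...


class InMemoryAdapter:
    """Toy adapter used only to exercise the interface; it hosts nothing."""

    def __init__(self) -> None:
        self.sites: dict[str, int] = {}

    def provision(self, site_id: str) -> str:
        self.sites[site_id] = 0
        return f"https://{site_id}.example.test"

    def snapshot(self, site_id: str) -> str:
        self.sites[site_id] += 1
        return f"{site_id}-snapshot-{self.sites[site_id]}"

    def teardown(self, site_id: str) -> None:
        del self.sites[site_id]


def run_lifecycle(adapter: PlatformAdapter, site_id: str) -> str:
    """Provision a site, snapshot it, and always tear it down afterwards."""
    adapter.provision(site_id)
    try:
        return adapter.snapshot(site_id)
    finally:
        adapter.teardown(site_id)


adapter = InMemoryAdapter()
snapshot_id = run_lifecycle(adapter, "demo")
```

Code written against `PlatformAdapter` does not name WordPress anywhere, which is the property a second adapter would need to confirm in practice.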
Reusable assets exposed to other repositories
- The domain model + YAML contract schemas (a reusable pattern for provenanced experiment platforms).
- The reference catalog of best-practice targets (Astra / Kadence / GeneratePress, versioned).
- The deterministic infrastructure stack (Compose + Caddy + Cloudflare DNS + WP Multisite).
- The capabilities taxonomy as the bridge between platform modules and customer work.
Architectural decisions
11 ADRs (MADR-style; 9 Accepted, 2 Proposed) — full index in the repo’s docs/adr/:
- 0002 Technology-agnostic lab (not a WordPress repo) · 0003 The subject is a Business, not a Website
- 0004 Runs are immutable & fully provenanced · 0005 Capabilities as a first-class taxonomy
- 0006 External knowledge by pinned reference · 0009 Reverse-proxy topology + subdomain Multisite
- 0010 Deterministic image/Caddy builds · 0011 Asset categories, Cases, and the reference domain
- 0007 Run-id scheme (Proposed) · 0008 Object storage for run blobs (Proposed)
These decisions seed several ecosystem-wide Architecture Principles.
Architecture snapshot
```mermaid
graph TD
    subgraph Inputs["corpus/ — INPUTS (authored, in Git)"]
        BIZ[Businesses + Observed assets]
        CASE[Cases → reference Capabilities]
        BENCH[Benchmarks]
    end
    REF[reference/ — best-practice targets]
    CAP[capabilities/ — service taxonomy]
    KNOW[knowledge/ — external, pinned]
    ENG[[External Engines]]
    SVC[services/ — future compute]
    subgraph Outputs["runs/ — OUTPUTS (immutable, provenanced)"]
        RUN[Runs → Generated assets]
    end
    INFRA[infrastructure/ — Docker · Caddy · WP Multisite]
    CAP -.referenced by.-> CASE
    BIZ --> CASE --> ENG
    REF -->|target| ENG
    KNOW -->|pinned versions| ENG
    ENG --> RUN
    SVC --> RUN
    BENCH -->|evaluate| RUN
    INFRA -->|hosts generated/fixture sites| RUN
```
Current maturity
Early-development. Rationale (evidence-based): Phase 1 (scaffold) and Phase 2 (infrastructure —
Compose, Caddy, MariaDB, WP Multisite) are complete and VPS-validated; Phase 2.5 (corpus seeding) is
in progress. The core lab loop — platform adapter (Phase 3), corpus model & validation (Phase 4),
run/provenance capture (Phase 5), and DX (Phase 6) — is not yet built, and the engines that produce
runs are external and separate. So infrastructure is operational but the platform’s central capability
(reproducible runs & evaluation) is still ahead. (Source: README.md phase table, CLAUDE.md status.)
Roadmap
- Near term: finish corpus seeding (2.5) → WordPress adapter (3) → corpus model/validation (4) → immutable runs & provenance capture (5) → developer experience/Makefile (6).
- Future domains (acknowledged, not scaffolded): `experiments/` (compare runs across strategies/models); shared Resource registry (first instance already exists as `reference/`); additional platform adapters (Wix, Squarespace, Shopify, Webflow, static HTML); additional engines (all writing immutable runs). (Source: `docs/roadmap.md`.)
Known limitations
- Crawler not built → Observed assets are `capture_pending`; real inputs cannot yet be captured.
- Core loop unbuilt: adapters, run/provenance capture, and evaluation (Phases 3–5) are pending.
- Engines are external and (at review time) not evidenced in this portal as registered repos.
- Single-org scope: limited to Inexis Digital’s website domain; multi-technology breadth is designed for, but only WordPress is supported today.
- Reproducibility depends on upstream knowledge versioning discipline (mitigated by pinning to immutable refs).
Future opportunities
- Stand up the Experiments domain once multiple runs need cross-comparison.
- Onboard a second platform adapter to prove technology-agnosticism in practice.
- Use the lab to benchmark engine versions at scale, turning capability improvement into a measured pipeline.
- Generalise the provenanced-run pattern as a reusable ecosystem asset (see principles).
Relationship to the wider AI venture ecosystem
This repo is the engineering foundation of the Intelligence Platform (layer 2) for Inexis Digital. It consumes Shared Skills (layer 1) for its own engineering and documentation, and its outputs (measured capabilities, evaluated engines, generated assets) feed Intelligence Products (layer 3) and, above them, Applications, Agents, and Ventures. See the Intelligence Platform page and the Portfolio Overview.
Links
- Source: local git repo `website-intelligence-lab` (private)
- System digest: `intelligence-platform`
- Platform doc: Intelligence Platform
- Registry row: repo registry
- Principles: Architecture Principles