Website Intelligence Lab — Repo Digest

This is the foundational engineering repository of the Intelligence Platform layer for Inexis Digital. It is documented as a platform component; the platform-level view (with executive, container, component, and runtime diagrams) is in architecture/intelligence-platform.md.

What it is

An experimentation and evaluation platform — a “scientific instrument” whose purpose is not hosting websites but enabling repeatable, measurable improvement of website assessment, migration, and optimisation capabilities across website technologies. Websites are subjects of study; WordPress is the first backend; measurement is the point. (Source: README.md, docs/philosophy.md.)

Why it exists

Inexis Digital’s value depends on engines (Website Assessment, Migration, Proposal, AI Search) getting measurably better over time. You cannot improve what you cannot reproduce and compare. The lab exists to provide the reproducible harness, the corpus of subject businesses, and the immutable record of every experiment so that any engine version can be objectively evaluated against prior versions — answering questions like “Did Migration Engine v9.4 beat v9.3, and can we prove exactly why?” It is deliberately not the engines themselves (they live in separate repos and run against the lab) and not a production host. (Source: CLAUDE.md “What this repository is / is NOT”.)

At a glance

Field	Value
Slug	`website-intelligence-lab`
System	`intelligence-platform`
Architecture layer	`Intelligence Platform` (layer 2)
Owner org	Inexis Digital (`inexisdigital.com.au`)
Lifecycle	`active`
Maturity	`early-development` (infra operational; core loop not yet built)
Repo type	Git repo, MADR-style ADRs, phase-gated
Phase status	1 ✅ · 2 ✅ (VPS-validated) · 2.5 🔄 in progress · 3–6 ⏳ pending
Last reviewed	2026-07-05

Business capability provided

Reproducible measurement & evaluation of website-intelligence capabilities. It gives Inexis Digital an objective, provenanced basis to evolve its service engines, backed by a curated corpus of real and synthetic businesses and an immutable experiment history.

Technical responsibilities

Per CLAUDE.md, the lab owns exactly three things and explicitly disclaims the rest:

The infrastructure that hosts generated and test websites.
The corpus of businesses those websites belong to.
The immutable record of every experiment run against them.

It does not own: the engines (separate repos), production hosting, or knowledge content (referenced, never copied).

Core concepts

The domain model (stated verbatim in CLAUDE.md, detailed in docs/domain-model.md):

Concept	Meaning
Capability	Declarative service taxonomy — what Inexis knows how to do. Owns no execution.
Business	The subject: `origin: real \| synthetic`.
Digital Asset	Two orthogonal axes: category (`observed \| reference \| generated`) × role (`input \| target \| output \| fixture`).
Case	The unit of work & evaluation (supersedes Project). Selects input (observed) + optional target (reference); references Capabilities; accumulates Runs.
Run	One immutable, fully-provenanced experiment execution. Re-execution mints a new run-id.
Reference asset	Curated, versioned best-practice exemplar (Astra, Kadence, GeneratePress). Owned by no business.
Benchmark	Versioned dataset of Cases, tiered by observed-input quality (Gold/Good/Average/Poor).

The single most important split: authored inputs (corpus/) and generated outputs (runs/) never share a folder.
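As a rough illustration of the concepts above, the two Digital Asset axes and the Case selection rule can be sketched with frozen dataclasses. This is a minimal sketch, not the lab's implementation: the class names, field names, and validation messages are assumptions, and the authoritative definitions are in docs/domain-model.md and the YAML schemas in docs/contracts/.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Category(Enum):
    OBSERVED = "observed"
    REFERENCE = "reference"
    GENERATED = "generated"


class Role(Enum):
    INPUT = "input"
    TARGET = "target"
    OUTPUT = "output"
    FIXTURE = "fixture"


@dataclass(frozen=True)
class DigitalAsset:
    asset_id: str        # hypothetical identifier field
    category: Category   # axis 1: where the asset comes from
    role: Role           # axis 2: how a Case uses it


@dataclass(frozen=True)
class Case:
    case_id: str                      # hypothetical identifier field
    capabilities: Tuple[str, ...]     # Capability slugs referenced by the Case
    input_asset: DigitalAsset         # must be an observed asset
    target_asset: Optional[DigitalAsset] = None  # optional reference asset

    def __post_init__(self) -> None:
        # A Case selects an observed input and, optionally, a reference target.
        if self.input_asset.category is not Category.OBSERVED:
            raise ValueError("a Case input must be an observed asset")
        if (self.target_asset is not None
                and self.target_asset.category is not Category.REFERENCE):
            raise ValueError("a Case target must be a reference asset")
```

Because the two axes are separate enums, any category can in principle pair with any role; the Case constructor is the one place in this sketch where a pairing is constrained.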

Key workflows

Author a subject — define a Business (business.yml), register its Observed assets (real presence, referenced + snapshotted, never rehosted; capture_pending until the crawler exists).
Define a Case — reference Capabilities, select an Observed input and optional Reference target.
Execute a Run — an external engine runs against the lab, producing immutable Generated assets in runs/<business>/<case>/<run>/ with complete provenance (engine+version, model, prompt/knowledge versions, config, input lineage).
Evaluate — score engine versions against Benchmarks tiered by input quality; diff runs to explain differences.
Reference knowledge — external intelligence is pinned by immutable ref in knowledge/, never copied.
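The "Execute a Run" and "Evaluate" steps above can be sketched as follows. This is an illustration under stated assumptions: the manifest file name (`manifest.json`), the use of JSON rather than the repo's YAML contracts, the provenance field names, and the run ids are all hypothetical (the run-id scheme is still Proposed in ADR-0007). Only the `runs/<business>/<case>/<run>/` layout comes from the text.

```python
import json
from pathlib import Path


def run_dir(root: Path, business: str, case: str, run_id: str) -> Path:
    """Build the runs/<business>/<case>/<run>/ path described above."""
    return root / "runs" / business / case / run_id


def write_manifest(root: Path, business: str, case: str,
                   run_id: str, provenance: dict) -> Path:
    """Record a run's provenance once; refuse to reuse an existing run directory."""
    target = run_dir(root, business, case, run_id)
    # exist_ok=False raises FileExistsError if this run directory already exists,
    # so a re-execution has to supply a new run id instead of overwriting.
    target.mkdir(parents=True, exist_ok=False)
    manifest = target / "manifest.json"  # hypothetical file name
    manifest.write_text(json.dumps(provenance, indent=2, sort_keys=True))
    return manifest


def diff_provenance(a: dict, b: dict) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Diffing two manifests shows which provenance fields changed between runs (for example, only the engine version), which is the kind of evidence needed to explain why two runs differ.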

Technologies used

Host/infra: Docker + Docker Compose (project wil), Ubuntu 24.04.
Edge/TLS: Caddy 2.11.4 + Cloudflare DNS plugin v0.2.4 (wildcard TLS via DNS-01), Let’s Encrypt/ACME.
Backend: WordPress 6.9-php8.3-apache Multisite (subdomain mode), WP-CLI, MariaDB 11.4.12, phpMyAdmin.
DNS: Cloudflare (wp-lab.inexisdigital.com.au).
Contracts: YAML schemas in docs/contracts/; scripts favour POSIX sh + a Makefile entrypoint.
Builds are pinned/deterministic (ADR-0010).

Major modules / components

The repo is organised into domains separated by change-rate, owner, and responsibility:

Domain	Path	Responsibility	Change-rate
Infrastructure	`infrastructure/`	The host: Docker, Caddy, MariaDB, WP Multisite	Rarely
Platforms	`platforms/`	Technology adapters + adapter contract (WordPress = reference)	Per new tech
Capabilities	`capabilities/`	Service taxonomy registry (declarative)	Occasionally
Reference	`reference/`	Curated, versioned best-practice targets	Occasionally
Corpus	`corpus/`	Businesses, Observed assets, Cases, benchmarks (INPUTS)	Continuously
Runs	`runs/`	Immutable, provenanced experiment records (OUTPUTS)	Constantly
Knowledge	`knowledge/`	Pinned references to external knowledge repos	Independently
Services	`services/`	Future composable compute (crawler, screenshot…)	Additively

Capabilities

reproducible-experiment-harness — immutable, provenanced run execution & comparison
corpus-management — real/synthetic businesses, observed assets, cases, benchmarks
evaluation-benchmarking — quality-tiered benchmark datasets for engine evaluation
reference-catalog — shared versioned best-practice targets
capability-taxonomy — declarative registry of Inexis services
knowledge-referencing — pinned external-knowledge boundary
platform-adapter-contract — technology-agnostic provision/teardown/snapshot interface

Upstream dependencies

Dependency	Type	Version	Notes
External engines (Assessment, Migration, Proposal, AI Search)	internal-repo (separate)	n/a	Run against the lab; produce runs
External knowledge repositories	internal-repo (by reference)	pinned SHA/hash	Never copied in (ADR-0006)
Docker / Docker Compose	platform	—	Host substrate
Caddy + Cloudflare DNS plugin	external-package	2.11.4 / v0.2.4	Pinned together (ADR-0010)
WordPress + MariaDB + WP-CLI	external-package	6.9 / 11.4.12	First backend
Cloudflare	external-service	—	DNS + DNS-01 TLS
Shared Skills	internal-repo	n/a	Used to engineer/document this repo (ecosystem layer 1)

Downstream consumers

The engines — consume the lab as their execution & evaluation harness (they write immutable runs).
Inexis Digital service delivery — capabilities → customer work (assessment, migration, SEO, etc.).
Intelligence products — relationship to the intelproducts repo (intelligence packs) is documented at the platform level; see Intelligence Platform. (Cross-repo linkage recorded there to avoid over-claiming here.)
Higher ecosystem layers (Applications & Agents, Ventures) consume the platform’s outputs — planned.

Major interfaces & integration points

Consumes: external engines (execution), external knowledge (pinned refs), Cloudflare/Docker/WordPress.
Exposes:
- Adapter contract (platforms/README.md) — provision/teardown/snapshot, technology-agnostic.
- Contract schemas (docs/contracts/) — business, digital-asset, case, run-manifest, reference, benchmark, capability — the seams between domains.
- Immutable run manifests — the provenanced record other systems evaluate.
- Capabilities registry (capabilities/registry.yml) — the shared service vocabulary.
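One way to picture a technology-agnostic adapter contract is a structural interface with the three operations named above. The sketch below is an assumption throughout: the method signatures, return values, and the toy `InMemoryAdapter` are invented for illustration, and the actual contract is the one defined in platforms/README.md.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class PlatformAdapter(Protocol):
    """Hypothetical shape of the provision/teardown/snapshot contract."""

    def provision(self, site_id: str) -> str: ...
    def snapshot(self, site_id: str) -> str: ...
    def teardown(self, site_id: str) -> None: ...


class InMemoryAdapter:
    """Toy adapter that only tracks state in a dict (no real platform)."""

    def __init__(self) -> None:
        self.sites: Dict[str, int] = {}  # site_id -> number of snapshots taken

    def provision(self, site_id: str) -> str:
        if site_id in self.sites:
            raise ValueError(f"site already provisioned: {site_id}")
        self.sites[site_id] = 0
        return f"https://{site_id}.example.test"  # placeholder URL

    def snapshot(self, site_id: str) -> str:
        self.sites[site_id] += 1
        return f"{site_id}-snapshot-{self.sites[site_id]}"

    def teardown(self, site_id: str) -> None:
        self.sites.pop(site_id, None)


def provision_and_snapshot(adapter: PlatformAdapter, site_id: str) -> str:
    """Caller code depends only on the contract, not on a specific technology."""
    adapter.provision(site_id)
    try:
        return adapter.snapshot(site_id)
    finally:
        adapter.teardown(site_id)  # always clean up, even if snapshot fails
```

Any class exposing these three methods satisfies the protocol, so a WordPress adapter and a future adapter for another technology could be swapped without changing the calling code.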

Reusable assets exposed to other repositories

The domain model + YAML contract schemas (a reusable pattern for provenanced experiment platforms).
The reference catalog of best-practice targets (Astra / Kadence / GeneratePress, versioned).
The deterministic infrastructure stack (Compose + Caddy + Cloudflare DNS + WP Multisite).
The capabilities taxonomy as the bridge between platform modules and customer work.

Architectural decisions

11 ADRs (MADR-style; 9 Accepted, 2 Proposed) — full index in the repo’s docs/adr/:

0002 Technology-agnostic lab (not a WordPress repo) · 0003 The subject is a Business, not a Website
0004 Runs are immutable & fully provenanced · 0005 Capabilities as a first-class taxonomy
0006 External knowledge by pinned reference · 0009 Reverse-proxy topology + subdomain Multisite
0010 Deterministic image/Caddy builds · 0011 Asset categories, Cases, and the reference domain
0007 Run-id scheme (Proposed) · 0008 Object storage for run blobs (Proposed)

These decisions seed several ecosystem-wide Architecture Principles.

Architecture snapshot

graph TD
    subgraph Inputs["corpus/ — INPUTS (authored, in Git)"]
        BIZ[Businesses + Observed assets]
        CASE[Cases → reference Capabilities]
        BENCH[Benchmarks]
    end
    REF[reference/ — best-practice targets]
    CAP[capabilities/ — service taxonomy]
    KNOW[knowledge/ — external, pinned]
    ENG[[External Engines]]
    SVC[services/ — future compute]
    subgraph Outputs["runs/ — OUTPUTS (immutable, provenanced)"]
        RUN[Runs → Generated assets]
    end
    INFRA[infrastructure/ — Docker · Caddy · WP Multisite]

    CAP -.referenced by.-> CASE
    BIZ --> CASE --> ENG
    REF -->|target| ENG
    KNOW -->|pinned versions| ENG
    ENG --> RUN
    SVC --> RUN
    BENCH -->|evaluate| RUN
    INFRA -->|hosts generated/fixture sites| RUN

Current maturity

Early-development. Rationale (evidence-based): Phase 1 (scaffold) and Phase 2 (infrastructure — Compose, Caddy, MariaDB, WP Multisite) are complete and VPS-validated; Phase 2.5 (corpus seeding) is in progress. The core lab loop — platform adapter (Phase 3), corpus model & validation (Phase 4), run/provenance capture (Phase 5), and DX (Phase 6) — is not yet built, and the engines that produce runs are external and separate. So infrastructure is operational but the platform’s central capability (reproducible runs & evaluation) is still ahead. (Source: README.md phase table, CLAUDE.md status.)

Roadmap

Near term: finish corpus seeding (2.5) → WordPress adapter (3) → corpus model/validation (4) → immutable runs & provenance capture (5) → developer experience/Makefile (6).
Future domains (acknowledged, not scaffolded): experiments/ (compare runs across strategies/models); shared Resource registry (first instance already exists as reference/); additional platform adapters (Wix, Squarespace, Shopify, Webflow, static HTML); additional engines (all writing immutable runs). (Source: docs/roadmap.md.)

Known limitations

Crawler not built → Observed assets are capture_pending; real inputs cannot yet be captured.
Core loop unbuilt: the platform adapter, corpus model and validation, and run/provenance capture (Phases 3–5) are pending, so reproducible runs and evaluation are not yet possible.
Engines are external and (at review time) not evidenced in this portal as registered repos.
Single-org scope: the lab covers only Inexis Digital's website domain. Multi-technology breadth is designed for, but only WordPress is supported today.
Reproducibility depends on upstream knowledge versioning discipline (mitigated by pinning to immutable refs).
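The pinning mitigation in the last item can be checked mechanically. The sketch below assumes a pinned reference is recorded as a full lowercase Git object id (40 hex characters for SHA-1, 64 for SHA-256); the function name and the idea of rejecting branch names, tags, and abbreviated hashes are assumptions for illustration, not a documented rule of the lab.

```python
import re

# Full object ids only: 40 hex chars (SHA-1) or 64 hex chars (SHA-256).
_FULL_OBJECT_ID = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def is_immutable_ref(ref: str) -> bool:
    """True if ref looks like a full commit hash rather than a movable name."""
    return _FULL_OBJECT_ID.fullmatch(ref) is not None


def unpinned(knowledge_refs: dict) -> list:
    """Return the names of knowledge sources whose ref is not a full hash."""
    return sorted(name for name, ref in knowledge_refs.items()
                  if not is_immutable_ref(ref))
```

A check like this only confirms the form of the reference; it cannot confirm that the commit still exists upstream.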

Future opportunities

Stand up the Experiments domain once multiple runs need cross-comparison.
Onboard a second platform adapter to prove technology-agnosticism in practice.
Use the lab to benchmark engine versions at scale, turning capability improvement into a measured pipeline.
Generalise the provenanced-run pattern as a reusable ecosystem asset (see principles).
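To make the benchmarking opportunity above concrete, here is a minimal sketch of comparing two engine versions per input-quality tier. Only the tier names (Gold/Good/Average/Poor) come from the Benchmark definition; the scoring scale, the `(tier, score)` input shape, and both function names are hypothetical, since the lab's evaluation format is not yet built.

```python
from collections import defaultdict
from statistics import mean

TIERS = ("Gold", "Good", "Average", "Poor")  # tier names from the Benchmark concept


def mean_by_tier(results):
    """results: iterable of (tier, score) pairs -> {tier: mean score}."""
    buckets = defaultdict(list)
    for tier, score in results:
        buckets[tier].append(score)
    # Keep the canonical tier order and skip tiers with no results.
    return {tier: mean(buckets[tier]) for tier in TIERS if tier in buckets}


def compare_versions(old_results, new_results):
    """Per-tier change in mean score (new minus old) for tiers present in both."""
    old, new = mean_by_tier(old_results), mean_by_tier(new_results)
    return {tier: new[tier] - old[tier] for tier in TIERS
            if tier in old and tier in new}
```

Reporting the change per tier, rather than one overall average, keeps an improvement on high-quality inputs from hiding a regression on poor ones.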

Relationship to the wider AI venture ecosystem

This repo is the engineering foundation of the Intelligence Platform (layer 2) for Inexis Digital. It consumes Shared Skills (layer 1) for its own engineering and documentation, and its outputs (measured capabilities, evaluated engines, generated assets) feed Intelligence Products (layer 3) and, above them, Applications, Agents, and Ventures. See the Intelligence Platform page and the Portfolio Overview.