AQEA Gen-5 · Substrate Properties

The six load-bearing properties of the AQEA substrate.

Why Gen-5 is a generation distinction, not an incremental improvement. P1–P6 explained for engineers and technical reviewers.

Read full whitepaper · Verify cross-vendor

P1–P6 · Cross-platform reproducible · Pre-registered hypotheses · Phase-J transparent

P1 · Multi-channel Orthogonal

The substrate is decomposed into a finite, fixed set of channels. Operations on different channel groups are mutually non-interfering: encoding information into one channel does not perturb the content of another. Empirical validation measured zero cross-channel interference. Unstructured float vectors, whatever encoder produced them, do not have this property: every coordinate participates in every operation.

Empirical Anchor

Δ = 0.00 pp (exact spectral-orthogonality)
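The non-interference claim can be pictured with a toy model. The sketch below is an illustration only: the channel names, the disjoint index ranges and the trit values are all assumptions made for this example, since the actual channel decomposition is not disclosed. It contrasts a write into one channel group with an L2-normalised float vector, where changing one coordinate shifts every coordinate.

```python
from math import sqrt

# Hypothetical layout: three channel groups over disjoint index ranges.
CHANNELS = {"text": range(0, 4), "audio": range(4, 8), "meta": range(8, 12)}

def write_channel(substrate, name, trits):
    """Return a copy of `substrate` with one channel group overwritten."""
    idx = CHANNELS[name]
    if len(trits) != len(idx) or any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("expected one trit in {-1, 0, 1} per channel slot")
    out = list(substrate)
    for i, t in zip(idx, trits):
        out[i] = t
    return out

def l2_normalise(vec):
    """Scale a float vector to unit length."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy substrate: writing the "audio" group leaves every other slot untouched.
before = [1, 0, -1, 1, 0, 0, 1, -1, 1, 1, 0, -1]
after = write_channel(before, "audio", [1, 1, -1, 0])
untouched = [i for name, idx in CHANNELS.items() if name != "audio" for i in idx]
changed_outside = sum(before[i] != after[i] for i in untouched)

# Float contrast: changing one coordinate and renormalising moves all of them.
f_before = l2_normalise([0.5, 0.1, -0.3, 0.8])
f_after = l2_normalise([0.5, 0.1, -0.3, 2.0])
float_changed = sum(abs(a - b) > 1e-12 for a, b in zip(f_before, f_after))

print(changed_outside, float_changed)
```

In this toy, `changed_outside` counts slots outside the written group that differ, and `float_changed` counts float coordinates that moved after a single-coordinate update.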

P2 · Byte-Deterministic Cross-Platform

The encoder produces byte-identical output across ARM NEON, x86 AVX-512, NVIDIA Vulkan and Apple Metal — verified across nine independent test fixtures. A customer encodes a corpus once and ships the encoded artefact to any platform without re-validation. This is a substrate-and-encoder property, not a property of the search algorithm running on top of it.

Empirical Anchor

26 / 26 tests PASS cross-platform · 9 / 9 hash-verification fixtures
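A hash-verification fixture of this kind can be sketched in a few lines. The example below is a minimal sketch, not the published test suite: the choice of SHA-256, the platform labels and the stand-in byte strings are assumptions, and in practice the bytes would be the encoder output collected on each platform.

```python
import hashlib

def artefact_digest(data: bytes) -> str:
    """Hex digest of an encoded artefact (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()

def verify_cross_platform(outputs: dict[str, bytes]) -> dict[str, bool]:
    """Compare each platform's bytes against the first platform's digest."""
    digests = {platform: artefact_digest(blob) for platform, blob in outputs.items()}
    reference = next(iter(digests.values()))
    return {platform: digest == reference for platform, digest in digests.items()}

# Stand-in bytes for illustration; the last entry is deliberately corrupted.
outputs = {
    "arm-neon": b"\x01\x00\x02\x01",
    "x86-avx512": b"\x01\x00\x02\x01",
    "nvidia-vulkan": b"\x01\x00\x02\x01",
    "apple-metal": b"\x01\x00\x02\x00",
}
report = verify_cross_platform(outputs)
print(report)
```

A fixture passes only when every platform maps to `True`; a single differing byte, as in the corrupted entry here, fails the comparison.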

P3 · Structural Compression with Ranking Preservation

The encoded representation is approximately 23–29× smaller per document than the float input, and distances computed in the substrate rank documents in the same order as distances in the underlying float space, bit-identically at the Top-K level when the float reference is itself exact. The compression is structural (information reorganised into a channel decomposition), not lossy quantisation.

Empirical Anchor

23–29× compression with bit-identical Top-K
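Both halves of this claim are easy to check from the outside. The sketch below shows one way a reviewer might do it, under stated assumptions: the scores are made-up similarity values (higher is better) rather than real substrate distances, and the 1024-dimension float32 input is a hypothetical size chosen only to make the arithmetic concrete.

```python
def top_k(scores, k):
    """Indices of the k highest scores, ties broken by lower index."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]

def top_k_identical(float_scores, substrate_scores, k):
    """True when both score lists select the same documents in the same order."""
    return top_k(float_scores, k) == top_k(substrate_scores, k)

def encoded_size_range(dim, bytes_per_float=4, ratios=(23, 29)):
    """Raw float size in bytes and the (smallest, largest) encoded size implied by the ratios."""
    raw = dim * bytes_per_float
    low_ratio, high_ratio = ratios
    return raw, (raw / high_ratio, raw / low_ratio)

# Made-up scores on different scales; only the ordering matters.
float_scores = [0.91, 0.15, 0.78, 0.42, 0.66]
substrate_scores = [57, 9, 48, 21, 40]
same_top3 = top_k_identical(float_scores, substrate_scores, k=3)

# Hypothetical 1024-dimension float32 vector.
raw_bytes, (smallest, largest) = encoded_size_range(1024)
print(same_top3, raw_bytes, round(smallest), round(largest))
```

For the assumed 4096-byte input, a 23–29× ratio corresponds to roughly 141 to 178 bytes per document.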

P4 · Encoder-Family-Agnostic

The ranking-preservation property holds across both transformer-learned encoders (BGE for text, WavLM for speech, ESM-2 for protein — three independent transformer families) and classical hand-engineered encoders (FFT-spectral signal-processing pipelines). P4 distinguishes AQEA from compression schemes that exploit transformer-specific output statistics.

Empirical Anchor

13 domains × 6 transformer-encoder families + 4 classical-DSP, all above the 80% floor

P5 · Noise-Resistant Ranking on Signal-Domains

On encoder families that produce noise-bearing float output, specifically classical signal-processing encoders over sensor streams, ranking by substrate distance can exceed the float baseline's nearest-neighbour quality, not merely preserve it. The mechanism: the float mantissa carries high-frequency variance from sensor noise; trit-discretisation collapses this noise floor, so distance computation ranks by signal-relevant differences.

Empirical Anchor

voraus-ad 133% · Digit_Fall 174% R-Ratio (Supra-Trit)
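The noise-collapse mechanism can be illustrated with a generic dead-zone ternary quantiser. This is a stand-in technique, not the AQEA encoder, whose construction is not disclosed; the threshold, the signal and the two noise draws below are assumptions chosen for the example.

```python
def to_trits(vec, threshold):
    """Generic dead-zone ternary quantiser: small values map to 0, others to their sign."""
    return [0 if abs(x) <= threshold else (1 if x > 0 else -1) for x in vec]

# Two readings of the same underlying signal with different small sensor noise.
signal = [0.80, -0.65, 0.02, 0.71, -0.01, -0.90]
noise_a = [0.03, -0.02, 0.04, -0.03, 0.05, 0.01]
noise_b = [-0.04, 0.03, -0.05, 0.02, -0.03, -0.02]
reading_a = [s + n for s, n in zip(signal, noise_a)]
reading_b = [s + n for s, n in zip(signal, noise_b)]

# In float space the two readings differ; after discretisation they coincide.
float_gap = sum(abs(a - b) for a, b in zip(reading_a, reading_b))
trits_a = to_trits(reading_a, threshold=0.2)
trits_b = to_trits(reading_b, threshold=0.2)
trit_gap = sum(ta != tb for ta, tb in zip(trits_a, trits_b))

print(float_gap > 0, trit_gap)
```

In this toy, noise smaller than the threshold never changes a trit, so two noisy readings of the same signal land on the same code. If the input carried no such noise, the quantiser would have nothing to remove and could at best match the float ranking.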

Phase-J · Falsification Box

Mass-Spec: 96.7% PASS-tier, not EXCEEDS.

On mass spectrometry, AQEA reaches 96.7% retrieval equivalence with the float baseline. That is a PASS-tier result, but it does not exceed the baseline as it does on robotic-arm and wearable-fall signals. The reason is in the data: archive pre-processing strips per-bin noise from the mass-spec stream, so there is no high-frequency variance left for trit-discretisation to collapse.

We publish this as a feature, not a weakness. P5's mechanism is well-specified enough that we can name in advance which signal-domains will benefit and which will only match.

P6 · Reversibly Decodable into a Pareto Front

A trainable inverse-decoder reconstructs a float-representation from the substrate, with task-preservation measured against the substrate's own direct-ranking baseline. Mode-selection is a deployment-time choice — the same substrate-encoded artefact is decoded in any mode without re-encoding the corpus.

Empirical Anchor

3-mode Pareto-front + task-elevating regime at msmarco-100k

Audit Mode · ≥ 96.7% · per-vector cosine ≥ 0.90 · audit-trail workloads

General Mode · ≥ 98.7% · per-vector cosine ≈ 0.75 · balanced fidelity / discrimination

Pure Retrieval Mode · ≥ 99.7% · per-vector cosine ≈ 0.65 · msmarco-1M

Task-Elevating Regime · 101.15% · at msmarco-100k, decoded ranking exceeds direct-substrate ranking
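Because mode selection happens at deployment time, it can be treated as a lookup over the published figures. The sketch below is a hypothetical helper, not part of any AQEA SDK: the `DecodeMode` type, the mode identifiers and the selection rule are assumptions, and the approximate cosine values are used as nominal numbers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DecodeMode:
    name: str
    task_preservation_pct: float  # lower bound quoted on this page
    per_vector_cosine: float      # quoted cosine, treated as a nominal value

# Figures copied from the three modes above; identifiers are hypothetical.
MODES = (
    DecodeMode("audit", 96.7, 0.90),
    DecodeMode("general", 98.7, 0.75),
    DecodeMode("pure_retrieval", 99.7, 0.65),
)

def dominates(a: DecodeMode, b: DecodeMode) -> bool:
    """True when `a` is at least as good as `b` on both axes and better on one."""
    at_least = (a.task_preservation_pct >= b.task_preservation_pct
                and a.per_vector_cosine >= b.per_vector_cosine)
    strictly = (a.task_preservation_pct > b.task_preservation_pct
                or a.per_vector_cosine > b.per_vector_cosine)
    return at_least and strictly

def is_pareto_front(modes) -> bool:
    """True when no mode dominates another."""
    return not any(dominates(a, b) for a in modes for b in modes if a is not b)

def pick_mode(min_cosine: float) -> DecodeMode:
    """Highest task preservation among modes meeting a per-vector fidelity floor."""
    eligible = [m for m in MODES if m.per_vector_cosine >= min_cosine]
    if not eligible:
        raise ValueError(f"no mode offers per-vector cosine >= {min_cosine}")
    return max(eligible, key=lambda m: m.task_preservation_pct)

print(is_pareto_front(MODES), pick_mode(0.85).name, pick_mode(0.70).name)
```

The dominance check reflects why the three modes form a front: each trades per-vector fidelity against task preservation, so none is better on both axes.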

Closing

Why this is a generation distinction.

(P1) + (P2) + (P3) define the static substrate. (P4) + (P5) extend it across encoder paradigms and noise-bearing pipelines. (P6) gives the substrate a reversibly-decodable companion that downstream applications use without re-encoding.

A Gen-4 embedder produces an unstructured float vector and inherits its lack of internal channels. AQEA is not a post-processing step on that vector. It is a different output type, produced by an encoder whose codomain has structure that the Gen-4 codomain does not. Multi-channel co-encoding, deterministic content-addressing across hardware vendors, structural compression with bit-identical ranking — all follow from the substrate's structure, not from any single algorithmic trick.

The construction of the substrate, the channel decomposition, the deterministic encoder pipeline and the trainable inverse-decoder are patent-pending and are not disclosed on this page or in the public whitepaper. The intended integration surface for partner engineering teams is a black-box SDK with the encoder and decoder behind a stable API.

Engineering-deep partner conversation?

P1–P6 are pre-registered. The Phase-J record is public. We respond within one business day.
