How DNAi Thinks

An engineering whitepaper on the DNAi cognitive lifecycle. Nine sections walking from the atom of meaning to a fiduciary fleet of eleven public agents, with a tamper evident receipt at every ledger boundary.

Section 1. The Atom

A unit of meaning, with a structure

The atomic unit of DNAi knowledge is the Cognitive Inference Unit, abbreviated CIU. The CIU is the term used in DNAi’s allowed United States patent application 19/290,471 and in the runtime ledger code. A CIU is a propositional commitment, not a free text sentence and not a token, built from a three slot structure called the NFD triad: a Name (the subject), a Form (the predicate type, drawn from a fixed set including declarative, causal, conditional, modal, temporal, and jurisdictional), and a Dharma (the function or lawful constraint the proposition asserts).

Two CIUs that share the same Name, Form, and Dharma collapse to the same content addressed identity. Identity is semantic, not just lexical. The runtime canonicalizes Name through entity resolution against the indexed knowledge base (so aspirin, acetylsalicylic acid, and ASA resolve to one Name) before computing the SHA‑512 digest used as identity and as the input to the parent hash chain in the ledger described in Section 3. Each CIU also carries a tier label (HOT, WARM, or COLD), a confidence falsifiability delta (CFΔ), and a contradiction score, and is created, promoted, demoted, or atrophied as a single addressable entity.
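
The identity rule above can be sketched in a few lines. This is a minimal illustration, not the runtime's code: the alias table stands in for entity resolution against the knowledge base, and the canonical JSON serialization, the `CIU` class, and the `canonical_name` helper are assumptions introduced here.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical alias table; the real runtime resolves names against its knowledge base.
ALIASES = {
    "aspirin": "acetylsalicylic acid",
    "asa": "acetylsalicylic acid",
}

# Forms named in the text; the source says the fixed set "includes" these.
FORMS = {"declarative", "causal", "conditional", "modal", "temporal", "jurisdictional"}


def canonical_name(name: str) -> str:
    """Lowercase, collapse whitespace, then map known aliases to one canonical Name."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, key)


@dataclass(frozen=True)
class CIU:
    name: str
    form: str
    dharma: str

    def __post_init__(self) -> None:
        if self.form not in FORMS:
            raise ValueError(f"unknown Form: {self.form!r}")

    def digest(self) -> str:
        """SHA-512 over a canonical serialization of the NFD triad (assumed format)."""
        payload = json.dumps(
            {"name": canonical_name(self.name), "form": self.form, "dharma": self.dharma},
            sort_keys=True,
            separators=(",", ":"),
            ensure_ascii=False,
        ).encode("utf-8")
        return hashlib.sha512(payload).hexdigest()


a = CIU("Aspirin", "causal", "inhibits COX-1")
b = CIU("ASA", "causal", "inhibits COX-1")
c = CIU("ASA", "conditional", "inhibits COX-1")
```

Under these assumptions `a` and `b` share one digest because their Names resolve to the same entity, while `c` differs because its Form differs.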

Brain biology offers a structural analogue at the population level rather than the single neuron level: the engram, a population of cells co activated during encoding that can be reactivated to evoke a memory. The CIU is the addressable digital counterpart, with deterministic identity via cryptographic hash rather than statistical immediate early gene staining.

The NFD triad. Name, Form, Dharma. The atomic semantic unit of DNAi.
Metaphorical. The NFD triad as propositional atom. Three slot positional encoding with hash based deduplication. The molecular biology analogue (a codon as a three slot positional code) is offered as illustration only.
A Cognitive Inference Unit (CIU). The addressable knowledge cell of DNAi.
Partial. Engram populations are co activated during encoding and can be reactivated to evoke memory (Josselyn and Tonegawa, Science, 2020). The biological method depends on activity dependent staining; the CIU runtime depends on a SHA‑512 content digest.
Section 2. The Embedding

A 1024 dimensional address in semantic space

A 1024 dimensional embedding vector representing one CIU.
Partial. High dimensional, distributed, content addressable representations are an analogue of population coding in cortical and medial temporal areas. The substrates differ. The cortical pattern is sparse spiking; the embedding is a dense float vector.
A multi dimensional tensor of stacked CIU embeddings.
Partial. Stacked CIU embeddings form a tensor that the system reads with linear algebra. Tensor coherence over the resulting subspace is one of the gates a candidate response must pass.

Each CIU also carries a 1024 dimensional, L2 normalized vector. The runtime calls a remote text embeddings server running the mxbai‑embed‑large‑v1 family on a dedicated GPU node, and stores the resulting vector in Qdrant. The same embedder produces the query vector at routing time, so query routing and CIU storage live in one shared semantic space. The vector is the CIU’s address in semantic space; cosine similarity over normalized vectors finds neighbors across hundreds of curated knowledge collections containing more than one hundred million indexed vectors.
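
As a small illustration of why normalization matters here: for L2 normalized vectors the dot product is the cosine similarity, so neighbor search reduces to ranking dot products. The sketch below uses toy 4 dimensional vectors and plain Python in place of the 1024 dimensional embeddings and the Qdrant index; the function names and CIU identifiers are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a, b):
    """For unit vectors, the dot product equals the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def nearest(query, stored, k=2):
    """Return the k stored ids most similar to the query, best first."""
    q = l2_normalize(query)
    scored = sorted(((cosine(q, vec), ciu_id) for ciu_id, vec in stored.items()), reverse=True)
    return [(ciu_id, score) for score, ciu_id in scored[:k]]


# Toy stand-ins for stored CIU vectors (real vectors have 1024 dimensions).
stored = {
    "ciu-a": l2_normalize([1.0, 0.0, 0.0, 0.0]),
    "ciu-b": l2_normalize([0.9, 0.1, 0.0, 0.0]),
    "ciu-c": l2_normalize([0.0, 0.0, 1.0, 0.0]),
}
hits = nearest([2.0, 0.0, 0.0, 0.0], stored)
```

The query is deliberately not unit length; normalizing it first puts it in the same space as the stored vectors, which mirrors the text's point that one embedder serves both storage and routing.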

The cortex represents meaning as an activity pattern across many cells, with any one concept lighting up only a sparse subset of an ensemble and any one cell participating in many concepts. The 1024 dimensional embedding is the digital cousin: high dimensional, distributed, and content addressable. The two substrates differ on the details. Cortical patterns are sparse spike trains; the embedding is a dense float vector. The shared property is that meaning lives in the pattern, and not in any single coordinate.

Section 3. The Ledger

Cells linked by hash, with a verifiable chain

When two CIUs co resolve in the Epistemic Arena (Section 4), the child CIU is bonded to its parents by hash. Two cryptographic structures operate in parallel. First, each CIU is content addressed by a SHA‑512 digest over its NFD payload and stores a parent hash field that links it to the preceding CIU in its lineage. This produces a per CIU hash chain that is checkable in linear time. Second, the runtime maintains a Merkle ledger with two hemisphere roots (propositional and causal) that batches CIU digests into a binary tree with domain separated internal hashes; this produces logarithmic time inclusion proofs (sibling path to root) and an Ed25519 signed root for third party verification. We refer to the combination as the Bonsai Merkle ledger: bonsai for the bounded growth and pruning dynamics, Merkle for the inclusion proofs, ledger for the append only ordering. Modification anywhere in the structure changes a digest and breaks both the chain and the inclusion proof.
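
A minimal sketch of the second structure, under stated assumptions: the one byte prefixes used for domain separation, the rule that an unpaired node is promoted unchanged, and all function names are choices made for this example and are not taken from the runtime. The two hemisphere roots and the Ed25519 signature over the root are omitted to keep the example within the standard library.

```python
import hashlib

LEAF, NODE = b"\x00", b"\x01"  # assumed domain-separation prefixes


def h(prefix: bytes, data: bytes) -> bytes:
    return hashlib.sha512(prefix + data).digest()


def build_levels(leaf_digests):
    """Return every tree level, leaves first, root level last."""
    if not leaf_digests:
        raise ValueError("at least one leaf is required")
    level = [h(LEAF, d) for d in leaf_digests]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(NODE, level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged (assumption)
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Sibling path from a leaf to the root as (sibling_hash, sibling_is_left) pairs."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf_digest, proof, root):
    """Recompute the root from one leaf and its sibling path."""
    acc = h(LEAF, leaf_digest)
    for sibling, sibling_is_left in proof:
        acc = h(NODE, sibling + acc) if sibling_is_left else h(NODE, acc + sibling)
    return acc == root


digests = [hashlib.sha512(f"ciu-{i}".encode()).digest() for i in range(5)]
levels = build_levels(digests)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
```

The proof length grows with the tree height rather than the leaf count, which is the logarithmic time property the text refers to; a verifier needs only the leaf digest, the sibling path, and the root.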

The biological inspiration is Hebbian plasticity and its temporally precise refinement, spike timing dependent plasticity. The Merkle hash chain is the cryptographic counterpart. Plasticity is continuous, decaying, and biochemical. The ledger is discrete, immutable, and cryptographic. Both share the property of history preserving binary coupling, where a unit’s identity carries a record of who it bonded with.

A Bonsai Merkle ledger node with cryptographic parent hash chaining.
Metaphorical. Spike timing dependent plasticity is the biological analogue (Markram et al., Science, 1997). The Merkle parent hash chain is the cryptographic counterpart, and the two operate on different substrates with different invariants.
An engineering observation, scoped narrowly. Per ledger event auditability has no biological analogue. Every CIU mint, promotion, demotion, and parent link is verifiable from a single Merkle root. This is a property of the engineering, and is not a claim that DNAi reproduces or replaces biological cognition.
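
The per CIU parent hash chain can be illustrated with a linear time check. This is a sketch under assumptions: the record layout, the all zero genesis value, and the way the parent link is bound to the content digest are invented for the example, and the event payloads are placeholders.

```python
import hashlib
import json

GENESIS = "0" * 128  # assumed parent value for the first record in a lineage


def content_digest(nfd: dict) -> str:
    """SHA-512 over a canonical JSON form of the NFD payload (assumed serialization)."""
    raw = json.dumps(nfd, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha512(raw).hexdigest()


def link_hash(content: str, parent: str) -> str:
    """Bind a record to its predecessor by hashing parent link and content together."""
    return hashlib.sha512((parent + content).encode("ascii")).hexdigest()


def append(chain: list, nfd: dict) -> None:
    parent = chain[-1]["link"] if chain else GENESIS
    content = content_digest(nfd)
    chain.append({"nfd": nfd, "content": content, "parent": parent,
                  "link": link_hash(content, parent)})


def verify_chain(chain: list) -> int:
    """Walk the chain once; return the index of the first broken record, or -1."""
    parent = GENESIS
    for i, rec in enumerate(chain):
        if (rec["parent"] != parent
                or rec["content"] != content_digest(rec["nfd"])
                or rec["link"] != link_hash(rec["content"], rec["parent"])):
            return i
        parent = rec["link"]
    return -1


chain = []
for dharma in ("minted", "promoted to HOT", "demoted to WARM"):
    append(chain, {"name": "acetylsalicylic acid", "form": "declarative", "dharma": dharma})
intact = verify_chain(chain)

chain[1]["nfd"]["dharma"] = "promoted to COLD"  # simulate tampering with one record
broken_at = verify_chain(chain)
```

Editing any stored payload changes its recomputed digest, so the walk stops at the first record that no longer matches what was committed.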
Section 4. The Tournament

An Epistemic Arena, ordered by selection

Every query triggers a tournament. Three streams of CIUs enter the Epistemic Arena in parallel. The retrieved stream is drawn from the curated Qdrant collections by vector search. The ephemeral stream is chunked from the inference time context. The inference stream is produced by the Resolve(P) Socratic step inside the cognition loop. Each stream is scored by authority, coherence, entailment, threshold, reranking, and diversity gates. Survivors reach the verbalization context. Losers atrophy under the epistemic thermodynamics axiom.
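
The gated competition can be sketched as a sequence of predicates applied to candidates from all three streams. Only three of the six gates named above are modeled, and the thresholds, field names, and `run_arena` function are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    ciu_id: str
    stream: str  # "retrieved", "ephemeral", or "inference"
    authority: float
    coherence: float
    entailment: float


# Hypothetical thresholds; the production gates and their order are not specified here.
GATES: list[tuple[str, Callable[[Candidate], bool]]] = [
    ("authority", lambda c: c.authority >= 0.5),
    ("coherence", lambda c: c.coherence >= 0.6),
    ("entailment", lambda c: c.entailment >= 0.4),
]


def first_failed_gate(candidate: Candidate) -> Optional[str]:
    """Return the name of the first gate the candidate fails, or None if it passes all."""
    for name, passes in GATES:
        if not passes(candidate):
            return name
    return None


def run_arena(candidates):
    """Split candidates into survivors and losers, recording why each loser fell."""
    survivors, losers = [], {}
    for c in candidates:
        failed = first_failed_gate(c)
        if failed is None:
            survivors.append(c)
        else:
            losers[c.ciu_id] = failed
    return survivors, losers


candidates = [
    Candidate("ciu-r1", "retrieved", 0.9, 0.8, 0.7),
    Candidate("ciu-e1", "ephemeral", 0.3, 0.9, 0.9),
    Candidate("ciu-i1", "inference", 0.7, 0.7, 0.2),
    Candidate("ciu-r2", "retrieved", 0.6, 0.5, 0.9),
]
survivors, losers = run_arena(candidates)
```

Recording the failing gate for each loser keeps the outcome explainable: every exclusion can be traced to one named rule.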

The structural inspiration is Gerald Edelman’s Theory of Neuronal Group Selection. In that theory, the brain develops, learns, and reasons by selectional competition between neuronal groups, with three properties: variation, differential reinforcement, and reentrant signaling. The Epistemic Arena maps the same three properties: variation is the three input streams, differential reinforcement is the merit scoring and arena gates, and reentry is the cognition loop’s recurrence. The mapping is structural, and the engineering substrate (cryptographic ledger, vector search, deterministic gates) does not reproduce biological selection itself.

The Epistemic Arena. Three streams of CIUs compete under merit scoring and a sequence of gates.
Partial. Epistemic Arena and Theory of Neuronal Group Selection share three load bearing properties: variation, differential reinforcement, reentry. The arena is a deterministic gated competition over discrete, hash addressable units; biological selection runs over distributed populations of cells.
Section 5. The Live Pipeline

Seven phases, each one bounded and observable

The cognition loop ships in production today as a Server Sent Event stream. Every phase emits a named status event so the client can show progress, and every phase is wrapped in an explicit timeout so a slow upstream cannot freeze the response. The phases below are the names emitted by the live POST /api/query/stream endpoint.

Phase 1. Session
Session resolved
Auth, tenant, and per user quota are resolved before any retrieval begins.
Phase 2. Retrieval
Knowledge fan out
The Knowledge Integration Layer fans out across hundreds of Qdrant collections, bounded by KIL_RETRIEVE_DEADLINE_SEC (default 25 seconds). The completion event carries the evidence count.
Phase 3. Arena
Epistemic Arena
When the reasoning layer is enabled and the task gravity is above the configured threshold, the arena scores candidates, bounded by ARENA_COMPETE_DEADLINE_SEC (default 15 seconds), with fallback to pre arena evidence on timeout.
Phase 4. Synthesis
Verbalization start
The selected language model receives the assembled context. The current production verbalization tier uses Vertex AI Gemini, with adapter level timeouts.
Phase 5. Stream
Token stream
Token level chunks stream to the client. A heartbeat fires every 8 seconds of silence so the client sees liveness during long synthesis.
Phase 6. Sentence
Sentence boundaries
Each sentence boundary is announced so the client can render rolling output, and the inter chunk silence timer (90 seconds, client side) resets on every event.
Phase 7. Receipt
Metadata and quota
Final metadata, sanitized through an allowlist, is emitted at the end together with the per user quota state. The terminator marks the end of the stream.
Why each phase is timed and named. A pipeline that is observable end to end is a pipeline that can be governed. Each phase carries its own bound, its own emitted event, and (where applicable) its own fallback path, so a slow upstream is recoverable rather than fatal.
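
The bound, event, and fallback pattern described above might look like the following sketch. The event names, the `run_phase` helper, and the JSON payload shape are assumptions; the deadlines are scaled down from the 25 and 15 second defaults so the example finishes quickly.

```python
import asyncio
import json


def sse_frame(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame: event name, one JSON data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def run_phase(name, work, deadline_sec, fallback, frames):
    """Run one phase under a deadline, emitting status frames; use fallback on timeout."""
    frames.append(sse_frame(name, {"status": "start"}))
    try:
        result = await asyncio.wait_for(work(), timeout=deadline_sec)
    except asyncio.TimeoutError:
        frames.append(sse_frame(name, {"status": "timeout"}))
        return fallback
    frames.append(sse_frame(name, {"status": "done"}))
    return result


async def retrieve():
    await asyncio.sleep(0.01)  # stands in for the knowledge fan out
    return ["evidence-1", "evidence-2"]


async def slow_arena():
    await asyncio.sleep(1.0)  # stands in for an arena that misses its deadline
    return ["evidence-2"]


async def pipeline():
    frames = []
    evidence = await run_phase("retrieval", retrieve, 0.5, [], frames)
    # On arena timeout, fall back to the pre arena evidence, as the Phase 3 text describes.
    ranked = await run_phase("arena", slow_arena, 0.05, evidence, frames)
    return ranked, frames


ranked, frames = asyncio.run(pipeline())
```

Because the fallback is passed in per phase, a timeout degrades the answer rather than failing the request, and the client still sees a named event explaining what happened.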
Section 6. The Encephalic Plate

An anatomical study of the cognition lifecycle

The video at the top of this page is the unified plate of the cognition lifecycle. The companion to this whitepaper is the encephalic plate, which renders the same architecture as a traversal across labeled cortical regions, with each phase tagged to the brain region whose function it most resembles and a strength badge on every analogy: Exact, Partial, or Metaphorical. The plate is a diagram, not an isomorphism — a study aid in the lineage of Gray’s Anatomy and the von Economo cytoarchitectonic atlases, not a claim that the system is a brain.

The encephalic plate is the first installment in a forthcoming Biological Pipelines of DNAi series — sequential, segmented physiological systems whose discipline of inputs, intermediates, and outputs offers more rigorous explanatory traction than closed-loop metaphors. The nephron, the coagulation cascade, and the hepatocyte detoxification pipeline are next.

Section 7. The Fiduciary Fleet

Eleven public agents, one shared engine, one shared ledger

DNAi runs eleven public agents on top of one shared cognition engine and one shared Merkle ledger. Each agent has its own knowledge boost weights, its own routing scope, and its own published Agent Card, and every agent uses the same retrieval, arena, synthesis, and audit subsystems. Cross agent communication runs over the A2A v1.0 protocol, with per call audit records written to the Postgres a2a_audit_log table and per CIU lineage written to the Bonsai Merkle ledger.

The system is built around a second order property: it observes its own outputs. The merit score on every CIU is computed from prior arena outcomes, the contradiction flux delta is updated as new evidence arrives, and a self critique pass re reads the engine’s own answer before commitment. The cybernetics literature has a long tradition of describing such configurations.

The runtime exposes the observation as data, and the ledger makes the act of observation itself auditable.

  • Asha. Medical intelligence
  • Harley. Fitness coaching
  • Artha. Financial analysis
  • Sage. Nutrition and wellness
  • Polymath. Math and science verification
  • Lyra. Medical research
  • Leo. Legal aid
  • Mira. Marketing and growth
  • Ren. Customer support
  • Arohi. Practice management
  • Ray. Platform architecture
Section 8. Honest Gaps

Where the analogy is loose, we say so

Every analogy on this page carries a strength badge. Exact means the engineering and the cited science share the same load bearing properties, scoped narrowly. Partial means the function maps even when the substrate differs. Metaphorical means the analogy is illustrative, and not a load bearing claim.

Where the analogy is metaphorical

  • NFD triad and the codon (Section 1). A codon is a three slot positional code in molecular biology; the NFD triad is a three slot positional code for propositions. The shared property is positional encoding with a fixed alphabet. The substrates have nothing else in common, and the analogy is offered as illustration.
  • Cortical mini column and the engineering module (cross cutting). Columnar organization is well documented for primary sensory cortex. The literature on universal mini columns across all neocortex is contested. DNAi’s modular CIU and per agent organization is closer to cell assembly models than to a strict mini column substrate.

Where the engineering scope is narrower than biology

  • Per event auditability (Section 3). The Bonsai Merkle ledger gives per CIU cryptographic provenance. This is an engineering property of the ledger, and is reported as such. It does not imply that biological memory is unauditable in any meaningful sense, and it does not imply that DNAi cognition exceeds biological cognition along any other dimension.
  • Selection events that are individually addressable (Section 4). Every Epistemic Arena outcome is an addressable, replayable record with a parent hash chain. Biological selection operates over populations of cells with different invariants. The engineering record is a record of engineering events, and the analogy to biological selection is structural.

Where references on this page require principal verification

  • Citation receipts. Every citation on this page has been verified against the deduplicated reference list of the corresponding manuscript. Quantitative bibliometrics (citation counts, journal ranks) are not reported on this page because they cannot be regenerated from the engineering ledger and require external verification by the principal.
Section 9. What This Is Built On

Receipts beneath the architecture

  • Patent. United States patent application 19/290,471 was allowed in October 2025 with thirty (30) original claims; the issue fee was paid in January 2026. Two United States provisional applications are in force: 63/892,152 (filed October 2, 2025; attorney docket 1766‑002USP1) and 397222‑7002P1 (filed May 1, 2026; attorney docket 1766‑003USP1). Outside counsel is Saul Ewing LLP. Conversion windows close October 2, 2026 and May 1, 2027 respectively. Specific claim coverage is determined by the file wrapper of each application. Issuance status as of May 6, 2026: pending grant publication; once issued, the granted United States patent number will be posted here and in the Asha model card.
  • Co principals. DNAi Systems is co founded by Deepan Singh, MD, FAPA and Paridhi Anand, MD, both practicing physicians.
  • Memory. Hundreds of curated knowledge collections holding more than one hundred million indexed vectors across medical, financial, legal, scientific, fitness, and nutrition domains.
  • Inference. Synthesis runs on Vertex AI Gemini for user facing responses, and on a self hosted vLLM endpoint (Qwen2.5 32B AWQ on a single high VRAM GPU) for internal cognition, CIU minting, and Resolve(P). Embeddings run on a remote Hugging Face Text Embeddings Inference server, with circuit breaker fallbacks.
  • Storage. Qdrant for vector memory, PostgreSQL for the conversation archive and audit log, Redis for ephemeral session and rate limit state. Off volume backup is scheduled to a Synology NAS over Tailscale and LAN.
  • Governance. A lawful axiom framework governs CIU minting, promotion, and demotion, with a Z-arbitration layer for axiom conflicts. The framework is the same for every public agent.
  • Interoperability. All eleven public agents are reachable via the A2A v1.0 protocol with per call audit records and a published Agent Card.
  • Privacy and regulatory posture. The Asha medical agent runs on Vertex AI Gemini under a Google Cloud project that holds Vertex AI HIPAA credentials, deployed behind a Business Associate Agreement. Asha is an information service and is not a registered medical device under FDA Software as a Medical Device (SaMD) classification; it does not diagnose, treat, or prescribe. Citation validation, abstention calibration, and a structured refusal posture are part of the cognition pipeline (Section 5). External certifications (for example SOC 2) are scoped on a tenant by tenant basis and are out of scope of this page.
  • Curation. Each agent declares its own knowledge boost weights in a versioned agent.yaml. For Asha, clinical reference (StatPearls), structured drug labels (DailyMed), and clinical guidelines carry the highest weights, with PubMed abstracts and OpenAlex top cited papers next. The weights are committed to the source repository and reviewed when ingesting new collections.
  • Model card. The Asha model card, including the cohort, the cognition pipeline contract, the audited benchmarks, and the Sacred Refusals, lives in the source repository at Docs/MODEL_CARD.md and at Docs/ASHA_MODEL_CARD.md. A summarized public version is hosted at askasha.org.
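
To make the Curation item above concrete, here is one way per agent boost weights could reorder retrieval hits. Everything in it is an assumption made for illustration: the collection keys, the numeric weights, and the multiplicative scoring rule are not taken from agent.yaml, and only the relative ordering (clinical reference, drug labels, and guidelines above PubMed abstracts and OpenAlex papers) follows the text.

```python
# Hypothetical weights; the real values are declared in each agent's versioned agent.yaml.
ASHA_BOOSTS = {
    "statpearls": 1.5,
    "dailymed": 1.5,
    "clinical_guidelines": 1.5,
    "pubmed_abstracts": 1.2,
    "openalex_top_cited": 1.2,
}
DEFAULT_BOOST = 1.0  # assumed weight for collections the agent does not list


def apply_boosts(hits, boosts, default=DEFAULT_BOOST):
    """Rank (collection, ciu_id, similarity) hits by similarity times collection boost."""
    scored = [(sim * boosts.get(coll, default), coll, ciu_id) for coll, ciu_id, sim in hits]
    scored.sort(reverse=True)
    return [(ciu_id, coll, round(score, 4)) for score, coll, ciu_id in scored]


# Toy hits: the unlisted collection has the best raw similarity.
hits = [
    ("general_web", "ciu-1", 0.80),
    ("statpearls", "ciu-2", 0.60),
    ("pubmed_abstracts", "ciu-3", 0.70),
]
ranked = apply_boosts(hits, ASHA_BOOSTS)
```

In this toy run the curated clinical reference outranks a closer match from an unlisted collection, which is the intended effect of weighting by source.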
Disclosures and limits. This page describes engineering structure and intent. It is not a benchmark report. Where this page mentions a measurement, the measurement is described qualitatively and the underlying artifact is preserved internally for verification. Numerical accuracy and benchmark figures are reported separately in the audited model cards and patent filings, and are not posted here.