Architecture & Governance Standards
Technical specifications for Ingestion Protocols, Forensic Linkage Logic, and Audit Data Models. This document is designed for engineering leadership, DevOps, and security review: high precision, auditable linkage, and zero inference in the critical path.
Déjà System Specification & Architecture Reference
Version: 1.0 (Public) • Status: Active • Classification: Technical Documentation
The mathematical principles behind the Deterministic Engine.
Modern incident resolution tools largely rely on Probabilistic Models (LLMs, Vector Embeddings, Cosine Similarity) to link errors to code. These models produce a confidence score rather than a proof: they guess the relationship between a stack trace and a pull request based on semantic similarity.
Déjà rejects this approach for production infrastructure. In the domain of root cause analysis, a "high confidence guess" (e.g., 90% probability) is functionally equivalent to noise. If an operator cannot trust the link implicitly, they must verify it manually. If manual verification is required, the automation has failed.
Match (1): The relationship is proven via strict equality, identical hash digests, or direct Git ancestry.
No Match (0): The relationship cannot be proven.
We do not offer "likely" matches. We offer "proven" matches or silence.
This architectural philosophy is codified in US Patent Application 19/430,349, specifically Claim 4:
- Zero Hallucination: The system cannot invent a link that does not exist.
- Zero Data Leakage: Source code is never passed to a third-party generative model for processing.
- Auditability: Every match can be traced back to a specific line of code, commit hash, or error signature.
The primary failure mode of "AI Ops" tools is Alert Fatigue. By surfacing "likely" root causes, these tools force engineers to spend cognitive energy debunking false positives. Déjà optimizes for Signal-to-Noise Ratio over Recall.
The Probabilistic Approach:
Input: Error: NullPointer in auth.ts
System Action: "This looks 85% similar to a bug in login.ts."
Result: The engineer investigates login.ts, realizes it's unrelated, and loses trust in the tool.
The Déjà Approach (Deterministic):
Input: Error: NullPointer in auth.ts
System Action: No exact match found in the Knowledge Graph.
Result: Silence. The engineer debugs manually. Trust is preserved for the next incident where a proven match is available.
A core challenge in institutional memory is File Identity: keeping track of a file as it moves, renames, or refactors over time.
- Probabilistic Approach: semantic vectors remain "close" after moves; risk: false positives when distinct files share similar logic.
- Deterministic Approach: Déjà uses Git Ancestry Parsing (Lane 1 Ingestion).
- Move Detection: if git log --follow --name-status reports a rename with status R100 (100% content similarity), we link the new path to the old path as an Alias in the Knowledge Graph.
- Copy-Paste-Delete boundary: if Git history is broken, Déjà treats it as a new unrelated file. Without an explicit Git link, assuming identity requires guessing. We do not guess.
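The move-detection rule above can be sketched as a small parser. This is a minimal illustration, assuming the input is the text output of git log --follow --name-status for one file, where rename lines have the form R<score>, old path, new path, separated by tabs. The function name collectAliases, the Alias shape, and the sample paths are hypothetical, and treating only R100 as proven identity is this sketch's reading of the rule.

```typescript
// Hypothetical helper: extract exact-rename (R100) records from the text
// output of `git log --follow --name-status -- <path>`.
interface Alias {
  oldPath: string;
  newPath: string;
}

function collectAliases(nameStatusOutput: string): Alias[] {
  const aliases: Alias[] = [];
  for (const line of nameStatusOutput.split("\n")) {
    const [status, oldPath, newPath] = line.split("\t");
    // Only a 100% similarity rename is treated as proven identity here;
    // partial renames (R087, ...) are skipped rather than guessed at.
    if (status === "R100" && oldPath !== undefined && newPath !== undefined) {
      aliases.push({ oldPath, newPath });
    }
  }
  return aliases;
}

// Sample input (made-up paths and commit ids).
const sample = [
  "commit 1111111",
  "R100\tsrc/auth/login.ts\tsrc/auth/session.ts",
  "commit 2222222",
  "R087\tsrc/legacy/login.js\tsrc/auth/login.ts",
  "commit 3333333",
  "A\tsrc/legacy/login.js",
].join("\n");

// Only the R100 record becomes an alias.
console.log(collectAliases(sample));
```

In this sketch the R087 record is dropped: a partial rename is exactly the kind of "probably the same file" signal that the Copy-Paste-Delete boundary declines to act on.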
Effective institutional memory requires two opposing capabilities: the ability to recall events from years ago (Deep Context) and the ability to react to a crash happening right now (Low Latency).
To achieve this without sacrificing performance, Déjà decouples ingestion into two distinct, parallel architectures: Lane 1 (Historical) and Lane 2 (Real-Time). Both lanes feed into a shared Normalization Engine, ensuring that a bug fixed two years ago generates the exact same fingerprint as a bug occurring today.
- Depth: configurable scan of the default branch history (Standard: 730 days / 2 years).
- Commit Traversal: walks the Git tree in reverse chronological order.
- Artifact Extraction: identifies Resolution Artifacts (PRs referencing issues: "Fixes PROD-123", "Resolves #402").
- Diff Analysis: parses file diffs to map which files changed to fix which issues.
- Performance: high-throughput batch lane; completeness over latency.
- Ingestion Source: HTTP webhooks from monitoring tools.
- Latency SLO: <120ms from receipt to indexing.
- Hostile input stance: applies the Entropy Gate to reject un-hydrated/minified traces before indexing.
- Sanitization: strips environment-specific prefixes (e.g., /var/www/, webpack://) to match canonical paths.
The critical innovation is that normalization happens downstream of the lanes. If Lane 1 reads src/utils/auth.ts from Git history, and Lane 2 receives webpack:///./src/utils/auth.ts from Sentry, they must resolve to the same identifier.
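A minimal sketch of this path normalization, assuming only the two prefixes named above (/var/www/ and webpack://) plus a leading ./ need to be stripped. The function name canonicalPath and the prefix list are hypothetical; a real deployment would likely need a longer, configurable list.

```typescript
// Hypothetical prefix list covering only the examples named in the text.
const ENV_PREFIXES: RegExp[] = [
  /^webpack:\/\/\/?/, // "webpack://" or "webpack:///"
  /^\/var\/www\//,
];

function canonicalPath(raw: string): string {
  let path = raw;
  for (const prefix of ENV_PREFIXES) {
    path = path.replace(prefix, "");
  }
  // Drop any leading "./" left behind by bundler-relative paths.
  return path.replace(/^(\.\/)+/, "");
}

// Lane 1 (Git history) and Lane 2 (webhook) inputs for the same file.
const fromGit = canonicalPath("src/utils/auth.ts");
const fromWebhook = canonicalPath("webpack:///./src/utils/auth.ts");

console.log(fromGit === fromWebhook); // true: both are "src/utils/auth.ts"
```

The sketch does not handle named webpack namespaces (for example webpack://my-app/./src/...); those would need an additional rule.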
Traditional incident response is isolated: a bug is fixed in Service A, but the knowledge remains trapped in that repository. Déjà implements Predictive Immunity (Claim 15): invert the workflow from "Crash → Fix" to "Fix → Broadcast."
The immunity engine operates on a Dependency Graph constructed during Lane 1. It parses manifests (package.json, go.mod, Cargo.toml, pom.xml) to map producer/consumer relationships.
- Trigger: Validation verifies a fix in a producer repo (auth-client-lib fixes TokenRotation leak).
- Traversal: query consumers that import auth-client-lib.
- Version match: check if consumers run a vulnerable version (pre-fix).
- Broadcast: send predictive alert referencing the verified fix.
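The traversal and version-match steps above can be sketched as follows. This is an illustration under simplifying assumptions: consumers pin exact dotted numeric versions (no ranges or pre-release tags), and the repo names, version numbers, and the names ConsumerRepo, isOlder, and exposedConsumers are all hypothetical.

```typescript
// Hypothetical shape: a consumer repo with exact pinned dependency versions,
// as might be read from package.json, go.mod, Cargo.toml, or pom.xml.
interface ConsumerRepo {
  repo: string;
  dependencies: Record<string, string>; // library name -> pinned version
}

// Compare dotted numeric versions ("2.3.0" is older than "2.4.0").
function isOlder(version: string, fixedIn: string): boolean {
  const a = version.split(".").map((part) => Number.parseInt(part, 10));
  const b = fixedIn.split(".").map((part) => Number.parseInt(part, 10));
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x < y;
  }
  return false;
}

// Traversal + version match: consumers importing the library at a pre-fix version.
function exposedConsumers(consumers: ConsumerRepo[], library: string, fixedIn: string): string[] {
  return consumers
    .filter((c) => {
      const pinned = c.dependencies[library];
      return pinned !== undefined && isOlder(pinned, fixedIn);
    })
    .map((c) => c.repo);
}

const consumerRepos: ConsumerRepo[] = [
  { repo: "billing-api", dependencies: { "auth-client-lib": "2.3.0" } },
  { repo: "web-gateway", dependencies: { "auth-client-lib": "2.4.1" } },
  { repo: "report-worker", dependencies: {} },
];

// With a fix released in a hypothetical 2.4.0, only billing-api is exposed.
console.log(exposedConsumers(consumerRepos, "auth-client-lib", "2.4.0"));
```

Each repo returned here would be the target of one predictive alert in the Broadcast step.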
| Status | Definition | Meaning |
|---|---|---|
| Exposed | Service runs code containing a known verified defect pattern or imports a vulnerable dependency version. | No remediation detected. |
| Patched | Service applied the specific code change linked to the verified solution (Git history confirms fix commit or version bump). | Safe state via explicit patching. |
| Immunized | Service moved from Exposed to Patched without ever generating an incident event. | Outage prevented; drives "immunization rate" KPI. |
How we turn raw, noisy stack traces into stable fingerprints.
The engine will not process, index, or store stack traces that reference minified or obfuscated code. Un-hydrated traces are treated as "Data Pollution." If a payload arrives without Source Map expansion (Sentry/Datadog), it is rejected at the edge. We do not guess the original source path.
- Minified artifacts: files matching short randomized patterns (e.g., [a-z]{1,2}\.js) like a.js.
- Hashed filenames: [a-f0-9]{8,}(\.chunk)?\.js (e.g., chunked bundles such as main.8f9a21b4.chunk.js).
- Vendor artifacts: paths strictly within node_modules/, webpack/runtime/, [native code] (unless whitelisted).
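The three rejection classes above can be sketched as a filename classifier. This is a hedged illustration, not the engine's actual rule set: the names GateVerdict and entropyGate are hypothetical, allowlisting is omitted, and the hashed-filename pattern is widened with an optional .chunk segment so that the example main.8f9a21b4.chunk.js matches (that widening is an assumption of this sketch).

```typescript
type GateVerdict = "vendor" | "minified" | "hashed" | "accepted";

function baseName(path: string): string {
  const idx = path.lastIndexOf("/");
  return idx === -1 ? path : path.slice(idx + 1);
}

function entropyGate(filename: string): GateVerdict {
  // Vendor artifacts: node_modules/, webpack/runtime/, or native frames.
  if (
    filename === "[native code]" ||
    /(^|\/)node_modules\//.test(filename) ||
    /(^|\/)webpack\/runtime\//.test(filename)
  ) {
    return "vendor";
  }
  const base = baseName(filename);
  // Minified artifacts: one or two lowercase letters, e.g. a.js.
  if (/^[a-z]{1,2}\.js$/.test(base)) return "minified";
  // Hashed filenames: 8+ hex characters before the extension.
  if (/[a-f0-9]{8,}(\.chunk)?\.js$/.test(base)) return "hashed";
  return "accepted";
}

console.log(entropyGate("a.js"));                             // "minified"
console.log(entropyGate("static/js/main.8f9a21b4.chunk.js")); // "hashed"
console.log(entropyGate("node_modules/express/lib/router.js")); // "vendor"
console.log(entropyGate("src/utils/auth.ts"));                // "accepted"
```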
- Verify Source Maps are generated during CI/CD builds.
- Upload Source Maps to the provider before deployment activates.
- Confirm release version (SHA) matches the uploaded artifacts.
Raw stack traces contain many irrelevant frames. Fingerprinting the top frame blindly leads to unstable matches. The engine traverses frames F0 → Fn and applies filters:
- Skip vendor/middleware: ignore node_modules/, vendor, stdlib internals.
- Skip minified/garbage: ignore frames rejected by the Entropy Gate.
- Anchor selection: stop at the first hydrated User Code frame.
Shared utilities (e.g., src/utils/*) can cause under-segmentation. If the anchor matches a generic utility pattern (*/utils/*, */lib/*, */helpers/*, */shared/*), the engine "peeks" at the caller frame and uses compound anchoring: Caller + Anchor.
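Anchor selection with caller peeking can be sketched as below. The sketch assumes frames are ordered F0 (crash site) to Fn (outermost caller), that the in_app flag marks user code, and that "caller" means the next user-code frame after the anchor. The names Frame, AnchorResult, and selectAnchor are hypothetical, and the Entropy Gate check is left out for brevity.

```typescript
interface Frame {
  filename: string;
  lineno: number;
  inApp: boolean;
}

interface AnchorResult {
  anchor: Frame;
  caller?: Frame; // present only when compound anchoring is active
}

// Generic utility locations that trigger the "peek" at the caller.
const GENERIC_UTILITY = /(^|\/)(utils|lib|helpers|shared)\//;

function isUserCode(frame: Frame): boolean {
  return frame.inApp && !/(^|\/)node_modules\//.test(frame.filename);
}

function selectAnchor(frames: Frame[]): AnchorResult | null {
  const index = frames.findIndex(isUserCode);
  const anchor = index === -1 ? undefined : frames[index];
  if (anchor === undefined) return null; // no hydrated user-code frame
  if (GENERIC_UTILITY.test(anchor.filename)) {
    const caller = frames.slice(index + 1).find(isUserCode);
    if (caller !== undefined) return { anchor, caller };
  }
  return { anchor };
}

const trace: Frame[] = [
  { filename: "node_modules/axios/lib/core.js", lineno: 88, inApp: false },
  { filename: "src/utils/http.ts", lineno: 10, inApp: true },
  { filename: "src/payment/Checkout.ts", lineno: 42, inApp: true },
];

// Anchor is src/utils/http.ts; caller is src/payment/Checkout.ts.
console.log(selectAnchor(trace));
```

Without the peek, every failure that passes through src/utils/http.ts would collapse into one fingerprint regardless of which feature triggered it.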
Identity is defined by hash equality: two events share an identity only when their fingerprints are identical. The canonical fingerprint is generated using SHA‑256:
Fingerprint = SHA256(AnchorFrame + CallerContext + ExceptionInvariant)
- Anchor Frame (Location): normalized file path + line (e.g., src/payment/Checkout.ts:42).
- Caller Context: included only if peeking is active (utility anchor).
- Exception Invariant: sanitized error template (UUIDs/timestamps removed).
Raw errors contain high-entropy segments (UUIDs, timestamps, memory addresses). The engine replaces dynamic values with placeholders via regex sanitizers prior to hashing to ensure stable grouping.
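A minimal sketch of sanitize-then-hash, using Node's built-in crypto module. The three regex sanitizers, the placeholder names, and the newline separator between fields are assumptions of this sketch (the formula above only specifies concatenation), and the function names sanitize and fingerprint are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Illustrative sanitizers for the high-entropy segments named in the text.
const SANITIZERS: Array<[RegExp, string]> = [
  [/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, "<uuid>"],
  [/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?/g, "<timestamp>"],
  [/0x[0-9a-f]+/gi, "<addr>"],
];

function sanitize(message: string): string {
  return SANITIZERS.reduce(
    (msg, [pattern, placeholder]) => msg.replace(pattern, placeholder),
    message,
  );
}

function fingerprint(anchor: string, exceptionType: string, message: string, caller = ""): string {
  const invariant = `${exceptionType}: ${sanitize(message)}`;
  // Fields are joined with "\n" so their boundaries stay unambiguous (assumption).
  return createHash("sha256").update([anchor, caller, invariant].join("\n")).digest("hex");
}

const first = fingerprint(
  "src/payment/Checkout.ts:42",
  "TypeError",
  "Session 123e4567-e89b-12d3-a456-426614174000 expired at 2024-01-01T00:00:00Z",
);
const second = fingerprint(
  "src/payment/Checkout.ts:42",
  "TypeError",
  "Session 9b2e4c1a-0f3d-4e5a-8b7c-1234567890ab expired at 2025-06-30T12:34:56Z",
);

console.log(first === second); // true: only the dynamic values differed
```

Changing the anchor line (for example :42 to :43) yields a different digest, which is why stable path normalization matters before hashing.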
To pass the Entropy Gate, stack traces must be fully hydrated before they reach Déjà. A common CI/CD misconfiguration reverses the order: errors are reported before maps are processed, leading to minified garbage being rejected.
Correct order: Build → Upload Source Maps (wait success) → Deploy.
Déjà is an infrastructure metadata tool, not a user analytics tool. While Déjà drops payloads containing obvious PII patterns, best practice is to scrub upstream.
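```javascript
// Assumes the Sentry Node SDK; use the package that matches your platform.
import * as Sentry from "@sentry/node";

Sentry.init({
  // dsn and other options omitted for brevity.
  // Scrub request bodies and user identifiers before the event leaves the process.
  beforeSend(event) {
    if (event.request && event.request.data) {
      event.request.data = "[Redacted]";
    }
    if (event.user) {
      delete event.user.email;
      delete event.user.ip_address;
    }
    return event;
  },
});
```
Endpoint: POST https://ingest.deja.ai/v1/events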
Auth: Authorization: Bearer <INGESTION_KEY>
The in_app boolean is critical for Anchor Frame logic.
```json
{
  "event_id": "uuid-v4",
  "timestamp": "ISO-8601",
  "platform": "node | python | go",
  "release": "git-sha-hash",
  "exception": {
    "type": "ErrorType (e.g., TypeError)",
    "value": "Sanitized Message Template",
    "stacktrace": {
      "frames": [
        {
          "filename": "src/utils/auth.ts",
          "function": "validateSession",
          "lineno": 42,
          "in_app": true
        }
      ]
    }
  }
}
```
- Scope: metadata/read, contents/read, pull_requests/read (provider dependent).
- Access required: commit hashes, PR titles/bodies, file diffs, git log history, file paths.
- We do not access: .env files, repo settings/admin, unrelated assets; no full repo cloning.
- Scope: contents/write + pull_requests/write for "Auto‑Create Patch" workflows.
- Behavior: never pushes to protected branches; opens PRs on feature branches (e.g., deja/patch-*) for human review.
Purpose: manually mark internal libraries (e.g., src/middleware/logger.ts) as non-actionable to force the engine to look further up the stack. Sentinels prevent "Black Hole" incidents.
```yaml
normalization:
  sentinels:
    - "src/middleware/**"
    - "src/utils/http-client.ts"
```
Purpose: force two distinct fingerprints to merge (e.g., deprecating an old error type that is functionally identical to a new one). Use Case: "Day 100" optimization to clean up the timeline.
```yaml
grouping:
  - target: "src/auth/NewLogin.ts"
    aliases:
      - "src/legacy/OldLoginController.js"
  - target_exception: "StandardError"
    alias_exceptions:
      - "LegacyError"
      - "CustomDbError"
```
Day 100: merge duplicates caused by refactors and broken path ancestry using grouping rules.
Purpose: regex rules to strip non-standard build artifacts or dynamic data from error messages that default sanitizers miss. This is the escape hatch for messy invariants.
```yaml
sanitizers:
  - pattern: "TX-[A-Z0-9]{4}-[A-Z]+"
    replacement: "<tx_id>"
  - pattern: "shard-[a-z]+-[a-z0-9]+"
    replacement: "<shard>"
```
- Be specific: avoid broad patterns that strip meaning.
- Test regex: validate against real logs before deploying.
- Order matters: custom sanitizers apply after built-ins but before hashing.
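The ordering rule (built-ins first, custom rules second, hashing last) can be sketched as below. The two custom rules are the ones from the sanitizers configuration; the single built-in UUID rule is a stand-in, since the built-in set is not listed here, and the names SanitizerRule, applyRules, and toInvariant are hypothetical.

```typescript
interface SanitizerRule {
  pattern: string;
  replacement: string;
}

// Stand-in for the built-in sanitizers (assumption: the real set is larger).
const BUILT_IN: SanitizerRule[] = [
  { pattern: "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", replacement: "<uuid>" },
];

// The two custom rules from the sanitizers configuration.
const CUSTOM: SanitizerRule[] = [
  { pattern: "TX-[A-Z0-9]{4}-[A-Z]+", replacement: "<tx_id>" },
  { pattern: "shard-[a-z]+-[a-z0-9]+", replacement: "<shard>" },
];

function applyRules(message: string, rules: SanitizerRule[]): string {
  return rules.reduce(
    (msg, rule) => msg.replace(new RegExp(rule.pattern, "g"), rule.replacement),
    message,
  );
}

// Built-ins run first, then custom rules; the result is what gets hashed.
function toInvariant(message: string): string {
  return applyRules(applyRules(message, BUILT_IN), CUSTOM);
}

console.log(
  toInvariant("Transfer TX-AB12-USD failed on shard-eu-7f3a for 123e4567-e89b-12d3-a456-426614174000"),
);
// "Transfer <tx_id> failed on <shard> for <uuid>"
```

Running candidate patterns through a function like this against real log lines is one way to follow the "Test regex" advice before deploying a rule.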
- Row-Level Security: PostgreSQL RLS policies scope all queries to workspace_id.
- Cache/queue namespacing: keys prefixed by tenant scope to avoid cross-tenant processing.
- Data residency: US-East-1 default; optional single-tenant EU deployments (eu-central-1) for residency needs.
- Retention: knowledge graph metadata retained for workspace lifetime; raw webhooks retained 7 days for replay/debug then deleted; diff contexts bounded (e.g., 30 days) then purged.
Contractual commitment: Customer Data is never used to train, fine-tune, or improve foundation models for Déjà or any third party. Technical implementation: the deterministic engine uses cryptographic hashing (SHA-256), not probabilistic inference.
- No neural weights: no model to learn from your code.
- Ephemeral processing: diffs processed in memory to generate hashes, then discarded.
- Metadata formats: store file paths/line numbers/invariants, not contiguous corpora for LLM training.
- We store: file paths, commit hashes, function signatures, line numbers, bounded diff fragments.
- We do not store: full file contents, .git directories, full history, unrelated assets.
Déjà does not trust human intent (merging). It trusts system behavior (telemetry). The Rate Gate compares a fingerprint's error rate across a pre-merge window and a post-merge window.
ΔE = (Rate(W_pre) - Rate(W_post)) / Rate(W_pre) * 100
where Rate(W) = fingerprinted_errors / total_traffic
- Threshold: default 95% (configurable).
- Soak Period: default 24 hours; must remain down for full cycle to be Verified.
- Traffic normalization: errors per request, not raw counts.
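A worked sketch of the Rate Gate arithmetic, using the ΔE and Rate(W) definitions above. The names TrafficWindow, errorReduction, and passesRateGate are hypothetical, the traffic numbers are made up, and returning null when the pre-merge rate or post-merge traffic is zero is an assumption of this sketch (ΔE is undefined in those cases). The soak period is not modeled.

```typescript
interface TrafficWindow {
  fingerprintedErrors: number;
  totalTraffic: number;
}

// Rate(W) = fingerprinted_errors / total_traffic
function rate(w: TrafficWindow): number {
  return w.totalTraffic === 0 ? 0 : w.fingerprintedErrors / w.totalTraffic;
}

// ΔE as a percentage, or null when it cannot be evaluated.
function errorReduction(pre: TrafficWindow, post: TrafficWindow): number | null {
  const preRate = rate(pre);
  if (preRate === 0 || post.totalTraffic === 0) return null;
  return ((preRate - rate(post)) / preRate) * 100;
}

function passesRateGate(pre: TrafficWindow, post: TrafficWindow, thresholdPct = 95): boolean {
  const delta = errorReduction(pre, post);
  return delta !== null && delta >= thresholdPct;
}

// 500 errors per 100,000 requests before the merge, 10 per 100,000 after:
// ΔE = (0.005 - 0.0001) / 0.005 * 100, roughly 98%, which clears the 95% default.
const preMerge: TrafficWindow = { fingerprintedErrors: 500, totalTraffic: 100_000 };
const postMerge: TrafficWindow = { fingerprintedErrors: 10, totalTraffic: 100_000 };

console.log(passesRateGate(preMerge, postMerge));
```

Because the rate is normalized per request, a drop in raw error counts caused only by a drop in traffic does not register as a reduction.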
The Revert Penalty monitors candidate solutions for 72 hours post-merge. Reverts are the strongest signal of a failed solution.
- Detection: explicit reverts ("Revert ..."), force pushes removing indexed commits, hard resets.
- Penalty: apply Confidence Penalty (-100) to that solution signature.
- Outcome: incident re-opens; solution marked "Failed Attempt" to prevent repetition.
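Explicit-revert detection within the 72-hour window can be sketched as follows. It relies on the message git revert writes by default, whose body contains "This reverts commit <sha>."; force pushes and hard resets are not covered. The names Commit, revertedSha, and isRevertedWithinWindow, and the sample commits in the usage below, are hypothetical.

```typescript
interface Commit {
  sha: string;
  message: string;
  committedAt: Date;
}

// Default `git revert` bodies contain "This reverts commit <sha>."
const REVERT_BODY = /This reverts commit ([0-9a-f]{7,40})/;
const MONITOR_WINDOW_MS = 72 * 60 * 60 * 1000;

function revertedSha(commit: Commit): string | null {
  const match = REVERT_BODY.exec(commit.message);
  return match?.[1] ?? null;
}

// True if any later commit explicitly reverts the solution within 72 hours.
function isRevertedWithinWindow(solution: Commit, laterCommits: Commit[]): boolean {
  return laterCommits.some((commit) => {
    const target = revertedSha(commit);
    const elapsed = commit.committedAt.getTime() - solution.committedAt.getTime();
    return (
      target !== null &&
      solution.sha.startsWith(target) &&
      elapsed >= 0 &&
      elapsed <= MONITOR_WINDOW_MS
    );
  });
}

const solutionCommit: Commit = {
  sha: "4f2a9c1e".repeat(5), // made-up 40-character SHA
  message: "Fix token rotation leak",
  committedAt: new Date("2024-01-01T00:00:00Z"),
};
const revertCommit: Commit = {
  sha: "0b7d3e55".repeat(5),
  message: `Revert "Fix token rotation leak"\n\nThis reverts commit ${solutionCommit.sha}.`,
  committedAt: new Date("2024-01-02T00:00:00Z"),
};

console.log(isRevertedWithinWindow(solutionCommit, [revertCommit])); // true (24 hours later)
```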
Composite Score (0–100) quantifies reliability: Score_total = S_location + S_invariant + S_validation.
- Location: +40 anchor match; +10 caller context match.
- Invariant: +30 exception type + sanitized message match.
- Validation: +20 prior rate gate pass; -100 if ever reverted.
| Tier | Threshold | Behavior |
|---|---|---|
| Silence | < 70% | No suggestion shown (preserves trust). |
| Possible Match | 70–90% | Displayed but requires human confirmation. |
| Verified Match | > 90% | Auto-displayed as known root cause. |
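The scoring weights and tier thresholds can be combined into a short sketch. The names MatchEvidence, compositeScore, and tierFor are hypothetical, and flooring the result at 0 is an assumption made so that the revert penalty keeps the score inside the stated 0–100 range.

```typescript
interface MatchEvidence {
  anchorMatch: boolean;    // +40
  callerMatch: boolean;    // +10
  invariantMatch: boolean; // +30
  rateGatePassed: boolean; // +20
  everReverted: boolean;   // -100
}

type Tier = "Silence" | "Possible Match" | "Verified Match";

function compositeScore(e: MatchEvidence): number {
  let score = 0;
  if (e.anchorMatch) score += 40;
  if (e.callerMatch) score += 10;
  if (e.invariantMatch) score += 30;
  if (e.rateGatePassed) score += 20;
  if (e.everReverted) score -= 100;
  return Math.max(0, score); // floor at 0 (assumption)
}

function tierFor(score: number): Tier {
  if (score > 90) return "Verified Match";
  if (score >= 70) return "Possible Match";
  return "Silence";
}

const evidence: MatchEvidence = {
  anchorMatch: true,
  callerMatch: false,
  invariantMatch: true,
  rateGatePassed: true,
  everReverted: false,
};

// 40 + 30 + 20 = 90, which sits in the 70–90 band.
console.log(compositeScore(evidence), tierFor(compositeScore(evidence)));
```

With these weights, a score above 90 is only reachable when every positive component is present, including the caller context match, and any reverted solution falls to Silence.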
- Ingesting: signal received, awaiting normalization (<120ms), then dropped or normalized.
- Open: active incident, no known fix detected.
- Candidate Detected: PR merged touching anchor frame file(s).
- Validating (Soak): rate gate active; pauses during low traffic.
- Verified: rate gate passed; becomes institutional memory.
- Regressed: verified fingerprint reappears ("Zombie Bug").
- Immunized: predictive alert patched before crash occurred.
Because "it just works" is a lie in distributed systems.
Cause: stack trace contained minified artifacts; Entropy Gate rejected. Fix: upload source maps before deploy; confirm release tags.
Cause: Sentinel rules too broad, skipping true anchor frame. Fix: refine patterns (avoid swallowing business logic).
Cause: generic utility anchor + dynamic message bypassed sanitizers. Fix: add sentinel for utility or add custom sanitizer to stabilize invariants.
Cause: anchoring on a massive generic file (e.g., app.ts) collapses context. Fix: add file to Sentinels; force deep-linking; long-term refactor.
Cause: exception messages are too generic (e.g., "Failed"). Fix: throw typed errors with descriptive messages to increase invariant entropy.