Architecture
Five components: a WordPress bridge, a sync worker, a single Postgres instance with pgvector, a Fastify API that runs the retrieve → gate → generate → verify pipeline, and an embeddable chat widget. WordPress remains the source of truth; the RAG store holds embeddings, metadata, and verbatim scripture only.
Fig. 1 — System architecture. Scripture and disclaimers are enforced inside the API pipeline (stages 4–5), after the model and before the client.
Component summary
| Component | Technology | Responsibility |
|---|---|---|
| WP bridge | Small additions to the existing knowledgebase plugin (PHP) | Delta endpoint (modified_after, paginated) + HMAC-signed save_post / trashed_post webhook. Read-only; WP stays the source of truth. |
| Sync worker | TypeScript (Node 22), Docker | Bulk load, 15-min incremental poll with content-hash gating, nightly reconciliation. Cleaning, chunking, scripture extraction, embedding calls. |
| Storage | Postgres 16 + pgvector (halfvec), PgBouncer | Two isolated tenant schemas, each with its own HNSW + GIN indexes. RLS as defense in depth. Holds vectors, metadata, verbatim scripture — not full mirrored content beyond cleaned text. |
| API | Fastify + zod (fastify-type-provider-zod), OpenAPI 3.1, Anthropic TS SDK | The 5-stage answer pipeline; typed-event SSE; abuse stack; per-tenant config (system prompts, disclaimers, thresholds). |
| Widget | Vanilla JS, ~15KB, tolerant reader | Embeds in both WP sites; fetch()+ReadableStream SSE; renders segments, citations, disclaimer; suggested-questions panel. |
| Sidecars | Haiku 4.5 (API), bge-reranker-v2-m3 (self-host) | Safety classification, MPA groundedness judging, cross-encoder reranking. |
Design principles
- Grounding is structural. Scripture verbatim-ness, citations-from-retrieved-set, and the MPA disclaimer are enforced by the data layer and DTO validation — never delegated to the prompt.
- Tenant isolation below the application. Separate schemas with independent indexes; a forgotten WHERE clause cannot leak medical content into the Islamic-guidance brand.
- One store. Vectors live next to metadata in Postgres — no dual-write consistency problem, no second database to operate. Sized for 170k articles / ~350k vectors without structural change.
- Fail closed. Every uncertainty path (low retrieval score, hash mismatch, rail failure, timeout) degrades to the templated "not covered + closest articles" response — never to an ungrounded answer.
Data & Ingestion
Three ingestion modes share one cleaning-and-chunking core. Re-embedding is gated on a SHA-256 of the cleaned text, because WordPress bumps modified on metadata-only saves.
Fig. 2 — Ingestion: bulk load, 15-minute incremental poll with hash gating, nightly reconciliation.
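The hash gate can be sketched as follows. This is a minimal illustration, not the worker's actual code: the `IndexedArticle` shape and the function names are hypothetical, and the digest is held as a hex string here for readability although the schema stores it as `bytea`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical view of an already-indexed article row (names are illustrative).
interface IndexedArticle {
  postId: number;
  contentSha256: string; // hex digest of the cleaned text at last index time
}

// SHA-256 over the cleaned text, hex-encoded.
function sha256Hex(cleanText: string): string {
  return createHash("sha256").update(cleanText, "utf8").digest("hex");
}

// Decide whether a polled article needs re-embedding.
// A metadata-only save bumps `modified` in WordPress but leaves the cleaned
// text, and therefore the hash, unchanged.
function needsReembed(cleanText: string, existing: IndexedArticle | undefined): boolean {
  if (existing === undefined) return true; // never indexed before
  return existing.contentSha256 !== sha256Hex(cleanText);
}

const stored: IndexedArticle = { postId: 42, contentSha256: sha256Hex("Patience in hardship.") };
console.log(needsReembed("Patience in hardship.", stored)); // unchanged text: false
console.log(needsReembed("Patience in hardship!", stored)); // edited text: true
console.log(needsReembed("A new article.", undefined));     // new article: true
```

Hashing the cleaned text rather than the raw post body means changes that only touch stripped markup (for example a swapped CTA shortcode) also skip the embedding call.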
Cleaning & chunking rules
- Publish-only: status='publish' — drops MPA's 14,531 drafts from the index by construction.
- Strippers: [elementor-template] shortcodes (present in nearly every article — CTA widgets, not content), Gutenberg comment markup, spacers.
- Junk denylist: tester authors and zero-count test categories found in the real exports.
- Chunking: ~450–500 tokens, ~50 overlap, on Gutenberg block boundaries; title › h2 › h3 breadcrumb prepended; verse-reference paragraph + its blockquote always kept in the same chunk; 65k-word MPA outliers split on heading boundaries.
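A rough sketch of the stripper step is shown below. The regular expressions are assumptions about how the markup appears in the exports (a standard `[elementor-template ...]` shortcode, `<!-- wp:... -->` block comments, and `wp-block-spacer` divs), and `cleanPostContent` is a hypothetical name. Because chunking runs on Gutenberg block boundaries, a real pipeline would need to record those boundaries before the comments are removed.

```typescript
// Patterns are assumptions about the export markup, not confirmed against it.
const ELEMENTOR_SHORTCODE = /\[elementor-template\b[^\]]*\]/g;
const SPACER_BLOCK = /<div[^>]*class="[^"]*wp-block-spacer[^"]*"[^>]*>\s*<\/div>/g;
const GUTENBERG_COMMENT = /<!--\s*\/?wp:[\s\S]*?-->/g;

// Remove CTA shortcodes, spacer blocks, and Gutenberg comment markup,
// then collapse the blank lines left behind.
function cleanPostContent(raw: string): string {
  return raw
    .replace(ELEMENTOR_SHORTCODE, "")
    .replace(SPACER_BLOCK, "")
    .replace(GUTENBERG_COMMENT, "")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

const sample = [
  "<!-- wp:paragraph -->",
  "<p>First paragraph.</p>",
  "<!-- /wp:paragraph -->",
  '[elementor-template id="4512"]',
  '<!-- wp:spacer {"height":"40px"} -->',
  '<div style="height:40px" aria-hidden="true" class="wp-block-spacer"></div>',
  "<!-- /wp:spacer -->",
  "<!-- wp:paragraph -->",
  "<p>Second paragraph.</p>",
  "<!-- /wp:paragraph -->",
].join("\n");

console.log(cleanPostContent(sample));
```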
Postgres schema (applied to kb_wc and kb_mpa)
    CREATE TABLE article (
      post_id bigint PRIMARY KEY,
      title text NOT NULL,
      slug text NOT NULL,
      url text NOT NULL,
      status text NOT NULL CHECK (status = 'publish'),
      published_at timestamptz,
      modified_at timestamptz NOT NULL,
      author_id bigint,
      reviewer_ids bigint[] DEFAULT '{}',  -- MPA clinician credentials
      category_ids int[] DEFAULT '{}',
      clean_text text NOT NULL,
      word_count int,
      content_sha256 bytea NOT NULL,       -- re-embed gate
      last_indexed_at timestamptz
    );
    CREATE INDEX ON article USING btree (modified_at);
    CREATE INDEX ON article USING gin (category_ids);

    CREATE TABLE chunk (
      id bigserial PRIMARY KEY,
      post_id bigint NOT NULL REFERENCES article ON DELETE CASCADE,
      seq int NOT NULL,
      heading_path text,
      body text NOT NULL,
      token_count int NOT NULL,
      embedding halfvec(1024) NOT NULL,    -- voyage-4
      tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED,
      UNIQUE (post_id, seq)
    );
    CREATE INDEX ON chunk USING hnsw (embedding halfvec_cosine_ops) WITH (m=16, ef_construction=64);
    CREATE INDEX ON chunk USING gin (tsv);

    CREATE TABLE scripture (
      id bigserial PRIMARY KEY,
      post_id bigint NOT NULL REFERENCES article ON DELETE CASCADE,
      chunk_id bigint REFERENCES chunk,
      seq int,
      ref_label text,                      -- "Surah Al Tawbah (9), Verse 128", taken from the PRECEDING paragraph
      quote_html text NOT NULL,            -- byte-for-byte wp-block-quote HTML
      sha256 bytea NOT NULL
    );

    CREATE TABLE ingest_run (
      id bigserial PRIMARY KEY,
      kind text NOT NULL,                  -- 'bulk' | 'poll' | 'reconcile'
      watermark timestamptz,
      upserted int,
      deleted int,
      re_embedded int,
      started_at timestamptz DEFAULT now(),
      finished_at timestamptz,
      error text
    );
Retrieval & Generation
One hybrid SQL query per request — no separate narrowing layer. A calibrated answerability gate sits before the LLM: below threshold, the model is never called and the API returns the templated "not covered + closest articles" response.
Fig. 3 — Request lifecycle. Two exits before the model: the safety classifier (step 2, MPA) and the answerability gate (step 5).
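The answerability gate can be sketched as a pure function over the reranker output. This sketch assumes the gate compares the top reranker score against the per-tenant threshold; the type names, the threshold value, and the choice of three closest articles are illustrative, not taken from the design.

```typescript
interface RerankedChunk {
  chunkId: number;
  postId: number;
  score: number; // calibrated cross-encoder score
}

type GateDecision =
  | { kind: "answer"; context: RerankedChunk[] }
  | { kind: "not_covered"; closestPostIds: number[] };

// Below threshold the caller returns the templated response and never
// invokes the model. An empty result set also fails closed.
function answerabilityGate(
  reranked: RerankedChunk[],
  threshold: number,
  closestCount: number = 3,
): GateDecision {
  const sorted = reranked.slice().sort((a, b) => b.score - a.score);
  const top = sorted[0];
  if (top !== undefined && top.score >= threshold) {
    return { kind: "answer", context: sorted };
  }
  // Collect distinct article ids in rank order for "closest articles".
  const closestPostIds: number[] = [];
  for (const c of sorted) {
    if (closestPostIds.length === closestCount) break;
    if (closestPostIds.indexOf(c.postId) === -1) closestPostIds.push(c.postId);
  }
  return { kind: "not_covered", closestPostIds };
}

const decision = answerabilityGate(
  [
    { chunkId: 1, postId: 10, score: 0.2 },
    { chunkId: 2, postId: 10, score: 0.1 },
    { chunkId: 3, postId: 12, score: 0.05 },
  ],
  0.35, // hypothetical threshold
);
console.log(decision);
```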
The hybrid query
    -- $1 = query embedding (halfvec(1024)), $2 = raw query text
    WITH vec AS (
      SELECT id, post_id,
             row_number() OVER (ORDER BY embedding <=> $1) AS r
      FROM kb_mpa.chunk
      ORDER BY embedding <=> $1
      LIMIT 50
    ),
    fts AS (
      SELECT id, post_id,
             row_number() OVER (ORDER BY ts_rank_cd(tsv, q) DESC) AS r
      FROM kb_mpa.chunk, websearch_to_tsquery('english', $2) q
      WHERE tsv @@ q
      ORDER BY ts_rank_cd(tsv, q) DESC  -- without this, LIMIT keeps an arbitrary 50 rows
      LIMIT 50
    )
    SELECT id, post_id, SUM(1.0 / (60 + r)) AS rrf_score  -- RRF with k = 60
    FROM (SELECT * FROM vec UNION ALL SELECT * FROM fts) fused
    GROUP BY id, post_id
    ORDER BY rrf_score DESC
    LIMIT 20;
- Pipeline numbers: top-50 per arm → RRF fuse (k=60) → top-20 → cross-encoder rerank → 6–8 chunks, max 2–3 per article into generation. Categories are an optional explicit facet only — never an inferred pre-filter.
- "Not covered" gate: reranker-score threshold calibrated per tenant on ~200 labeled in/out-of-domain queries; a structured insufficient_context flag from the model is the second gate.
- Latency budget: embed ~80ms · hybrid query ~30ms · rerank ~120ms · generation 1.5–3s streamed → p95 < 4s, first token < 1.5s on the streaming path.
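To make the fusion and the per-article cap concrete, here is a small in-memory sketch of both steps. In the design the fusion runs inside the SQL query and the cap follows the reranker; the function names and the sample ids below are hypothetical.

```typescript
interface Hit {
  id: number;     // chunk id
  postId: number; // owning article
}

interface FusedHit extends Hit {
  rrfScore: number;
}

// Reciprocal rank fusion: each arm contributes 1 / (k + rank), ranks from 1.
function rrfFuse(arms: Hit[][], k: number = 60, limit: number = 20): FusedHit[] {
  const byId = new Map<number, FusedHit>();
  for (const arm of arms) {
    arm.forEach((hit, index) => {
      const entry = byId.get(hit.id) ?? { id: hit.id, postId: hit.postId, rrfScore: 0 };
      entry.rrfScore += 1 / (k + index + 1);
      byId.set(hit.id, entry);
    });
  }
  return Array.from(byId.values())
    .sort((a, b) => b.rrfScore - a.rrfScore)
    .slice(0, limit);
}

// Keep at most `maxPerArticle` chunks per article, `total` chunks overall,
// preserving the incoming rank order.
function capPerArticle<T extends { postId: number }>(
  ranked: T[],
  maxPerArticle: number,
  total: number,
): T[] {
  const counts = new Map<number, number>();
  const kept: T[] = [];
  for (const item of ranked) {
    if (kept.length === total) break;
    const seen = counts.get(item.postId) ?? 0;
    if (seen >= maxPerArticle) continue;
    counts.set(item.postId, seen + 1);
    kept.push(item);
  }
  return kept;
}

const vecArm: Hit[] = [{ id: 1, postId: 10 }, { id: 2, postId: 10 }, { id: 3, postId: 11 }];
const ftsArm: Hit[] = [{ id: 3, postId: 11 }, { id: 1, postId: 10 }, { id: 4, postId: 12 }];
const fused = rrfFuse([vecArm, ftsArm]);
console.log(fused.map((f) => f.id));                      // chunk 1 leads: 1/61 + 1/62
console.log(capPerArticle(fused, 1, 8).map((f) => f.id)); // second chunk of article 10 dropped
```

With k = 60, a chunk ranked first in one arm and second in the other scores 1/61 + 1/62 ≈ 0.0325, which edges out a chunk ranked third and first (1/63 + 1/61 ≈ 0.0323).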
The verbatim-scripture pipeline
Fig. 4 — Verbatim scripture is a data-layer guarantee: 7,898 Wise Compass articles (90%) contain Gutenberg blockquotes; none pass through the model.
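The placeholder protocol named in the risk register (`[[SCRIPTURE:id]]` placeholders, server-side substitution, hash validation) could look roughly like this. It is a simplified sketch: the row shape, the hex-encoded hash, and the choice to return `null` so the caller falls back to "not covered" are assumptions, and the sample quote is placeholder text rather than real scripture.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory view of a row from the scripture table.
interface ScriptureRow {
  id: number;
  postId: number;
  refLabel: string;
  quoteHtml: string; // byte-for-byte blockquote HTML captured at ingest
  sha256: string;    // hex digest of quoteHtml captured at ingest
}

type Segment =
  | { type: "prose"; text: string }
  | { type: "scripture"; html: string; reference: string; articleId: number };

function sha256Hex(value: string): string {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

// Replace each placeholder with the stored HTML. Returns null (fail closed)
// when an id is unknown or the stored bytes no longer match their hash.
function substituteScripture(modelOutput: string, store: Map<number, ScriptureRow>): Segment[] | null {
  // split() with one capture group alternates: prose, id, prose, id, ...
  const parts = modelOutput.split(/\[\[SCRIPTURE:(\d+)\]\]/);
  const segments: Segment[] = [];
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i] ?? "";
    if (i % 2 === 0) {
      if (part.trim() !== "") segments.push({ type: "prose", text: part.trim() });
      continue;
    }
    const row = store.get(Number(part));
    if (row === undefined || sha256Hex(row.quoteHtml) !== row.sha256) return null;
    segments.push({ type: "scripture", html: row.quoteHtml, reference: row.refLabel, articleId: row.postId });
  }
  return segments;
}

const quoteHtml = '<blockquote class="wp-block-quote"><p>(verbatim quote HTML)</p></blockquote>';
const scriptureStore = new Map<number, ScriptureRow>([
  [7, { id: 7, postId: 314, refLabel: "Surah Al Tawbah (9), Verse 128", quoteHtml, sha256: sha256Hex(quoteHtml) }],
]);
console.log(substituteScripture("The article cites this verse: [[SCRIPTURE:7]] See the source for context.", scriptureStore));
```

Because the model only ever emits the placeholder, the quote itself never appears in output tokens and cannot be paraphrased.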
DTOs & Streams
Contracts defined once as zod schemas → OpenAPI 3.1 → generated client types. Versioned at /v1, additive-only — cached WordPress embeds cannot be force-upgraded.
    import { z } from 'zod';

    const ChatRequest = z.object({
      message: z.string().min(1).max(1000),
      conversationId: z.string().uuid().optional(), // multi-turn (phase 2)
      stream: z.boolean().default(true),
    });
    // Tenant is resolved SERVER-SIDE from X-Site-Key + Origin, never from the body.

    const Segment = z.discriminatedUnion('type', [
      z.object({ type: z.literal('prose'), text: z.string() }),
      // scripture segments are server-substituted and hash-verified
      z.object({
        type: z.literal('scripture'),
        html: z.string(),
        reference: z.string(),
        articleId: z.number(),
      }),
    ]);

    const Citation = z.object({
      articleId: z.number(),
      title: z.string(),
      url: z.string().url(),
      author: z.string().nullable(),
      reviewer: z.string().nullable(), // MPA credentials surfaced
      score: z.number(),
    });

    const ChatResponse = z.object({
      requestId: z.string().uuid(),
      status: z.enum(['answered', 'not_covered', 'safety_redirect']),
      segments: z.array(Segment),
      citations: z.array(Citation),       // non-empty + from retrieved set, or API returns fallback
      closestArticles: z.array(Citation), // populated when not_covered
      disclaimer: z.string().nullable(),  // NON-NULLABLE for MPA; server-populated
      safetyFlags: z.array(z.enum(['emergency', 'crisis', 'dosage_seeking'])),
      usage: z.object({ inputTokens: z.number(), outputTokens: z.number() }),
    });

    // Errors: RFC 9457 application/problem+json
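The two invariants noted in the schema comments (citations drawn from the retrieved set, disclaimer always present for MPA) can be enforced in one post-generation step. The sketch below uses plain TypeScript types with a reduced set of fields so it runs without dependencies; `enforceInvariants` and `TenantConfig` are hypothetical names.

```typescript
type Status = "answered" | "not_covered" | "safety_redirect";

interface CitationRef {
  articleId: number;
  title: string;
  url: string;
}

// Reduced view of the response DTO: only the fields the invariants touch.
interface DraftResponse {
  status: Status;
  citations: CitationRef[];
  closestArticles: CitationRef[];
  disclaimer: string | null;
}

interface TenantConfig {
  requiresDisclaimer: boolean; // true for the medical tenant
  disclaimer: string | null;   // wording owned by the client
}

function enforceInvariants(
  draft: DraftResponse,
  retrievedIds: Set<number>,
  closest: CitationRef[],
  tenant: TenantConfig,
): DraftResponse {
  if (tenant.requiresDisclaimer && tenant.disclaimer === null) {
    throw new Error("tenant requires a disclaimer but none is configured");
  }
  // The server, not the model, decides the disclaimer.
  const disclaimer = tenant.requiresDisclaimer ? tenant.disclaimer : draft.disclaimer;
  if (draft.status !== "answered") return { ...draft, disclaimer };

  const grounded =
    draft.citations.length > 0 &&
    draft.citations.every((c) => retrievedIds.has(c.articleId));
  if (grounded) return { ...draft, disclaimer };

  // Fail closed: an uncited or foreign-cited answer becomes "not covered".
  return { status: "not_covered", citations: [], closestArticles: closest, disclaimer };
}

const mpaConfig: TenantConfig = { requiresDisclaimer: true, disclaimer: "(client-approved disclaimer text)" };
const retrieved = new Set<number>([101, 102]);
const foreign: DraftResponse = {
  status: "answered",
  citations: [{ articleId: 999, title: "Not retrieved", url: "https://example.com/999" }],
  closestArticles: [],
  disclaimer: null,
};
console.log(enforceInvariants(foreign, retrieved, [], mpaConfig).status); // not_covered
```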
Streaming — typed events, tenant-specific release policy
Fig. 5 — SSE event order per tenant. Raw model deltas never reach the client; the widget shows a "checking sources" state during the MPA buffer.
- Transport: POST /v1/chat with Accept: text/event-stream; same endpoint with stream:false returns full JSON. Widget uses fetch()+ReadableStream (EventSource can't POST).
- Connection discipline: PgBouncer transaction mode — retrieve, release the DB connection, then open the Anthropic stream. 15s heartbeats, 30s hard timeout → "not covered" fallback, per-IP concurrent-stream caps, X-Accel-Buffering: no.
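On the widget side, reading SSE over `fetch()` means parsing the event stream by hand. The sketch below shows only the framing step, fed with decoded text chunks; the event names (`segment`, `done`) are illustrative since the document does not list them, and this simplified parser handles `\n` and `\r\n` line endings only.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Incremental SSE frame parser: push decoded text chunks, get complete events.
class SseParser {
  private buffer: string = "";

  push(chunk: string): SseEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const frame = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      let event = "message"; // default event type per the SSE format
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith(":")) continue; // comment line, e.g. a heartbeat
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
      }
      if (data.length > 0) events.push({ event, data: data.join("\n") });
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}

// A frame split across two network chunks, with a heartbeat in between.
const parser = new SseParser();
console.log(parser.push('event: segment\ndata: {"type":"pro'));
console.log(parser.push('se","text":"Hi"}\n\n: ping\n\nevent: done\ndata: {}\n\n'));
```

A tolerant reader would then switch on `event` and ignore any type it does not recognise, which is what lets older cached embeds survive additive changes to the stream.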
API surface
| Endpoint | Purpose |
|---|---|
| POST /v1/chat | Answer path — SSE + JSON |
| GET /v1/articles/:id | Citation hover-preview for the widget |
| POST /v1/feedback | Thumbs up/down keyed by requestId — feeds the eval set |
| POST /v1/ingest/webhook | WP save_post hook, HMAC-signed |
| GET /v1/healthz · GET /v1/admin/sync-status | Liveness + ingest watermarks/drift |
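Verifying the HMAC signature on `POST /v1/ingest/webhook` could look like the sketch below. It assumes HMAC-SHA256 over the raw request body with a hex-encoded signature; the document does not specify the algorithm, encoding, or header name, so treat all three as placeholders. Replay protection (for example a signed timestamp) is not shown.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the hex HMAC-SHA256 of the raw body (the WP side would do the same).
function signBody(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Constant-time comparison of the received signature against the expected one.
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

const secret = "shared-secret-for-illustration";
const body = JSON.stringify({ event: "save_post", post_id: 123 });
const signature = signBody(body, secret);

console.log(verifyWebhook(body, signature, secret));       // true
console.log(verifyWebhook(body + " ", signature, secret)); // false: body changed
```

The signature must be computed over the exact bytes received, so the route needs access to the raw body before any JSON parsing.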
AI Models
| Role | Model | Price /MTok (in / out) | Why |
|---|---|---|---|
| Answer generation | claude-sonnet-5 | $3 / $15 (intro $2 / $10 until 31 Aug 2026) | Best grounding-instruction adherence per dollar. The static system prompt is prompt-cached, so it reads at ~0.1×. |
| Safety classifier | claude-haiku-4-5 | $1 / $5 | Pre-retrieval emergency / crisis / dosage routing, fixed signposting templates. ~650 tok/call. |
| MPA groundedness judge | claude-haiku-4-5 | $1 / $5 | Sentence-level entailment vs retrieved chunks + deterministic number+unit byte-match rule. |
| Escalation tier (optional) | claude-opus-4-8 | $5 / $25 | Only if eval shows Sonnet gaps on multi-article synthesis. Not in the base budget. |
| Embeddings | voyage-4 · 1024-dim | $0.06 /MTok (200M free tokens) | Anthropic's recommended partner. Strong on Islamic transliteration + medical vocabulary. Matryoshka → 512-dim later without re-embedding. Corpus embeds for $0 under the free tier. Fallback: OpenAI text-embedding-3-small behind a provider-agnostic interface. |
| Reranker | bge-reranker-v2-m3 (self-host) or Voyage rerank | ≈ $0 | Its calibrated score is the "not covered" gate. |
Token budget per query (single-turn)
| Component | Tokens | Notes |
|---|---|---|
| System prompt (grounding rules, placeholder protocol, tenant config) | ~1,300 | Prompt-cached — ~0.1× after first request per 5-min window |
| Retrieved context (6–8 chunks, max 2–3/article) | ~3,000 | Fresh input |
| Scripture metadata + question + formatting | ~250 | Fresh input |
| Total input | ~4,550 | of which ~1,300 cached |
| Output | ~450 | Placeholders keep scripture out of output tokens |
| Query embedding (voyage-4) | ~30 | ≈ $0.000002 — noise |
| Multi-turn follow-ups (phase 2) | +30–60% input | History rides in messages; cache absorbs most of it |
Per-Query Price
| Generation model | Per query | 5k q/mo | 10k q/mo | 30k q/mo |
|---|---|---|---|---|
| claude-haiku-4-5 | $0.0056 | $28 | $56 | $168 |
| claude-sonnet-5 (intro, → Aug 2026) | $0.0112 | $56 | $112 | $336 |
| claude-sonnet-5 (standard) | $0.0167 | $84 | $167 | $501 |
| claude-opus-4-8 | $0.0279 | $140 | $279 | $837 |
Basis: ~4,550 input (1,300 cached at 0.1×) + ~450 output. Add-ons: MPA safety rails ≈ +$0.003/query on the medical tenant only; multi-turn ≈ +40% input. Three dampeners stack: prompt caching (30–50% of input spend), exact-match Redis response cache (repeat queries → $0), and the "not covered" gate (gated queries skip the LLM, ~$0.0002).
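The per-query arithmetic can be checked with a few lines. This sketch applies the stated basis directly (3,250 fresh input tokens, 1,300 cached at 0.1×, 450 output) and ignores the one-off cache-write surcharge; computed this way the results land within roughly 1–2% of the table's rounded figures.

```typescript
interface ModelPrice {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

const PRICES: Record<string, ModelPrice> = {
  "claude-haiku-4-5": { inputPerMTok: 1, outputPerMTok: 5 },
  "claude-sonnet-5 (intro)": { inputPerMTok: 2, outputPerMTok: 10 },
  "claude-sonnet-5 (standard)": { inputPerMTok: 3, outputPerMTok: 15 },
  "claude-opus-4-8": { inputPerMTok: 5, outputPerMTok: 25 },
};

const TOTAL_INPUT = 4550;
const CACHED_INPUT = 1300;
const OUTPUT = 450;
const CACHE_READ_MULTIPLIER = 0.1;

// Cost of one single-turn query with a warm prompt cache.
function perQueryUsd(price: ModelPrice): number {
  const freshInput = TOTAL_INPUT - CACHED_INPUT;
  const billedInput = freshInput + CACHED_INPUT * CACHE_READ_MULTIPLIER;
  return (billedInput * price.inputPerMTok + OUTPUT * price.outputPerMTok) / 1_000_000;
}

Object.keys(PRICES).forEach((model) => {
  const price = PRICES[model];
  if (price === undefined) return;
  const usd = perQueryUsd(price);
  console.log(`${model}: $${usd.toFixed(4)} per query, ~$${(usd * 10_000).toFixed(0)} at 10k q/mo`);
});
```

For Sonnet 5 at standard pricing the terms are 3,380 billed input tokens × $3 plus 450 output tokens × $15, all per million, which comes to about $0.0169.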
| One-time item | Cost | Notes |
|---|---|---|
| Full-corpus embedding — ~24M tokens incl. overlap | $0 – $1.43 | voyage-4 free tier → $0; list $1.28–1.43 |
| Full re-embed (model or chunking change) | < $1.50 | Sub-hour, sub-$2 — never a reason to defer a fix |
Growth Tier & Expected Monthly Costs
The Growth tier buys three things the Lean tier (~$31/mo, single instance) doesn't have: a redundant API pair behind a load balancer, an isolated staging environment for safe releases, and bigger DB compute to keep the HNSW index fully in RAM as the corpus grows toward 170k articles.
Fig. 6 — Growth-tier topology, ~$119/mo infrastructure. Every component EU-resident.
Environments & promotion flow (staging areas)
Fig. 7 — Three staging areas. Dev costs nothing (local + free Neon branches); staging idles to near-zero when unused; prod carries the redundancy.
Growth-tier infrastructure — line items
| Component | Plan · Region | $/mo |
|---|---|---|
| Prod database — Postgres 16 + pgvector, both tenant schemas, PITR, failover | Supabase Pro + Small compute · Frankfurt | 30 |
| Staging database — full corpus copy, scale-to-zero | Neon Launch · Frankfurt | ~8 |
| Prod API — redundant pair, zero-downtime deploys, autoscaling | 2× Render Standard · Frankfurt | 50 |
| Prod sync worker | Render Starter · Frankfurt | 7 |
| Staging API + worker | 2× Render Starter · Frankfurt | 14 |
| Cache + rate-limit state | Upstash Redis fixed 250MB · eu-central-1 | 10 |
| CDN / WAF / TLS · uptime · errors · tracing | Cloudflare Free · UptimeRobot · Sentry free · Langfuse self-host | 0 |
| Infrastructure total | | ~119 |
Development-phase AI spend (one-time, during the ~6-week build)
| Item | Est. cost | Notes |
|---|---|---|
| Corpus embedding + ~5 re-embeds during chunking iteration | $0 – $8 | voyage-4 200M free tokens absorb ~9 full re-embeds; list price shown |
| Retrieval calibration + golden-set iterations (~4–5k Sonnet queries) | $60 – $90 | Threshold tuning on ~200 labeled queries/tenant × iterations |
| CI regression runs during build (~100 runs × ~50-query suite) | $60 – $90 | Mix of Haiku (rails) and Sonnet (generation) calls |
| Safety-rail tuning (Haiku classifier + entailment judge, ~5k calls) | ~$5 | Fixed-template routing verified against sensitive probes |
| AI-assisted development tooling (Claude Code, build window) | $200 – 400 | The lever behind the 6-week timeline; ~2 months of a Max-tier plan or metered API equivalent |
| Total development-phase AI spend | ~$325 – 590 | One-time; separate from the build fee |
Expected all-in monthly — Growth tier
| Line | 10k q/mo | 30k q/mo | 60k q/mo |
|---|---|---|---|
| Infrastructure (above) | $119 | $119 | ~$135 (DB compute step-up) |
| Prod generation — Sonnet 5 std, incl. MPA rails, ~25% cache/gate savings | ~$175 | ~$520 | ~$1,040 |
| Staging LLM — smoke + regression traffic | ~$15 | ~$15 | ~$20 |
| Embedding — incremental re-index traffic | ~$0 | ~$0 | ~$1 |
| Total expected monthly | ~$309 | ~$654 | ~$1,196 |
At Sonnet 5 intro pricing (through Aug 2026), subtract ~33% from the generation line. The per-tenant daily spend circuit breaker converts these projections into a contractual ceiling: when tripped, the tenant degrades to retrieval-only "closest articles" mode instead of overspending. Lean-tier reference for comparison: ~$31/mo infra, single instance, schemas-as-staging — appropriate until launch traffic is proven.
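The spend circuit breaker reduces to a per-tenant, per-day counter compared against a cap. The sketch below keeps state in memory for illustration only; a deployed version would presumably hold it in the shared Redis so both API instances see the same total. Class and method names, the tenant key, and the cap value are hypothetical.

```typescript
type ServingMode = "full" | "retrieval_only";

class SpendBreaker {
  private readonly dailyCapUsd: Record<string, number>;
  private readonly spent: Map<string, number> = new Map<string, number>();

  constructor(dailyCapUsd: Record<string, number>) {
    this.dailyCapUsd = dailyCapUsd;
  }

  // Counter key is tenant + UTC calendar day, so totals reset at midnight UTC.
  private key(tenant: string, now: Date): string {
    return `${tenant}:${now.toISOString().slice(0, 10)}`;
  }

  // Record the metered cost of one completed request.
  record(tenant: string, usd: number, now: Date = new Date()): void {
    const k = this.key(tenant, now);
    this.spent.set(k, (this.spent.get(k) ?? 0) + usd);
  }

  // Decide how to serve the next request for this tenant.
  mode(tenant: string, now: Date = new Date()): ServingMode {
    const cap = this.dailyCapUsd[tenant];
    if (cap === undefined) return "retrieval_only"; // unknown tenant: fail closed
    const total = this.spent.get(this.key(tenant, now)) ?? 0;
    return total >= cap ? "retrieval_only" : "full";
  }
}

const breaker = new SpendBreaker({ mpa: 20 }); // hypothetical $20/day cap
const today = new Date("2026-07-01T10:00:00Z");
breaker.record("mpa", 19.99, today);
console.log(breaker.mode("mpa", today)); // still "full"
breaker.record("mpa", 0.02, today);
console.log(breaker.mode("mpa", today)); // "retrieval_only"
```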
Risk Register
Every risk carries a concrete, testable defense. The four criticals are release-blocking.
Defense — scripture never passes through the model: ingest-time extraction (quote + preceding reference paragraph), [[SCRIPTURE:id]] placeholders, server-side byte substitution, exact-hash output validation, CI byte-equality suite that blocks deploy on mismatch.
Defense — 15-min modified_after poller with content-hash gating, explicit publish/unpublish/trash handling with cascade deletes, nightly full reconciliation; webhook for latency, polling as source of truth. Live edits indexed < 20 min.
Defense — deterministic gate on the reranker score, calibrated per tenant on ~200 labeled queries; below threshold the LLM is never called. Second gate: structured insufficient_context flag. Adversarial off-topic eval set as a release-blocking metric.
Defense — separate schemas with independent indexes (isolation by construction), RLS backstop, tenant resolved server-side from site key + Origin, per-tenant system prompts, CI cross-tenant probe (100 cross-domain queries → zero foreign post_ids).
Defense — ingestion gate: published-only, shortcode stripper, junk denylist, block-boundary chunking, per-article top-k cap (max 2–3 chunks via DISTINCT ON / MMR), chunk-count assertion.
Defense — pre-launch DPIA, consent capture in the widget, EU residency end-to-end, pseudonymized session-scoped logs (30–90-day raw-query retention), PII scrub before embedding/LLM calls, self-hosted tracing.
Defense — Cloudflare in front; per-tenant CORS allowlist; per-IP token bucket (10 req/min) + concurrent-SSE caps; 1,000-char input cap; per-tenant daily spend circuit breaker → retrieval-only mode; Turnstile after ~10 req/session; retrieved chunks framed as untrusted data, zero tools on the generation call; DTO validation that citations come from the retrieved set.
Defense — /v1 versioning with additive-only DTO evolution; tolerant-reader widget; cache-busting embed snippet; WP page-cache exclusion rules documented per site.
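Among the abuse defenses above, the per-IP token bucket (10 req/min) is the easiest to illustrate. The sketch below is an in-memory, single-bucket version with an injected clock so it can be tested deterministically; the class name is hypothetical, and the design keeps rate-limit state in Redis rather than in process memory.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerMs: number;
  private tokens: number;
  private lastMs: number;

  constructor(capacity: number, refillPerMinute: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60_000;
    this.tokens = capacity; // start full so a new client can burst
    this.lastMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  tryTake(nowMs: number): boolean {
    const elapsed = Math.max(0, nowMs - this.lastMs);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per client IP: burst of 10, refilling at 10 tokens per minute.
const bucket = new TokenBucket(10, 10, 0);
let allowed = 0;
for (let i = 0; i < 12; i++) {
  if (bucket.tryTake(0)) allowed++;
}
console.log(allowed);              // 10 of the 12 instant requests pass
console.log(bucket.tryTake(7000)); // true: about one token refilled after 7 s
```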
Compliance, Safety & QA Gates
Medical tenant (MPA) rails
- Pre-retrieval classifier (regex + Haiku 4.5): routes to emergency | crisis | dosage_seeking | normal — the first three get fixed, clinician-approved signposting templates; the LLM is never invoked.
- Extractive-first answers — "what our articles say", verbatim passages with attribution; never personalised advice, never diagnosis.
- Groundedness rail — sentence-level entailment (Haiku judge) + deterministic rule: any number+unit span (mg, ml, doses, weeks) must appear byte-identical in a retrieved chunk.
- Server-appended disclaimer — non-nullable DTO field; wording owned by the client's clinical/legal review; impossible for the model to omit.
- Buffered release — MPA answers stream only after all rails pass.
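The deterministic half of the groundedness rail can be sketched as a substring check over number+unit spans. The unit list below extends the four units named above (mg, ml, doses, weeks) with a few more as an assumption, the function name is hypothetical, and the sample strings are invented for illustration rather than taken from any article.

```typescript
// Number followed by a unit; the unit list is an illustrative assumption.
const NUMBER_UNIT = /\d+(?:[.,]\d+)?\s?(?:mg|mcg|ml|g|kg|doses?|tablets?|hours?|days?|weeks?|months?)\b/gi;

// Return every number+unit span in the answer that does not appear,
// character for character, in at least one retrieved chunk.
function ungroundedSpans(answer: string, retrievedChunks: string[]): string[] {
  const spans = answer.match(NUMBER_UNIT) ?? [];
  return spans.filter((span) => !retrievedChunks.some((chunk) => chunk.includes(span)));
}

const chunks = [
  "The article describes 500 mg taken every 12 hours.",
  "Symptoms usually settle within two weeks.",
];
const answer = "The article mentions 500 mg, with symptoms settling within 2 weeks.";

// "2 weeks" is flagged because the source spells it "two weeks".
console.log(ungroundedSpans(answer, chunks));
```

Any non-empty result would block release of the buffered answer. The check is deliberately strict: a reformatted figure such as "500mg" versus "500 mg" also fails, which errs on the side of the fallback.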
GDPR package (pre-launch)
- DPIA covering Article 9 health-query processing
- Consent capture in the widget; privacy-policy addendum
- EU residency: DB + API + inference path; SCC-backed Anthropic DPA, no-training confirmed
- Pseudonymized logs, 30–90-day raw-query retention, PII scrub pre-embedding
Release-blocking QA gates (CI, run on staging)
| Gate | Pass condition |
|---|---|
| Scripture byte-equality | 100% exact match on sampled scripture articles — zero tolerance |
| Golden query set (per tenant) | Recall@8 ≥ target on client-provided questions (15–20/site: simple · overlap · out-of-scope) |
| Adversarial off-topic set | "Not covered" precision ≥ target; zero confident answers on out-of-domain probes |
| Cross-tenant probe | 100 cross-domain queries → zero foreign post_ids |
| MPA sensitive-query suite | Disclaimer on 100%; emergency templates fire on crisis probes |
| Chunk-count assertion | Ingested chunk count lands within tolerance of the sized ~59.4k chunks |
Observability
- Langfuse (self-hosted): per-request trace — retrieval scores, gate decision, tokens, cost — keyed by requestId
- Sentry (errors) · UptimeRobot (uptime) · structured per-tenant/day cost logging feeding the spend circuit breaker
Roadmap — AI-Accelerated Build
Six weeks to full launch with AI-assisted development (a conventional team would quote 10–12 weeks). The compression comes from generating the ingestion pipeline, DTO layer, and eval harness against the real exports already in hand, so there is no waiting on sample data.
Fig. 8 — Build timeline. Overlapping phases run as parallel tracks; the two amber bars are client-gated, not engineering-gated.
| Phase | Deliverables | Exit criterion |
|---|---|---|
| P0 · Discovery | Blocker sign-offs: disclaimer ownership (D1), public vs logged-in (C1), API account ownership (E1); Zaheer technical briefing; WP REST + app-password access confirmed | All blockers resolved in writing |
| P1 · Ingestion & storage | Tenant schemas migrated; cleaning pipeline (shortcodes, junk, published-only); block-aware chunker; scripture extraction; full bulk load + embeddings | Chunk count ≈ 59.4k; scripture table covers 7,898 WC articles; spot-audit passes |
| P2 · Retrieval & eval | Hybrid RRF query; reranker; answerability threshold calibrated on ~200 labeled queries/tenant; golden-query CI harness | Recall@8 target on golden set; "not covered" fires correctly on out-of-domain probes |
| P3 · Generation & rails | Fastify /v1 API with zod DTOs; Sonnet 5 orchestration with cached system prompt; scripture substitution + hash validator; MPA classifier, entailment rail, server disclaimer | Scripture byte-equality suite green; MPA sensitive-query suite green |
| P4 · Widget, SSE & sync | Typed-event SSE; embeddable widget (fetch + ReadableStream); WP delta endpoint + HMAC webhook (with Zaheer); poller + nightly reconciliation; abuse stack; staging environment | Live edit on WP reflected in index < 20 min; abuse limits verified under load |
| P5 · QA & launch | Full acceptance run against client test questions (H1); DPIA + consent UI; Wise Compass launch first; MPA launches after clinical sign-off of disclaimer + blocklist | All six CI gates green; client sign-off per tenant |
Investment & Quote
One fixed price for the complete, launched system — both knowledge-base assistants, live on your sites. No hourly billing, no surprises.
Fixed-price build · one-time
$9,750
Design, build, testing, and launch of both assistants — delivered end to end.
⏱ 6–7 weeks to launch
Everything included
- Two separate, isolated assistants — Wise Compass & My Patient Advice
- Answers drawn only from your published articles, each linked to its source
- Quran & Hadith reproduced word-for-word, never paraphrased
- Medical disclaimer & safety guardrails on the health assistant
- Embeddable chat widget for both WordPress sites
- Automatic sync — new and edited articles picked up continuously
- EU-hosted for GDPR · staging & production · acceptance-tested against your own questions
Fixed-price scope. Optional later enhancements — multi-turn conversation, Arabic-query tuning, high-availability — are quoted separately as Phase 2.
Payment milestones
- Kick-off, discovery sign-off, and secure ingestion of both knowledge bases into the search index.
- The assistant answers live from your content with correct source links, demonstrated to you before release.
- All quality & safety gates green (scripture accuracy, medical disclaimer, "not covered" handling) and live on both sites.
After launch — running costs
| Item | Basis | Typical |
|---|---|---|
| Cloud infrastructure — EU-hosted database, API & sync | Fixed monthly, accounts in your name | ~$119/mo |
| AI usage — answering questions | Pay-as-you-go, ~$0.01–0.02 per question, with a spend cap you set | usage-based |
| Support & maintenance | Optional retainer — monitoring, updates, priority fixes | from $300/mo |
Open Items
Must resolve before build
- Blocker Corpus discrepancy — brief claims 100k+/70k+; real exports hold 8.7k/34.5k (19,977 MPA published). All costs here use the real corpus; confirm whether exports are partial.
- Blocker D1 — medical disclaimer wording from the client's clinical/legal review; we inject it, we don't author it.
- Blocker C1 — public vs logged-in per site: drives abuse budget and MPA liability posture.
- Blocker E1 — API account ownership (Anthropic + Voyage): client-owned or agency-managed with pass-through billing.
- High B1/B2 — WP REST access + webhook: app passwords on both sites; Zaheer implements the delta endpoint and save hook.
- High H1 — test questions: 15–20/site incl. 3–5 verbatim-scripture and 3–5 sensitive-medical probes — the backbone of the QA gates.
- High D4 — MPA topic blocklist (dosages, mental-health crisis, emergencies) needs clinical sign-off.
Working assumptions
- WordPress remains the source of truth; the RAG store holds embeddings + metadata + verbatim scripture only.
- English-dominant queries at launch; Arabic input works via the multilingual embedder but is not quality-guaranteed until eval'd (C7).
- Single-turn Q&A at launch; multi-turn is a phase-2 flag already present in the DTO.
- Scales to 170k articles / ~350k vectors with no structural change — same HNSW, bigger DB compute.
- Prices are July 2026 USD list ex-VAT; Sonnet 5 intro pricing expires 31 Aug 2026 — both figures quoted throughout.
Deliberately excluded
- Dedicated vector DB (Pinecone/Weaviate) — $600–1,200/yr for a network hop and a dual-write problem at this scale.
- Fine-tuning — grounding quality here is a retrieval problem, not a model problem.
- True multi-AZ Postgres HA — deferred until traffic justifies ~$180+/mo.