Your documents. Searchable. Cited. POPIA-safe.
A focused engagement that turns your scattered knowledge — Confluence, SharePoint, PDFs, contracts, runbooks, policy decks — into a private retrieval-augmented generation system that answers questions in plain language with verifiable citations back to the source. Four to six weeks. Built on the same RAG architecture that powers sonofgraig's enterprise platform.
What ships at the end of week six.
A focused RAG engagement only succeeds if the retrieval is good. Most failed enterprise RAG projects skip evaluation, ingest the wrong documents, or never run the PII scrubber. We tackle all three by making them deliverables — not afterthoughts.
Six stages from raw doc to cited answer.
The diagram below traces a single document from your source system to a query response. Each stage adds either value (chunking, embedding, retrieval) or a guarantee (PII scrubbing, audit logging). Nothing in the pipeline is opaque — every stage logs metrics that show up in your eval dashboard.
{org_id}_{kb_id}. AES-256 at rest.
Connect once. Sync forever.
Source ingestion runs on Airbyte, the same open-source data integration platform our enterprise platform uses. Set up the connection once, choose a sync schedule, and the index stays fresh. The standard engagement covers up to four source connections; additional sources are quoted at R8,000 each.
Standard engagement: up to 4 source connections. Additional sources +R8,000 each. Custom REST connectors with non-trivial pagination or auth flows are scoped per case.
Source-cited answers, not hallucinated narratives.
A working RAG system is judged by its citations as much as its answers. The mock query below shows what a typical response looks like in your knowledge base — an answer in plain language, three retrieved chunks ranked by similarity, and links back to the source documents. The same shape ships in the embeddable chat widget.
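As a rough illustration of the response shape described above, the sketch below models an answer together with its ranked chunks and source links. Every class and field name here (`CitedAnswer`, `RetrievedChunk`, `source_url`, and so on) is hypothetical and is not taken from the sonofgraig API; the default of three citations simply mirrors the mock query.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RetrievedChunk:
    """One retrieved passage plus the link back to its source (hypothetical fields)."""
    text: str
    source_title: str
    source_url: str
    similarity: float  # higher means closer to the query


@dataclass
class CitedAnswer:
    """Plain-language answer with the chunks that support it."""
    answer: str
    chunks: List[RetrievedChunk] = field(default_factory=list)

    def top_citations(self, k: int = 3) -> List[RetrievedChunk]:
        # Rank by similarity, best first, and keep the top k.
        return sorted(self.chunks, key=lambda c: c.similarity, reverse=True)[:k]

    def render(self, k: int = 3) -> str:
        # Number each citation so a reader can trace the answer to its sources.
        lines = [self.answer, ""]
        for i, chunk in enumerate(self.top_citations(k), start=1):
            lines.append(
                f"[{i}] {chunk.source_title} ({chunk.similarity:.2f}) {chunk.source_url}"
            )
        return "\n".join(lines)
```

Keeping the similarity score and source link on every chunk is what lets a reviewer check a claim against the original document rather than trusting the generated text.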
Four phases. Four to six weeks.
Every phase ships a tangible deliverable. Nothing is left for "later". The cadence below is the standard plan; if your document set is unusually large or your source systems require new connectors, the build phase can extend by up to a week.
Exactly what's in. Exactly what's not.
Fixed-scope means we have to be explicit about boundaries. The lists below are the standard inclusions and exclusions for the R55,000 starting price. Anything on the exclusions list can be quoted as a separate engagement, or rolled into a sonofgraig platform subscription on conversion.
- Discovery workshop, document inventory, classification, POPIA risk register
- Up to 4 source connections via Airbyte (Drive, Confluence, SharePoint, Notion, etc.)
- Up to 5,000 documents ingested and indexed
- PII scrubber configured and tested for SA personal-information patterns
- Chunking strategy selection & embedding-model selection
- Per-tenant Qdrant collection deployed in af-south-1
- Hybrid retrieval (semantic + BM25) with source-cited answers
- Document-level access control mapped to your user groups
- Ragas evaluation report — 4 metrics, agreed thresholds
- Production query API with rate limiting
- Embeddable chat widget — styled, system prompt configured
- Immutable query-level audit log with PostgreSQL deletion rules
- POPIA Section 19 evidence pack and runbooks
- Two knowledge-transfer sessions with your team
- 30 days of priority support from go-live
Not included in the standard price:
- Custom AI agent built on top of the knowledge base — AI Agent Implementation
- Multiple knowledge bases — +R20K each
- Source systems beyond the standard four — +R8K each
- Document volumes above 5,000 — quoted at scoping based on average size
- Custom UI surface beyond the embeddable chat widget
- Domain-specific embedding-model fine-tuning — Fine-Tuning Ops product
- Source-system schema changes or upstream data engineering work
- Long-running operational support beyond the 30-day window
- LLM token costs — metered to your provider account
- Information Officer outsourcing — you retain that role
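The hybrid retrieval item in the inclusions combines a semantic ranking with a BM25 ranking. This page does not say how the two rankings are merged; the sketch below uses reciprocal rank fusion (RRF), a common choice, purely as an assumption. The function name, the constant `k=60`, and the document IDs are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several best-first rankings of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the rankings it appears in,
    with rank starting at 1. RRF is an assumed fusion method, not one named
    by the source page.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Usage: fuse a semantic ranking with a BM25 ranking of the same corpus.
semantic_ranking = ["a", "b", "c"]
bm25_ranking = ["b", "d", "a"]
fused = reciprocal_rank_fusion([semantic_ranking, bm25_ranking])
```

Because RRF works on ranks rather than raw scores, it avoids having to normalise cosine similarities against BM25 scores, which live on different scales.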
Numbers, not adjectives. Ragas, on a curve.
Retrieval quality is the deliverable. We agree thresholds during scoping, run the test query set after every change to the index, and ship the eval report on go-live. The four metrics below are the ones we always track. The bar fills shown are typical — your real numbers depend on your corpus and will be reported with confidence intervals.
Bar values shown are illustrative typical results. Acceptance thresholds are agreed in writing during scoping — not after delivery.
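One way to picture the acceptance check is a gate that compares each metric's mean over the test query set with its agreed threshold and attaches a confidence interval. The sketch below is a minimal version under stated assumptions: it takes per-query scores as plain floats (however they were produced), uses a percentile bootstrap for the interval, and the metric names in the usage example are placeholders, since this page does not name the four metrics.

```python
import random
import statistics
from typing import Dict, List, Tuple


def bootstrap_ci(scores: List[float], n_resamples: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of per-query scores."""
    rng = random.Random(seed)  # fixed seed so repeated runs report the same interval
    means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def evaluate_gate(per_query: Dict[str, List[float]],
                  thresholds: Dict[str, float]) -> Dict[str, dict]:
    """Compare each metric's mean against its agreed threshold."""
    report = {}
    for metric, threshold in thresholds.items():
        scores = per_query[metric]
        mean = statistics.fmean(scores)
        report[metric] = {
            "mean": mean,
            "ci95": bootstrap_ci(scores),
            "threshold": threshold,
            "passed": mean >= threshold,
        }
    return report


# Usage with made-up scores and thresholds (metric names are placeholders).
report = evaluate_gate(
    {"faithfulness": [0.9, 0.8, 1.0, 0.7], "context_recall": [0.5, 0.6, 0.4, 0.5]},
    {"faithfulness": 0.8, "context_recall": 0.7},
)
```

Running a gate like this after every index change turns the agreed thresholds into a repeatable pass/fail check instead of a one-off report.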
Compliance designed in — not bolted on later.
POPIA-compliant RAG is not significantly more complex to build than a non-compliant version — provided compliance is treated as a design constraint from week one. The four cards below are the controls every engagement ships with, and the evidence pack you receive at handover.
Independent reference: sonofgraig publishes a complete POPIA compliance statement and a long-form engineering essay on POPIA-compliant RAG. The architecture in this engagement matches both documents — not a simplified version of either.
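To make the PII-scrubbing control concrete, here is a minimal sketch for one South African pattern: the 13-digit national ID number, which starts with a YYMMDD birth date and ends with a Luhn check digit. It is an illustration under those assumptions, not sonofgraig's scrubber; the function names and the placeholder string are hypothetical, and a production scrubber would cover many more patterns (phone numbers, addresses, names, and so on).

```python
import re

# 13 consecutive digits that are not part of a longer digit run.
_CANDIDATE = re.compile(r"(?<!\d)\d{13}(?!\d)")


def _luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _plausible_birth_date(digits: str) -> bool:
    """Loose check on the leading YYMMDD: month 01-12, day 01-31."""
    month, day = int(digits[2:4]), int(digits[4:6])
    return 1 <= month <= 12 and 1 <= day <= 31


def scrub_sa_id_numbers(text: str, placeholder: str = "[REDACTED_SA_ID]") -> str:
    """Replace likely South African ID numbers with a placeholder."""
    def _replace(match):
        candidate = match.group(0)
        if _plausible_birth_date(candidate) and _luhn_ok(candidate):
            return placeholder
        return candidate  # leave other 13-digit numbers (order refs, etc.) alone
    return _CANDIDATE.sub(_replace, text)
```

Checking the date and checksum before redacting keeps false positives down, so ordinary 13-digit reference numbers in contracts or runbooks are less likely to be masked by mistake.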
An opinionated stack. Open source where it matters.
We do not invent the engine; we invest where the value is. Every component below is production-grade open source or a SaaS service we consciously chose not to rebuild. You inherit the same engineering decisions our enterprise platform was built on — and you keep the source code.
One number. No hourly surprises.
sonofgraig service projects are deliberately simple to procure. The price is the price. Scope is fixed before contracting. Variations are quoted in writing and signed before any additional work is performed.
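The add-on figures quoted on this page (R55,000 starting price, +R8,000 per source beyond four, +R20K per additional knowledge base) can be combined into an indicative figure. The sketch below is arithmetic only and not a quote: it assumes the starting price covers exactly one knowledge base, it flags document volumes above 5,000 because those are quoted at scoping, and it does not model custom REST connectors, which are scoped per case. All names are hypothetical.

```python
from dataclasses import dataclass

BASE_PRICE_ZAR = 55_000      # standard engagement starting price
INCLUDED_SOURCES = 4         # source connections covered by the starting price
EXTRA_SOURCE_ZAR = 8_000     # per additional source connection
EXTRA_KB_ZAR = 20_000        # per additional knowledge base (assumes one is included)
INCLUDED_DOCUMENTS = 5_000   # documents covered by the starting price


@dataclass
class Estimate:
    total_zar: int
    needs_volume_quote: bool  # True when document volume must be quoted at scoping


def estimate_price(sources: int, knowledge_bases: int = 1, documents: int = 0) -> Estimate:
    """Indicative price from the published add-on figures; not a binding quote."""
    extra_sources = max(0, sources - INCLUDED_SOURCES)
    extra_kbs = max(0, knowledge_bases - 1)
    total = BASE_PRICE_ZAR + extra_sources * EXTRA_SOURCE_ZAR + extra_kbs * EXTRA_KB_ZAR
    return Estimate(total_zar=total, needs_volume_quote=documents > INCLUDED_DOCUMENTS)
```

For example, six source connections and two knowledge bases come to 55,000 + 2 × 8,000 + 20,000 = R91,000 before any volume-based adjustment.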
Questions procurement, legal and engineering ask.
If your team is preparing for a vendor review or a board sign-off, the answers below cover most of what gets raised. Anything else, your account team can route to engineering directly.
What is the difference between this and AI Agent Implementation?
Is the R55,000 fixed, or just a starting figure?
What happens to LLM and embedding token costs?
Where does our data physically live during the engagement?
Can users only see documents they are authorised to see?
What chunking strategy do you use?
Which embedding model do you use?
text-embedding-3-small (OpenAI) is the cost-efficient default, with strong general performance. A locally hosted open embedding model (e.g. BGE) is used when data residency or cost demands it; this runs entirely inside af-south-1 with zero external API calls. The choice is made during scoping with the residency profile of your data in mind.
What does the 30-day post-delivery support cover?
Who owns the source code at handover?
Do you sign Data Processing Agreements?
Is sonofgraig B-BBEE certified and CIPC registered?
Book a 30-minute scoping call.
A senior solutions engineer joins the call. Together we step through your document sources and target use case, check whether they fit the standard scope, and confirm your final fixed price. No commitment until contract signature.