Cypher RAG Procedures

NornicDB exposes seam-aligned Cypher procedures for in-query RAG orchestration:

CALL db.retrieve({query: '...', limit: 10, ...})
CALL db.rretrieve({query: '...', limit: 10, ...})
CALL db.rerank({query: '...', candidates: [...], rerankTopK: 50, rerankMinScore: 0.0})
CALL db.index.vector.embed('...') YIELD embedding
CALL db.infer({prompt: '...', max_tokens: 256, ...})

These procedures are read-only and designed to map directly to internal contracts:

Retrieval and rerank use the existing search.Service and SearchOptions.
Inference uses the existing Heimdall manager Generate/Chat contracts.

Procedure behavior

db.retrieve
Uses existing hybrid search behavior.
Reranking is optional; it follows the request options and otherwise the configured defaults.
db.rretrieve
Shorthand retrieve path for simple usage.
Automatically enables rerank only when a reranker is configured and available.
Useful when you want one-call behavior while keeping db.retrieve + db.rerank available for explicit before/after comparisons.
db.rerank
Matches Stage-2 rerank API directly (does not run retrieval).
Requires caller-provided candidate rows (for example from db.retrieve).
Falls back to pass-through ranking when no reranker is configured or available.
Use rerankTopK / rerankMinScore to tune rerank behavior.
db.index.vector.embed
Embeds a text string using the configured embedding service for the current database.
Returns a vector array via YIELD embedding.
This is useful for fully manual Cypher search pipelines.
db.infer caching behavior
The procedure itself does not cache; each call invokes the configured inference manager. Caching, when applicable, is the responsibility of the inference manager / model provider.
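
As a minimal sketch of the one-call path described above, the query below uses db.rretrieve in place of a separate db.retrieve and db.rerank pair. The YIELD columns (node, score) are an assumption carried over from db.retrieve; this page does not list the output columns of db.rretrieve, so check them against your build.

```cypher
// One-call retrieve; rerank is applied only if a reranker is configured and available.
// ASSUMPTION: db.rretrieve yields the same columns as db.retrieve (node, score).
CALL db.rretrieve({query: 'zero-trust architecture', limit: 10}) YIELD node, score
RETURN node, score
```

Running the same query text through db.retrieve followed by db.rerank gives the explicit before/after comparison mentioned for db.rretrieve.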

Example

// Retrieve the top hits, then summarize each node with db.infer.
CALL db.retrieve({query: 'zero-trust architecture', limit: 5}) YIELD node, score
WITH node, score
CALL db.infer({
  prompt: 'Summarize this node briefly: ' + coalesce(node.content, toString(node)),
  max_tokens: 120,
  temperature: 0.0
}) YIELD text
RETURN node, score, text

// Retrieve candidates, then rerank them explicitly with db.rerank.
CALL db.retrieve({query: 'zero-trust architecture', limit: 20}) YIELD node, score
WITH collect({id: node.id, content: coalesce(node.content, toString(node)), score: score}) AS candidates
CALL db.rerank({query: 'zero-trust architecture', candidates: candidates, rerankTopK: 20}) YIELD id, final_score
RETURN id, final_score

// Manual pipeline: embed the query text, then query the vector index directly.
CALL db.index.vector.embed('zero-trust architecture') YIELD embedding
CALL db.index.vector.queryNodes('doc_idx', 10, embedding) YIELD node, score
RETURN node, score

If you use db.index.vector.embed(), pass the returned embedding array into db.index.vector.queryNodes(..., embedding) (or an inline array equivalent) for explicit pipeline control.
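
For the precomputed-vector case, a sketch along these lines may help. It assumes the client supplies a query parameter named $embedding (a hypothetical name) whose length matches the dimension of the doc_idx index; the vector would have been produced outside this query, for example by an earlier db.index.vector.embed call.

```cypher
// ASSUMPTION: $embedding is a client-supplied list of floats with the same
// dimension as the 'doc_idx' vector index.
CALL db.index.vector.queryNodes('doc_idx', 10, $embedding) YIELD node, score
RETURN node, score
```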

// Standalone inference call.
CALL db.infer({prompt: 'Summarize: ...', temperature: 0.0}) YIELD text
RETURN text

Lightweight Risk Notes

Using LLM output as data in your own explicit downstream Cypher is supported.
If you intentionally combine model-generated query text with dynamic execution procedures (for example, dynamic APOC execution), treat that as a high-risk pattern and review it carefully.
Prefer parameterized query authoring and explicit mutation logic for sensitive write paths.
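
The supported pattern from the notes above, model output used as data with an explicit, parameterized write, might look like the following sketch. The :Document label, the content and summary properties, and the $docId parameter are hypothetical names chosen for illustration, and the sketch assumes a write clause may follow the read-only procedure call in the same statement.

```cypher
// ASSUMPTIONS: the :Document label, the content/summary properties, and the
// $docId parameter are hypothetical names used only for illustration.
MATCH (d:Document {id: $docId})
CALL db.infer({
  prompt: 'Summarize this document briefly: ' + coalesce(d.content, ''),
  max_tokens: 120,
  temperature: 0.0
}) YIELD text
WITH d, text
// The model output is stored as a plain string value; it is never executed as Cypher.
SET d.summary = text
RETURN d.id AS id, d.summary AS summary
```

Here the mutation is spelled out by the query author and the generated text only ever fills a property value, which keeps it clear of the dynamic-execution pattern flagged as high risk.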