Hybrid Query Benchmarks¶

This benchmark focuses on the query shape that matters most for NornicDB's positioning: semantic retrieval followed by graph expansion in the same engine.

The goal is not to claim a universal leaderboard result. The goal is to show what happens when vector search and one-hop graph traversal share one execution path instead of being stitched together across multiple systems.

Summary¶

Vector-only queries stayed in sub-millisecond to low-millisecond territory locally, depending on transport.
Vector + one-hop graph traversal added a small incremental cost locally.
Remote latency tracked client-to-server RTT, which means end-to-end latency became network-bound rather than database-bound.

Test Setup¶

Item	Value
Nodes	67,280
Edges	40,921
Embeddings	67,298
Vector index	HNSW, CPU-only
Request count	800 per query type
Query types	Vector top-k; Vector top-k + 1-hop traversal

Local environment:

Apple M3 Max
64 GB RAM
Native macOS installer

Remote environment:

GCP
8 vCPU
32 GB RAM

Local Results¶

Workload	Transport	Throughput	Mean	P50	P95	P99	Max	Allocs/op
Vector only	HTTP	19,342 req/s	511 us	470 us	750 us	869 us	1.02 ms	138,031
Vector only	Bolt	22,309 req/s	444 us	428 us	629 us	814 us	968 us	206,710
Vector + 1 hop	HTTP	11,523 req/s	859 us	699 us	1.54 ms	3.46 ms	4.71 ms	123,352
Vector + 1 hop	Bolt	7,977 req/s	1.24 ms	1.10 ms	1.97 ms	4.91 ms	6.14 ms	181,790

Remote Results¶

Client-to-server latency was about 110 ms.

Workload	Environment	P50
Vector only	Remote GCP	110.7 ms
Vector + 1 hop	Remote GCP	112.9 ms

The practical result is straightforward: once local compute for hybrid retrieval is in low single-digit milliseconds, network RTT dominates the user-visible latency budget.

Why This Matters¶

Most systems make this query shape a composition problem:

embed the query
call a vector store
move the results into a graph store or application layer
expand neighbors and shape the result there

NornicDB keeps that inside one execution engine. The benchmark does not prove every workload is constant-time, but it does show that shallow hybrid retrieval can stay tight enough locally that deployment topology matters more than extra database-side micro-optimizations.

Caveats¶

These are single-node measurements.
The dataset is not billion-scale.
Remote throughput is latency-bound, not compute-bound.
These numbers are useful for query-shape comparison, not as a blanket claim for every graph or vector workload.

Verification Queries¶

Vector-only:

curl -s -u "$NORNIC_USERNAME:$NORNIC_PASSWORD" "$ENDPOINT" \
  -H "Content-Type: application/json" -H "Accept: application/json" \
  -d '{
    "statements":[
      {
        "statement":"CALL db.index.vector.queryNodes('\''idx_original_text'\'', $topK, $text) YIELD node, score RETURN node.originalText AS originalText, score ORDER BY score DESC LIMIT $topK",
        "parameters":{"text":"get it delivered","topK":5},
        "resultDataContents":["row"]
      }
    ]
  }'

Vector + one-hop graph traversal:

curl -s -u "$NORNIC_USERNAME:$NORNIC_PASSWORD" "$ENDPOINT" \
  -H "Content-Type: application/json" -H "Accept: application/json" \
  -d '{
    "statements":[
      {
        "statement":"CALL db.index.vector.queryNodes('\''idx_original_text'\'', $topK, $text) YIELD node, score MATCH (node:OriginalText)-[:TRANSLATES_TO]->(t:TranslatedText) WHERE t.language = $targetLang RETURN node.originalText AS originalText, score, t.language AS language, coalesce(t.auditedText, t.translatedText) AS translatedText ORDER BY score DESC, language LIMIT $topK",
        "parameters":{"text":"get it delivered","topK":5,"targetLang":"es"},
        "resultDataContents":["row"]
      }
    ]
  }'