Query Analysis & Cache Security¶
How NornicDB safely analyzes and caches Cypher queries.
Overview¶
NornicDB uses a QueryAnalyzer to extract metadata from Cypher queries for caching optimization. This document explains the security model, what protections exist, and edge cases.
Architecture¶
┌─────────────────────────────────────────────────────────────────────┐
│ Query Execution Flow │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Client │───▶│ QueryAnalyzer │───▶│ Cache Check │ │
│ │ Query │ │ (Metadata only) │ │ (Read-only?) │ │
│ └─────────────┘ └──────────────────┘ └──────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Query Executor │◀───│ Cache Hit/Miss │ │
│ │ (Full parsing) │ │ │ │
│ └──────────────────┘ └──────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Storage Engine │ │
│ │ (Data access) │ │
│ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Security Model¶
What QueryAnalyzer Does¶
The QueryAnalyzer performs lightweight keyword detection to determine:
- Is this query read-only? (cacheable)
- Is this query a write operation? (requires cache invalidation)
- What labels are affected? (for smart cache invalidation)
What QueryAnalyzer Does NOT Do¶
- NOT used for access control - permissions are checked separately
- NOT used for query validation - the executor validates syntax
- NOT a full parser - just keyword detection for performance
Threat Analysis¶
✅ Protected: Write Operations Hidden as Reads¶
Threat: An attacker crafts a query that deletes/modifies data but is marked as "read-only" by the analyzer.
Protection: This is NOT POSSIBLE because:
- Cypher keywords must be whole words with boundaries
- We search ALL occurrences of each keyword
- There's no way to "hide" a keyword in valid Cypher syntax
Example attacks that are blocked:
-- Hiding DELETE - NOT POSSIBLE
MATCH (n) WHERE n.name = 'safe' DELETE n
-- ✅ DELETE detected, marked as write
-- Splitting keywords - NOT POSSIBLE (invalid Cypher)
MATCH (n) DE/*comment*/LETE n
-- ✅ Invalid syntax, rejected by executor
-- Unicode tricks - NOT POSSIBLE
MATCH (n) DELETE n
-- ✅ Not ASCII "DELETE", not valid Cypher
⚡ Accepted: Read Operations Marked as Writes (False Positives)¶
Scenario: A read-only query is incorrectly marked as a write operation.
Impact: Performance only (not security) - Query is not cached - May trigger unnecessary cache invalidation - Query still executes correctly
When this happens:
| Scenario | Example Query | What Happens |
|---|---|---|
| Property named like keyword | RETURN n.delete | Marked as write |
| Keyword in string literal | WHERE name = 'DELETE me' | Marked as write |
| Keyword in comment | RETURN n // TODO: delete | Marked as write |
Why we accept this:
- Conservative is safer - Better to not cache than to cache incorrectly
- Rare occurrence - Few users name properties "delete" or "create"
- No security impact - Query executes correctly, just not cached
- Simple implementation - Complex parsing adds attack surface
✅ Protected: False Positives Cannot Leak Data¶
Concern: Could an attacker use a false positive to gain unauthorized data access?
Answer: No. The QueryAnalyzer is completely separate from access control.
┌─────────────────────────────────────────────────────────────┐
│ Security Boundary │
├─────────────────────────────────────────────────────────────┤
│ │
│ QueryAnalyzer Access Control (RBAC) │
│ ┌────────────────┐ ┌────────────────┐ │
│ │ Determines: │ │ Determines: │ │
│ │ • Caching │ │ • Can user │ │
│ │ • Invalidation │ │ read nodes? │ │
│ │ │ │ • Can user │ │
│ │ Does NOT │ │ write nodes? │ │
│ │ affect: │ │ │ │
│ │ • Permissions │ │ ALWAYS │ │
│ │ • Authorization│ │ ENFORCED │ │
│ └────────────────┘ └────────────────┘ │
│ │ │ │
│ │ Performance │ Security │
│ ▼ ▼ │
│ ┌────────────────┐ ┌────────────────┐ │
│ │ Cache miss │ │ Query allowed │ │
│ │ (slower, but │ │ or denied │ │
│ │ still works) │ │ (enforced) │ │
│ └────────────────┘ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Attack scenarios that DO NOT work:
| Attack | Why It Fails |
|---|---|
| "Read secret data by triggering write path" | Write path doesn't grant extra permissions |
| "Learn label exists via cache invalidation" | Query must succeed first (already requires access) |
| "Bypass RBAC by confusing the analyzer" | Analyzer has no role in authorization |
| "Force stale data via false positive" | False positive = no caching = always fresh data |
Example:
What happens: 1. QueryAnalyzer sees "delete" → marks as write (false positive) 2. Access control checks if user can read :Secret → DENIED 3. Query rejected - attacker gets nothing 4. Cache invalidation never happens (query failed)
The false positive only affects step 1 (caching decision). The security check in step 2 is completely independent and always enforced.
Cache Security¶
Result Caching¶
Only read-only queries are cached:
// Cached (read-only):
MATCH (n:Person) RETURN n.name // ✅ Cached
CALL db.labels() // ✅ Cached (db.* procedures)
SHOW INDEXES // ✅ Cached
// NOT cached (write operations):
CREATE (n:Person {name: 'Alice'}) // ❌ Not cached
MATCH (n) SET n.updated = timestamp() // ❌ Not cached
CALL gds.graph.drop('myGraph') // ❌ Not cached (non-db.* procedure)
Cache Invalidation¶
Write operations invalidate relevant caches:
// Smart invalidation by label:
CREATE (n:Person) // Invalidates :Person cache
MATCH (n:Company) DELETE n // Invalidates :Company cache
// Full invalidation (no label detected):
MATCH (n) DELETE n // Invalidates ALL caches
Cache Key Security¶
Cache keys include: - Normalized query string - Parameter values (hashed)
This prevents: - Cache poisoning across different parameters - Cache confusion between similar queries
Procedure Call Security¶
Read-Only Procedures (Cached)¶
Only CALL db.* procedures are marked as read-only:
CALL db.labels() -- ✅ Read-only, cached
CALL db.propertyKeys() -- ✅ Read-only, cached
CALL db.relationshipTypes() -- ✅ Read-only, cached
Write Procedures (Not Cached)¶
All other procedures are conservatively treated as writes:
CALL gds.graph.project(...) -- ❌ May be write, not cached
CALL gds.graph.drop(...) -- ❌ Is a write, not cached
CALL apoc.create.node(...) -- ❌ Is a write, not cached
CALL custom.myProcedure(...) -- ❌ Unknown, not cached
Implementation Details¶
Keyword Detection¶
// containsKeyword searches ALL occurrences, checking word boundaries
func containsKeyword(upper, keyword string) bool {
// Iterates through string finding each occurrence
// Checks character before: must not be alphanumeric or _
// Checks character after: must not be alphanumeric or _
// Returns true if ANY occurrence is a valid keyword
}
Word Boundary Rules¶
A keyword match requires: - Before: Start of string, or non-alphanumeric (except _) - After: End of string, or non-alphanumeric (except _)
"DELETE n" → DELETE matched (space before, space after)
"n.delete" → delete matched (. is not alphanumeric) - FALSE POSITIVE
"ToDelete" → DELETE not matched (o before)
"DELETED" → DELETE not matched (D after)
Query Types Reference¶
Write Operations (Never Cached)¶
| Keyword | Detection | Example |
|---|---|---|
CREATE | Word boundary | CREATE (n:Node) |
MERGE | Word boundary | MERGE (n:Node {id: 1}) |
DELETE | Word boundary | MATCH (n) DELETE n |
DETACH DELETE | Phrase | MATCH (n) DETACH DELETE n |
SET | Word boundary | MATCH (n) SET n.x = 1 |
REMOVE | Word boundary | MATCH (n) REMOVE n.label |
Read Operations (Cacheable)¶
| Pattern | Requirement | Example |
|---|---|---|
MATCH ... RETURN | No write keywords | MATCH (n) RETURN n |
CALL db.* | Starts with CALL db. | CALL db.labels() |
SHOW | Starts with SHOW | SHOW INDEXES |
Schema Operations (Special Handling)¶
| Operation | Cached | Invalidates |
|---|---|---|
CREATE INDEX | No | Schema cache |
DROP INDEX | No | Schema cache |
CREATE CONSTRAINT | No | Schema cache |
DROP CONSTRAINT | No | Schema cache |
Testing & Verification¶
Security Tests¶
The following attack patterns are verified in tests:
// All write operations detected:
"MATCH (n) DELETE n" // HasDelete = true ✅
"MATCH (n) WHERE type = 'safe' DELETE n" // HasDelete = true ✅
"CREATE (n:Node)" // HasCreate = true ✅
"MERGE (n:Node)" // HasMerge = true ✅
"MATCH (n) SET n.x = 1" // HasSet = true ✅
"MATCH (n) REMOVE n.label" // HasRemove = true ✅
// False positives (safe, not cached):
"RETURN n.delete" // HasDelete = true (false positive)
"WHERE name = 'DELETE'" // HasDelete = true (false positive)
Running Security Tests¶
# Run all cypher tests including security checks
cd nornicdb
go test ./pkg/cypher/... -v
# Run specific cache/security tests
go test ./pkg/cypher/... -run "Cache|Security" -v
Best Practices¶
For Query Authors¶
- Avoid keyword-named properties - Don't name properties
delete,create,set, etc. - Use parameters for values -
WHERE name = $nameinstead ofWHERE name = 'DELETE' - Explicit transactions for writes - Use
BEGIN/COMMITfor critical write operations
For Administrators¶
- Monitor cache hit rates - Low hit rates may indicate false positives
- Review slow query logs - Uncached queries appear more frequently
- Use production mode - Ensures strict security settings
Summary¶
| Aspect | Protection Level | Notes |
|---|---|---|
| Write ops hidden as reads | ✅ Fully Protected | Not possible in valid Cypher |
| Read ops marked as writes | ⚡ Accepted | Performance impact only |
| False positive data leakage | ✅ Protected | Analyzer doesn't affect access control |
| Cache poisoning | ✅ Protected | Keys include parameters |
| Procedure writes | ✅ Protected | Only db.* cached |
| Access control bypass | ✅ N/A | Analyzer not used for auth |
| Cache timing side-channel | ✅ Protected | Query must succeed before cache action |
See Also: - HTTP Security - Network-level protections - Compliance Guide - Regulatory requirements