GPU Acceleration¶
10-100x speedup for vector operations with GPU acceleration.
Overview¶
NornicDB leverages GPU acceleration for:

- Vector similarity search
- K-means clustering
- Embedding generation
- Matrix operations
Supported Backends¶
| Backend | Platform | Performance |
|---|---|---|
| Metal | macOS (Apple Silicon) | Excellent |
| CUDA | NVIDIA GPUs | Excellent |
| Vulkan | Cross-platform | Good |
| CPU | Fallback | Baseline |
Automatic Detection¶
GPU acceleration is automatically detected and enabled at startup; no configuration is required. If no GPU is available, NornicDB falls back to CPU processing.
Configuration¶
Backend selection environment variable¶
You can explicitly request which GPU backend NornicDB should use by setting the NORNICDB_GPU_BACKEND environment variable. When unset the server will auto-detect an available backend.
Accepted values:

- auto (default) — automatically detect the best available backend
- vulkan — use the Vulkan backend
- cuda — use the CUDA backend (NVIDIA GPUs)
- metal — use Apple's Metal backend (macOS/Apple Silicon)
- cpu — force CPU-only fallback
If a requested backend is unavailable at runtime, NornicDB will fall back to a compatible backend or to CPU processing.
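The selection and fallback rules above can be sketched as follows. This is a minimal illustration; `selectBackend` and the metal → cuda → vulkan preference order are assumptions for the sketch, not NornicDB's actual internals.

```go
package main

import "fmt"

// selectBackend sketches the documented rules: honor an explicit request when
// that backend is available (cpu is always available), otherwise fall back
// through the remaining backends and finally to CPU.
func selectBackend(requested string, available map[string]bool) string {
	if requested == "cpu" {
		return "cpu"
	}
	if requested != "" && requested != "auto" && available[requested] {
		return requested
	}
	for _, b := range []string{"metal", "cuda", "vulkan"} {
		if available[b] {
			return b
		}
	}
	return "cpu"
}

func main() {
	// A Linux box where only Vulkan is available but CUDA was requested.
	fmt.Println(selectBackend("cuda", map[string]bool{"vulkan": true})) // vulkan
}
```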
Examples
PowerShell (temporary for current process):
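A minimal example, setting the variable for the current PowerShell session only (it is not persisted):

```powershell
$env:NORNICDB_GPU_BACKEND = "vulkan"
```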
Unix shell:
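For POSIX shells, export the variable in the environment before starting the server:

```shell
export NORNICDB_GPU_BACKEND=vulkan
```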
Docker run example:
```shell
docker run -e NORNICDB_GPU_BACKEND=vulkan \
  -p 7474:7474 -p 7687:7687 \
  --device /dev/dri:/dev/dri \
  timothyswt/nornicdb-amd64-vulkan:latest
```
docker-compose snippet:
```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-amd64-vulkan:latest
    environment:
      - NORNICDB_GPU_BACKEND=vulkan
    devices:
      - /dev/dri:/dev/dri
```
Continue to the Docker section for platform-specific build/deploy examples.
Docker with GPU¶
Apple Silicon (Metal): Docker Desktop for Mac does not currently pass the host GPU through to Linux containers, so Metal acceleration is generally only available when running NornicDB natively on macOS; containerized deployments on Apple Silicon fall back to CPU.
NVIDIA (CUDA):
```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-amd64-cuda:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
Performance Benchmarks¶
Vector Search (1M vectors, 1024 dimensions)¶
| Backend | Latency | Throughput |
|---|---|---|
| Metal (M2) | 2ms | 500 qps |
| CUDA (A100) | 1ms | 1000 qps |
| CPU (AVX2) | 50ms | 20 qps |
K-Means Clustering (100K points)¶
| Backend | Time | Speedup |
|---|---|---|
| Metal | 120ms | 50x |
| CUDA | 80ms | 75x |
| CPU | 6000ms | 1x |
GPU Memory Management¶
Memory Limits¶
```go
// GPU manager respects memory limits
gpuManager := gpu.NewManager(&gpu.Config{
    Backend:     gpu.BackendMetal,
    MaxMemoryMB: 2048, // Limit to 2GB
    BatchSize:   1000, // Process in batches
})
```
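As a back-of-the-envelope check on what a MaxMemoryMB budget holds (illustrative only; real capacity is lower once index structures and staging buffers are counted):

```go
package main

import "fmt"

func main() {
	const maxMemoryMB = 2048 // budget from the config above
	const dims = 1024        // embedding dimensionality
	const bytesPerF32 = 4

	// Upper bound on resident float32 vectors, ignoring all overhead.
	vectors := maxMemoryMB * 1024 * 1024 / (dims * bytesPerF32)
	fmt.Println(vectors) // 524288
}
```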
Automatic Batching¶
For large datasets, operations are automatically batched:
```go
// Search 10M vectors in batches
results, err := searchService.VectorSearch(ctx, query, 10000000, 10)
// Automatically batched to fit GPU memory
```
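The chunking idea behind automatic batching can be sketched in a few lines. This is an illustrative helper, not NornicDB's internal code:

```go
package main

import "fmt"

// batchRanges splits n items into contiguous [start, end) chunks of at most
// batchSize, the same chunking used to fit large vector sets into GPU memory.
func batchRanges(n, batchSize int) [][2]int {
	var out [][2]int
	for start := 0; start < n; start += batchSize {
		end := start + batchSize
		if end > n {
			end = n
		}
		out = append(out, [2]int{start, end})
	}
	return out
}

func main() {
	fmt.Println(len(batchRanges(10_000_000, 1000))) // 10000 batches
}
```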
Supported Operations¶
Vector Search¶
```go
// GPU-accelerated cosine similarity search
results, err := db.HybridSearch(ctx,
    "query text",
    queryEmbedding, // GPU computes similarity
    nil,
    100,
)
```
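The quantity the GPU computes per candidate is plain cosine similarity, dot(a, b) / (|a|·|b|). A scalar CPU reference for the math (not NornicDB code):

```go
package main

import (
	"fmt"
	"math"
)

// cosine is a scalar reference for the similarity the GPU backends
// compute in parallel across millions of vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	fmt.Printf("%.2f\n", cosine([]float32{1, 0}, []float32{1, 0})) // 1.00
}
```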
K-Means Clustering¶
```go
// GPU-accelerated clustering
clusterer := kmeans.NewGPUClusterer(&kmeans.Config{
    K:         10,
    MaxIters:  100,
    Tolerance: 0.0001,
    UseGPU:    true,
})

centroids, assignments, err := clusterer.Cluster(vectors)
```
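The per-point work the GPU parallelizes is the k-means assignment step: each point joins its nearest centroid by squared Euclidean distance. A sketch only, not NornicDB code:

```go
package main

import (
	"fmt"
	"math"
)

// assign performs one k-means assignment step over all points.
func assign(points, centroids [][]float64) []int {
	out := make([]int, len(points))
	for i, p := range points {
		best, bestDist := 0, math.MaxFloat64
		for c, ct := range centroids {
			var d float64
			for j := range p {
				diff := p[j] - ct[j]
				d += diff * diff
			}
			if d < bestDist {
				best, bestDist = c, d
			}
		}
		out[i] = best
	}
	return out
}

func main() {
	pts := [][]float64{{0, 0}, {10, 10}, {9, 9}}
	cents := [][]float64{{0, 0}, {10, 10}}
	fmt.Println(assign(pts, cents)) // [0 1 1]
}
```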
Embedding Generation¶
```go
// GPU-accelerated local embeddings
embedder := embed.NewLocalEmbedder(&embed.Config{
    ModelPath: "/models/mxbai-embed-large.gguf",
    GPULayers: -1, // Auto-detect
})
```
Troubleshooting¶
GPU Not Detected¶
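A few generic first checks (hedged: the exact commands depend on your platform and driver stack):

```shell
# Is an NVIDIA driver installed? (CUDA backend)
command -v nvidia-smi >/dev/null 2>&1 && echo "nvidia-smi found"

# Are DRM render nodes present? (Vulkan backend on Linux)
[ -d /dev/dri ] && echo "/dev/dri present"

# Which backend was requested?
echo "NORNICDB_GPU_BACKEND=${NORNICDB_GPU_BACKEND:-unset}"
```

If a GPU is present but undetected, try forcing the backend with NORNICDB_GPU_BACKEND and check the server logs for the reason it was rejected.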
Out of Memory¶
```shell
# Reduce batch size
export NORNICDB_GPU_BATCH_SIZE=500

# Limit GPU memory
export NORNICDB_GPU_MAX_MEMORY_MB=1024
```
CUDA Errors¶
```shell
# Check NVIDIA driver
nvidia-smi

# Check CUDA version
nvcc --version

# Ensure container has GPU access
docker run --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```
Metal Errors (macOS)¶
```shell
# Check Metal support
system_profiler SPDisplaysDataType | grep Metal

# Ensure running on Apple Silicon
uname -m  # Should show arm64
```
Fallback Behavior¶
If GPU acceleration fails, NornicDB automatically falls back to CPU:
```go
// Explicit fallback when GPU acceleration is unavailable
result, err := gpuManager.VectorSearch(vectors, query, k)
if errors.Is(err, gpu.ErrGPUUnavailable) {
    result = cpuSearch(vectors, query, k) // fall back to CPU
}
```
See Also¶
- Vector Search - Search guide
- Memory Decay - Time-based scoring
- Performance - Benchmarks