Performance Tuning Guide¶
This guide provides comprehensive performance optimization strategies for KrakenHashes deployments. All recommendations are based on the actual implementation details and system architecture.
Table of Contents¶
- Database Performance
- Job Execution Optimization
- Agent Performance
- File I/O Optimization
- Network and WebSocket Tuning
- Storage Performance
- Monitoring and Benchmarking
- System Settings Reference
Database Performance¶
Connection Pool Configuration¶
The system uses PostgreSQL with connection pooling. Current default settings:
// From backend/internal/db/db.go
db.SetMaxOpenConns(25) // Maximum number of open connections
db.SetMaxIdleConns(5) // Maximum number of idle connections
db.SetConnMaxLifetime(5 * time.Minute) // Connection lifetime
Optimization recommendations:
- For small deployments (1-10 agents): keep the defaults (25 open / 5 idle connections).
- For medium deployments (10-50 agents): raise to roughly 50 open / 10 idle connections (`DATABASE_MAX_OPEN_CONNS=50`, `DATABASE_MAX_IDLE_CONNS=10`).
- For large deployments (50+ agents): raise to roughly 100 open / 20 idle connections and consider PgBouncer for external connection pooling.
Index Optimization¶
The system uses extensive indexing. Key performance indexes:
-- Agent performance queries
idx_agents_status
idx_agents_last_heartbeat
idx_agent_performance_metrics_device_lookup (composite)
-- Job execution queries
idx_job_tasks_job_chunk (composite)
idx_job_tasks_status
idx_job_executions_status
-- Hash lookups
idx_hashes_hash_value (unique)
idx_hashlists_status
Monitoring index usage:
-- Check index usage statistics
-- (pg_stat_user_indexes exposes relname/indexrelname, not tablename/indexname)
SELECT
schemaname,
relname AS tablename,
indexrelname AS indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
-- Find candidate columns for new indexes (rough heuristic:
-- high-cardinality columns on tables with no index at all)
SELECT
schemaname,
tablename,
attname,
n_distinct,
most_common_vals
FROM pg_stats
WHERE schemaname = 'public'
AND n_distinct > 100
AND tablename NOT IN (
SELECT tablename
FROM pg_indexes
WHERE schemaname = 'public'
);
Query Optimization¶
- Batch Processing Configuration: tune `KH_HASHLIST_BATCH_SIZE` (default 1000) to match your hardware; larger batches mean fewer database round trips but higher memory use per insert.
- Pagination Optimization:
    - Use cursor-based pagination for large result sets
    - Limit page sizes to 100-500 items
    - Add appropriate indexes for ORDER BY columns
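Cursor-based (keyset) pagination can be sketched as follows, using the `job_executions` columns that appear in the monitoring queries later in this guide; `$1` is the `created_at` of the last row from the previous page:

```sql
-- Keyset pagination: seek past the last-seen row instead of using
-- OFFSET, so query cost stays constant even on deep pages.
SELECT id, status, created_at
FROM job_executions
WHERE created_at < $1
ORDER BY created_at DESC
LIMIT 100;
```

An index on `job_executions (created_at DESC)` lets this query read only the rows it returns.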
Job Execution Optimization¶
Chunking System Configuration¶
The dynamic chunking system optimizes workload distribution based on agent performance.
Key system settings:
-- Configure chunk behavior
UPDATE system_settings SET value = '20'
WHERE key = 'chunk_fluctuation_percentage'; -- Default: 20%
-- Benchmark cache duration
UPDATE system_settings SET value = '168'
WHERE key = 'benchmark_cache_duration_hours'; -- Default: 168 (7 days)
Chunk Size Calculation¶
The system calculates chunks based on:
1. Agent benchmark speeds
2. Desired chunk duration
3. Attack mode modifiers
4. Available keyspace
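The shape of that calculation can be sketched in Go. The function and parameter names here are illustrative, not the actual KrakenHashes implementation; the 20% fluctuation threshold matches the `chunk_fluctuation_percentage` default:

```go
package main

import "fmt"

// chunkKeyspace estimates the keyspace slice to hand an agent so the
// chunk runs for roughly the desired duration at the agent's
// benchmarked speed, clamped to the remaining keyspace.
func chunkKeyspace(hashesPerSecond, desiredSeconds, remaining uint64) uint64 {
	chunk := hashesPerSecond * desiredSeconds
	if chunk > remaining {
		return remaining
	}
	return chunk
}

// shouldMergeTail reports whether a final chunk smaller than the
// fluctuation threshold should be merged into the previous chunk
// rather than dispatched on its own.
func shouldMergeTail(tail, chunk, fluctuationPct uint64) bool {
	return tail*100 < chunk*fluctuationPct
}

func main() {
	// Agent benchmarked at 1 MH/s, 20-minute target chunk duration.
	fmt.Println(chunkKeyspace(1_000_000, 1200, 10_000_000_000)) // 1200000000
	// A 100M-candidate tail is under 20% of a 1.2B chunk, so merge it.
	fmt.Println(shouldMergeTail(100_000_000, 1_200_000_000, 20)) // true
}
```

Attack mode modifiers would scale the effective speed before this calculation; they are omitted here for brevity.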
Optimization strategies:
- For GPU clusters with similar performance: use longer chunk durations so per-chunk overhead is amortized, and keep the default fluctuation threshold (20%).
- For mixed hardware environments: lower `chunk_fluctuation_percentage` (10-15%) so slower agents receive proportionally smaller chunks and fast agents are not left idle waiting on stragglers.
- For time-sensitive jobs: use shorter chunk durations so progress reporting and chunk reassignment happen sooner.
Job Priority and Scheduling¶
The scheduler processes jobs based on priority and available agents:
- Priority levels:
    - Critical: Process immediately
    - High: Process within minutes
    - Normal: Standard queue processing
    - Low: Process when resources available
- Scheduling optimization: tune `scheduler_check_interval_seconds` (default 30, range 10-60); lower it when jobs must start quickly, raise it to reduce database load on quiet systems.
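For deployments where high-priority jobs must start quickly, the scheduler poll interval can be lowered via the system settings table (range 10-60 seconds per the settings reference below; lower values add database load):

```sql
-- Poll the job queue every 15 seconds instead of the default 30
UPDATE system_settings SET value = '15'
WHERE key = 'scheduler_check_interval_seconds';
```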
Agent Performance¶
Hardware Detection and Benchmarking¶
The system automatically detects GPU capabilities and runs benchmarks.
Benchmark optimization:
-- Configure speedtest timeout
UPDATE system_settings
SET value = '180' -- 3 minutes
WHERE key = 'speedtest_timeout_seconds';
-- For faster benchmarks (less accurate)
UPDATE system_settings
SET value = '60' -- 1 minute
WHERE key = 'speedtest_timeout_seconds';
Agent Configuration¶
- GPU-specific optimizations: keep GPU drivers current, and re-run benchmarks after any driver update so chunk sizing reflects actual speeds.
- Memory management: if agents hit out-of-memory errors during processing, reduce batch sizes (see Performance Troubleshooting below).
Workload Distribution¶
The system supports multiple distribution strategies:
- Round-robin: Equal distribution
- Performance-based: More work to faster agents
- Priority-based: Specific agents for specific jobs
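Performance-based distribution amounts to splitting the keyspace in proportion to benchmarked speed. A hypothetical sketch, not the scheduler's actual code:

```go
package main

import "fmt"

// splitByBenchmark divides a keyspace across agents in proportion to
// their benchmarked speeds (performance-based distribution).
func splitByBenchmark(keyspace uint64, speeds []uint64) []uint64 {
	var total uint64
	for _, s := range speeds {
		total += s
	}
	shares := make([]uint64, len(speeds))
	var assigned uint64
	for i, s := range speeds {
		shares[i] = keyspace * s / total // integer division truncates
		assigned += shares[i]
	}
	// Hand the rounding remainder to the last agent so the full
	// keyspace is always covered.
	shares[len(shares)-1] += keyspace - assigned
	return shares
}

func main() {
	// An agent 3x faster than its peer gets 3x the keyspace.
	fmt.Println(splitByBenchmark(1000, []uint64{300, 100})) // [750 250]
}
```

Round-robin is the degenerate case of equal speeds; priority-based assignment would filter the agent list before splitting.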
File I/O Optimization¶
Hash List Processing¶
The system uses buffered I/O and batch processing for efficient file handling.
Current implementation:
- Buffered reading with bufio.Scanner
- Configurable batch sizes
- Streaming processing (no full file load)
Optimization tips:
- For NVMe storage: use larger batch sizes (e.g. `KH_HASHLIST_BATCH_SIZE=5000`) to keep the fast disk fed.
- For network storage: use smaller batches and expect higher latency; keep active hashlists on local SSD where possible.
- File upload limits: raise `KH_MAX_UPLOAD_SIZE_MB` (default 32, range 32-1024) to accept large hashlists from trusted users.
File Synchronization¶
The agent file sync system uses:
- Chunk-based transfers
- Resume capability
- Integrity verification

Optimization:
- LAN deployments:
    - Increase chunk sizes
    - Disable compression
- WAN deployments:
    - Enable compression
    - Smaller chunk sizes
    - More aggressive retry policies
Network and WebSocket Tuning¶
WebSocket Configuration¶
The system uses WebSocket for real-time agent communication.
Key optimizations:
- Message processing:
    - Asynchronous handlers for non-blocking operation
    - Goroutine-based processing for heavy operations
    - 30-second timeout for async operations
- Heartbeat optimization: keep the heartbeat interval long enough that agents on slow links are not falsely marked offline; the `idx_agents_last_heartbeat` index supports these status queries.
- Connection management: maintain one persistent WebSocket connection per agent, and allow reconnection with backoff after transient network failures.
TLS/SSL Performance¶
The system supports multiple TLS modes with configurable parameters:
# Certificate configuration
export KH_CERT_KEY_SIZE=2048 # Faster handshakes
# or
export KH_CERT_KEY_SIZE=4096 # Better security
# For high-traffic deployments
export KH_TLS_SESSION_CACHE=on
export KH_TLS_SESSION_TIMEOUT=300
Storage Performance¶
Directory Structure Optimization¶
/data/krakenhashes/
├── binaries/ # Hashcat binaries (SSD recommended)
├── wordlists/ # Large wordlists (HDD acceptable)
├── rules/ # Rule files (SSD preferred)
├── hashlists/ # User hashlists (SSD recommended)
└── temp/ # Temporary files (RAM disk optimal)
Storage Recommendations¶
- SSD for critical paths:
    - Database files
    - Hashcat binaries
    - Active hashlists
    - Temporary processing
- HDD acceptable for:
    - Large wordlist storage
    - Archived hashlists
    - Backup data
- RAM disk for temporary files: mount a tmpfs over the temp/ directory for the fastest scratch space.
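Setting up the RAM disk on Linux is a standard tmpfs mount; size it to spare RAM (4G here is an example, and the path assumes the directory layout shown above):

```shell
# Mount a RAM-backed tmpfs for temporary files
sudo mount -t tmpfs -o size=4G tmpfs /data/krakenhashes/temp

# Make the mount persistent across reboots
echo 'tmpfs /data/krakenhashes/temp tmpfs size=4G 0 0' | sudo tee -a /etc/fstab
```

Note that tmpfs contents are lost on reboot, which is acceptable for temporary processing files but not for anything else in the tree.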
Monitoring and Benchmarking¶
Metrics Collection and Retention¶
The system includes automatic metrics aggregation:
-- Configure retention
UPDATE system_settings
SET value = '30' -- Keep realtime data for 30 days
WHERE key = 'metrics_retention_days';
-- Enable/disable aggregation
UPDATE system_settings
SET value = 'true'
WHERE key = 'enable_aggregation';
Aggregation levels:
- Realtime → Daily (after 24 hours)
- Daily → Weekly (after 7 days)
- Cleanup runs daily at 2 AM
Performance Monitoring Queries¶
-- Agent performance overview
SELECT
a.name,
a.status,
COUNT(DISTINCT jt.id) as active_tasks,
AVG(apm.hashes_per_second) as avg_speed,
MAX(apm.temperature) as max_temp
FROM agents a
LEFT JOIN job_tasks jt ON a.id = jt.agent_id AND jt.status = 'in_progress'
LEFT JOIN agent_performance_metrics apm ON a.id = apm.agent_id
-- Keep the time filter in the JOIN condition: a WHERE clause on apm
-- would turn the LEFT JOIN into an inner join and hide agents
-- without recent metrics.
AND apm.created_at > NOW() - INTERVAL '1 hour'
GROUP BY a.id, a.name, a.status;
-- Job execution performance
SELECT
je.id,
je.status,
je.created_at,
je.completed_at,
je.progress,
COUNT(jt.id) as total_chunks,
COUNT(CASE WHEN jt.status = 'completed' THEN 1 END) as completed_chunks
FROM job_executions je
LEFT JOIN job_tasks jt ON je.id = jt.job_execution_id
GROUP BY je.id
ORDER BY je.created_at DESC;
Benchmarking Best Practices¶
- Initial benchmarking:
    - Run comprehensive benchmarks on agent registration
    - Test all hash types your organization uses
    - Store results for 7 days (default)
- Periodic re-benchmarking:
    - After driver updates
    - After hardware changes
    - Monthly for consistency
- Benchmark commands: use hashcat's built-in benchmark mode (`hashcat -b`) per hash type on each agent.
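The re-benchmarking steps above can be run manually with hashcat's standard benchmark flags (assuming hashcat is on the agent's PATH; mode numbers are listed in `hashcat --help`):

```shell
# Benchmark a single hash mode (e.g. 1000 = NTLM)
hashcat -b -m 1000

# Benchmark every supported mode (slow; run after hardware or driver changes)
hashcat -b --benchmark-all
```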
System Settings Reference¶
Performance-Related Settings¶
| Setting Key | Default | Description | Optimization Range |
|---|---|---|---|
| chunk_fluctuation_percentage | 20 | Threshold for merging small chunks | 10-30% |
| benchmark_cache_duration_hours | 168 | How long to cache benchmark results | 24-720 hours |
| metrics_retention_days | 30 | Realtime metrics retention | 7-90 days |
| enable_aggregation | true | Enable metrics aggregation | true/false |
| speedtest_timeout_seconds | 180 | Benchmark timeout | 60-600 seconds |
| scheduler_check_interval_seconds | 30 | Job scheduler interval | 10-60 seconds |
Environment Variables¶
| Variable | Default | Description | Optimization Tips |
|---|---|---|---|
| KH_HASHLIST_BATCH_SIZE | 1000 | Database batch insert size | 500-5000 based on hardware |
| KH_MAX_UPLOAD_SIZE_MB | 32 | Maximum file upload size | 32-1024 based on trust |
| DATABASE_MAX_OPEN_CONNS | 25 | Max database connections | 25-100 based on load |
| DATABASE_MAX_IDLE_CONNS | 5 | Max idle connections | 20% of max open |
Performance Troubleshooting¶
Common Bottlenecks¶
- Database connection exhaustion:
    - Symptom: "too many connections" errors
    - Solution: Increase connection pool or use PgBouncer
- Slow hash imports:
    - Symptom: Hashlist processing takes hours
    - Solution: Increase batch size, use SSD storage
- Agent communication delays:
    - Symptom: Delayed job updates
    - Solution: Check network latency, adjust timeouts
- Memory exhaustion:
    - Symptom: OOM errors during processing
    - Solution: Reduce batch sizes, add swap space
Performance Checklist¶
- Database indexes are being used (check pg_stat_user_indexes)
- Connection pool sized appropriately for agent count
- Batch sizes optimized for hardware
- Metrics retention configured
- Storage using appropriate media (SSD/HDD)
- Network timeouts adjusted for environment
- Benchmark cache duration set appropriately
- Chunk sizes appropriate for job types
Recommended Configurations¶
Small Deployment (1-10 agents)¶
# Keep most defaults
export KH_HASHLIST_BATCH_SIZE=1000
export DATABASE_MAX_OPEN_CONNS=25
# Use default chunk fluctuation (20%)
Medium Deployment (10-50 agents)¶
export KH_HASHLIST_BATCH_SIZE=2000
export DATABASE_MAX_OPEN_CONNS=50
export DATABASE_MAX_IDLE_CONNS=10
# Adjust chunk fluctuation to 15%
UPDATE system_settings SET value = '15' WHERE key = 'chunk_fluctuation_percentage';
Large Deployment (50+ agents)¶
export KH_HASHLIST_BATCH_SIZE=5000
export DATABASE_MAX_OPEN_CONNS=100
export DATABASE_MAX_IDLE_CONNS=20
# Use PgBouncer for connection pooling
# Adjust chunk fluctuation to 10%
UPDATE system_settings SET value = '10' WHERE key = 'chunk_fluctuation_percentage';
# Reduce metrics retention
UPDATE system_settings SET value = '14' WHERE key = 'metrics_retention_days';
Next Steps¶
- Review current system metrics
- Identify bottlenecks using monitoring queries
- Apply appropriate optimizations
- Monitor impact and adjust
- Document environment-specific settings
For additional support, consult the System Architecture documentation or contact the development team.