Job Update System¶
Overview¶
The KrakenHashes Job Update System automatically recalculates job keyspaces when associated wordlists, rules, or potfiles change during execution. This is a "going forward" system - when files are updated, only undispatched work is affected. Already-assigned tasks continue with their original parameters, ensuring consistency while allowing jobs to benefit from updated resources.
Core Philosophy: Forward-Only Updates¶
The system operates on these principles:
- No Deficit Tracking: The system doesn't track "missed" work from updates that occur after tasks are dispatched
- Current State Calculation: Keyspaces are recalculated based on the current file state and remaining work
- Non-Disruptive: Running tasks are never interrupted or restarted
- Automatic Adjustment: Jobs automatically adapt to file changes without user intervention
How It Works¶
Directory Monitoring¶
The system continuously monitors three key directories:
- **Wordlists**: `/data/krakenhashes/wordlists/`
- **Rules**: `/data/krakenhashes/rules/`
- **Potfile**: Special handling via a staging mechanism
Every 30 seconds (configurable), the directory monitor:

1. Calculates MD5 hashes of all monitored files
2. Compares them with the previously stored hashes to detect changes
3. Updates file metadata in the database
4. Triggers job updates for affected jobs
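The scan loop above can be sketched as follows. This is a minimal illustration, not the actual KrakenHashes implementation; the function names and the in-memory hash map are assumptions for the example.

```python
import hashlib
from pathlib import Path

def file_md5(path: Path) -> str:
    """Hash a file in chunks so large wordlists aren't loaded into memory."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def detect_changes(directory: Path, previous: dict[str, str]) -> dict[str, str]:
    """Return {relative_path: new_hash} for files whose MD5 changed.

    In the real system this comparison is done against hashes stored in
    the database, and each changed file triggers job updates.
    """
    changed = {}
    for path in sorted(directory.rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(directory))
        digest = file_md5(path)
        if previous.get(key) != digest:
            changed[key] = digest  # new file or modified content
    return changed
```

A new file, or any edit to an existing one, shows up in the returned map; unchanged files are skipped, which is what makes a 30-second polling interval cheap even for large wordlist directories.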
Change Detection Flow¶
Wordlist Updates¶
When a wordlist file changes (words added or removed):
For Jobs WITHOUT Rule Splitting¶
- Base keyspace updates to the new word count
- Effective keyspace recalculates:
    - With rules: `new_wordlist_size × multiplication_factor`
    - Without rules: `new_wordlist_size`
For Jobs WITH Rule Splitting¶
The system accounts for already-dispatched rule chunks:

- Calculates the theoretical new effective keyspace
- Determines the "missed" keyspace: `words_added × rules_already_dispatched`
- Actual effective keyspace: `theoretical - missed`
Example:

- Original: 1,000,000 words × 10,000 rules = 10 billion keyspace
- After 5,000 rules are dispatched, 100,000 words are added:
    - Theoretical: 1,100,000 × 10,000 = 11 billion
    - Missed: 100,000 × 5,000 = 500 million
    - Actual: 11 billion - 500 million = 10.5 billion
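The adjustment described above can be expressed as a small pure function. This is an illustrative sketch with hypothetical names, not the actual service code:

```python
def recalculate_effective_keyspace(
    new_wordlist_size: int,
    old_wordlist_size: int,
    total_rules: int,
    rules_already_dispatched: int,
) -> int:
    """Forward-only keyspace update for a rule-splitting job.

    Words added after some rule chunks were dispatched are never
    combined with those chunks, so that "missed" work is subtracted
    from the theoretical total.
    """
    theoretical = new_wordlist_size * total_rules
    words_added = new_wordlist_size - old_wordlist_size
    missed = max(words_added, 0) * rules_already_dispatched
    return theoretical - missed

# The worked example from above: 11 billion - 500 million = 10.5 billion
adjusted = recalculate_effective_keyspace(1_100_000, 1_000_000, 10_000, 5_000)
```

Note the `max(..., 0)`: if words were removed rather than added, there is no "missed" work to subtract, which matches the forward-only philosophy.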
Rule Updates¶
When a rule file changes (rules added or removed):
Jobs Without Tasks Yet¶
- Simple recalculation: `base_keyspace × new_rule_count`
- Multiplication factor updates to the new rule count
Jobs With Existing Tasks¶
For rule-splitting jobs:

1. Checks the highest dispatched rule index
2. If the new rule count ≤ max dispatched index: the job is effectively complete
3. Otherwise: updates the multiplication factor and recalculates
Example:

- Original: 10,000 rules, 5,000 dispatched
- Rules reduced to 4,000: job marked complete (all remaining rules gone)
- Rules increased to 12,000: 7,000 rules remain to process
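The decision logic above fits in a few lines. Again, the function and its return shape are illustrative assumptions, not the real API:

```python
def handle_rule_file_update(
    new_rule_count: int, max_dispatched_rule_index: int
) -> tuple[str, int]:
    """Decide how a rule-splitting job reacts to a changed rule file.

    Returns ("complete", 0) when every rule index that still exists has
    already been dispatched, otherwise ("continue", rules_remaining).
    """
    if new_rule_count <= max_dispatched_rule_index:
        return ("complete", 0)
    return ("continue", new_rule_count - max_dispatched_rule_index)
```

With 5,000 rules dispatched, shrinking the file to 4,000 rules completes the job, while growing it to 12,000 leaves 7,000 rules for future chunks, exactly as in the example.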
Potfile Updates¶
The potfile (collection of cracked passwords) has special handling:
Staging Mechanism¶
- Cracked passwords accumulate in a staging table
- Periodic or manual refresh moves staged entries to potfile
- Potfile treated as a special wordlist for job purposes
Update Process¶
- Manual Refresh: User triggers from frontend
- Staging Integration: Moves cracked passwords to main potfile
- Line Count Update: Updates wordlist metadata
- Job Updates: Triggers same update logic as regular wordlists
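The staging-to-potfile move can be sketched with plain lists standing in for the database tables. The function name and dedup-by-membership behavior are assumptions for illustration; the real system works through the `potfile_staging` table and wordlist metadata:

```python
def refresh_potfile(staging: list[str], potfile: list[str]) -> int:
    """Move staged cracked passwords into the potfile wordlist.

    Skips entries already present, clears the staging area, and returns
    the updated line count used as the potfile's word_count metadata,
    which in turn triggers the regular wordlist update logic for jobs.
    """
    existing = set(potfile)
    for password in staging:
        if password not in existing:
            potfile.append(password)
            existing.add(password)
    staging.clear()  # staged entries are now part of the potfile
    return len(potfile)  # new word_count drives job keyspace updates
```

Because the potfile only ever grows, the returned count is monotonically non-decreasing across refreshes.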
Key Differences¶
- Not monitored by directory monitor (excluded from scans)
- Updates via database staging, not file watching
- Requires explicit refresh action
- Always grows (passwords only added, never removed)
Keyspace Recalculation Logic¶
Basic Formula¶
The effective keyspace is computed as `effective_keyspace = base_keyspace × multiplication_factor`, where:

- **Base Keyspace**: Current wordlist size (word count)
- **Multiplication Factor**: Number of rules (or 1 if no rules)
Adjustments for Dispatched Work¶
For rule-splitting jobs with updates: `new_effective_keyspace = (new_wordlist_size × total_rules) - (words_added × rules_already_dispatched)`

This ensures already-dispatched tasks aren't double-counted.
Real-World Examples¶
Scenario 1: Growing Wordlist¶
**Initial State:**

- Wordlist: 1 million words
- Rules: 1,000
- No tasks dispatched yet

**After Adding 100,000 Words:**

- New base keyspace: 1.1 million
- New effective keyspace: 1.1 billion
- All future tasks use the updated wordlist
Scenario 2: Rule File Expansion During Execution¶
**Initial State:**

- Job using rule splitting
- 10,000 rules, split into 100 chunks
- 50 chunks already dispatched (5,000 rules)

**After Adding 2,000 Rules:**

- Total rules: 12,000
- Remaining: 7,000 rules (chunks 51-120)
- Future chunks use the expanded rule set
Scenario 3: Potfile Growth¶
**Initial State:**

- Potfile job with 1,000 existing passwords
- Rules: 500
- Effective keyspace: 500,000

**After a Cracking Campaign:**

- 200 new passwords cracked
- Manual refresh triggered
- New base keyspace: 1,200 passwords
- New effective keyspace: 600,000
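Since the potfile is treated as a regular wordlist, Scenario 3 follows directly from the basic formula. A quick check of the arithmetic (names are illustrative):

```python
def potfile_effective_keyspace(potfile_lines: int, rule_count: int) -> int:
    """Potfile jobs use the same formula as any wordlist: base × rules."""
    return potfile_lines * rule_count

before = potfile_effective_keyspace(1_000, 500)  # 500,000
after = potfile_effective_keyspace(1_200, 500)   # 600,000 after refresh
```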
Configuration¶
Directory Monitor Settings¶
Located in backend configuration:
| Setting | Default | Description |
|---|---|---|
| Monitor Interval | 30s | How often to check for file changes |
| MD5 Hash Check | Enabled | Method for detecting changes |
| Concurrent Updates | Enabled | Allow parallel job updates |
System Behavior Settings¶
| Setting | Default | Description |
|---|---|---|
| Auto-update Jobs | Enabled | Automatically update affected jobs |
| Update Lock Timeout | 60s | Maximum time to wait for a job lock |
| Staging Refresh Interval | Manual | Potfile staging refresh trigger |
Technical Implementation¶
Components¶
- DirectoryMonitorService: Detects file changes via MD5 hashing
- JobUpdateService: Handles keyspace recalculation logic
- PotfileService: Manages potfile staging and updates
- Repository Layer: Database operations for job updates
Database Tables Involved¶
- `job_executions`: Stores `base_keyspace`, `effective_keyspace`, `multiplication_factor`
- `job_tasks`: Tracks dispatched work (`rule_start_index`, `rule_end_index`)
- `wordlists`: Metadata including `word_count`, `file_hash`
- `rules`: Metadata including `rule_count`, `file_hash`
- `potfile_staging`: Temporary storage for cracked passwords
Locking Strategy¶
The system uses per-job locks to prevent race conditions: without them, two file changes detected close together could trigger overlapping keyspace recalculations for the same job.
Best Practices¶
For Users¶
- Expect Keyspace Changes: Don't be alarmed if keyspaces update during execution
- Manual Potfile Refresh: Remember to refresh potfile after cracking campaigns
- Monitor Progress: Check effective keyspace to understand total work
- Plan Updates: Large file changes can significantly affect running jobs
For Administrators¶
- Monitor Disk Space: File updates may require temporary storage
- Adjust Check Intervals: Balance between responsiveness and system load
- Review Logs: Check for update failures or lock timeouts
- Database Maintenance: Ensure potfile staging table doesn't grow too large
For Developers¶
- Respect Forward-Only: Never try to retroactively update dispatched tasks
- Use Job Locks: Always lock jobs during updates to prevent races
- Handle Errors Gracefully: File update failures shouldn't crash jobs
- Test Edge Cases: Consider jobs with no tasks, completed tasks, etc.
Troubleshooting¶
Common Issues¶
**Keyspace Not Updating:**

- Verify the file actually changed (MD5 hash differs)
- Check that the directory monitor is running
- Ensure the job is in an eligible state (pending/running/paused)

**Incorrect Effective Keyspace:**

- Verify `multiplication_factor` is set correctly
- Check whether the job uses rule splitting
- Review the calculation for "missed" keyspace

**Potfile Not Updating Jobs:**

- Ensure a manual refresh was triggered
- Check that the potfile staging table has new entries
- Verify the job references the potfile wordlist
Debug Logging¶
Enable debug logging to trace the update flow:
DEBUG: Directory monitor detected change
DEBUG: Handling wordlist update, old: 1000000, new: 1100000
DEBUG: Updated job keyspace, effective: 1100000000
Limitations¶
- No Retroactive Updates: Already-dispatched work won't get new words/rules
- Forward Progress Only: System doesn't track or compensate for missed combinations
- Manual Potfile Refresh: Requires user action to trigger potfile updates
- File Lock Conflicts: Rapid file changes might cause temporary update delays
Future Enhancements¶
Potential improvements under consideration:
- Deficit Tracking: Optional mode to track missed combinations
- Automatic Potfile Refresh: Configurable automatic refresh intervals
- Smart Chunking: Re-chunk remaining work when files change significantly
- Update History: Track all keyspace changes for job audit trail
- Predictive Updates: Estimate impact before applying changes
Summary¶
The Job Update System ensures KrakenHashes jobs remain accurate and efficient as resources change. By following a forward-only philosophy, it provides a balance between consistency for running tasks and adaptability for future work. Understanding this system helps explain why job keyspaces may change during execution and how the system maintains integrity without disrupting active cracking operations.