LM/NTLM Linking Architecture¶
Overview¶
KrakenHashes v1.2.1 and later support linking LM (LAN Manager) and NTLM hashes, so a pwdump-format file can be split into paired hashlists and the two hash types can be cracked and analyzed together. This document describes the technical architecture, database schema, processing pipeline, and design decisions.
Architectural Layers¶
The LM/NTLM linking system operates across three database layers:
- Hashlist-to-Hashlist Links (linked_hashlists): High-level relationship between entire hashlists
- Hash-to-Hash Links (linked_hashes): Individual hash pair relationships
- LM Metadata (lm_hash_metadata): Partial crack tracking for LM hashes
This layered approach enables:
- Flexible linking strategies (not limited to LM/NTLM)
- Efficient analytics calculations
- Partial crack tracking without impacting other hash types
- Clean separation of concerns
Database Schema¶
linked_hashlists Table¶
Manages relationships between entire hashlists (e.g., LM hashlist ↔ NTLM hashlist).
CREATE TABLE linked_hashlists (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    hashlist_id_1 BIGINT NOT NULL REFERENCES hashlists(id) ON DELETE CASCADE,
    hashlist_id_2 BIGINT NOT NULL REFERENCES hashlists(id) ON DELETE CASCADE,
    link_type VARCHAR(50) NOT NULL, -- 'lm_ntlm', extensible for future types
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT unique_hashlist_link UNIQUE (hashlist_id_1, hashlist_id_2),
    CONSTRAINT no_self_link CHECK (hashlist_id_1 != hashlist_id_2)
);
CREATE INDEX idx_linked_hashlists_id2 ON linked_hashlists(hashlist_id_2);
CREATE INDEX idx_linked_hashlists_type ON linked_hashlists(link_type);
Design Decisions:
- Pair Uniqueness: unique_hashlist_link rejects a repeated (A, B) row. On its own it does not reject the reversed pair (B, A); that requires inserting pairs in a consistent order (for example, LM first) or a unique index over LEAST/GREATEST of the two IDs
- Generic link_type: Enables future link types (e.g., sha1_ntlm for hash type correlations)
- CASCADE DELETE: When a hashlist is deleted, its links are automatically removed
- Reverse Index: idx_linked_hashlists_id2 enables efficient lookups from either side of a link
Use Cases:
- Track which LM and NTLM hashlists were created from the same pwdump file
- Calculate effective hashlist count in analytics (linked pairs count as ONE)
- Determine when to create individual hash-to-hash links
linked_hashes Table¶
Manages relationships between individual hash records (e.g., specific LM hash ↔ specific NTLM hash for same user).
CREATE TABLE linked_hashes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    hash_id_1 UUID NOT NULL REFERENCES hashes(id) ON DELETE CASCADE,
    hash_id_2 UUID NOT NULL REFERENCES hashes(id) ON DELETE CASCADE,
    link_type VARCHAR(50) NOT NULL, -- 'lm_ntlm'
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT unique_hash_link UNIQUE (hash_id_1, hash_id_2),
    CONSTRAINT no_self_link CHECK (hash_id_1 != hash_id_2)
);
CREATE INDEX idx_linked_hashes_id2 ON linked_hashes(hash_id_2);
CREATE INDEX idx_linked_hashes_type ON linked_hashes(link_type);
Design Decisions:
- Hash-Level Granularity: Links specific hash records, not just hashlists
- Username/Domain Based: Links are created by matching the username and domain columns
- Analytics Support: Enables "Linked Hash Correlation" statistics
- Independent of Hashlists: Links reference hashes rows directly, so they do not depend on a hashlist still existing; ON DELETE CASCADE removes a link when either of its hashes is deleted
Use Cases:
- Show correlation: "Administrator's LM cracked but NTLM still unknown"
- Generate statistics: "X linked pairs have both cracked"
- Enable domain-filtered correlation analysis
lm_hash_metadata Table¶
Tracks partial crack status for LM hashes (mode 3000 only).
CREATE TABLE lm_hash_metadata (
    hash_id UUID PRIMARY KEY REFERENCES hashes(id) ON DELETE CASCADE,
    first_half_cracked BOOLEAN NOT NULL DEFAULT FALSE,
    second_half_cracked BOOLEAN NOT NULL DEFAULT FALSE,
    first_half_password VARCHAR(7), -- Max 7 chars (LM first half)
    second_half_password VARCHAR(7), -- Max 7 chars (LM second half)
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_lm_metadata_crack_status
    ON lm_hash_metadata(first_half_cracked, second_half_cracked);
CREATE INDEX idx_lm_metadata_hash_id ON lm_hash_metadata(hash_id);
Design Decisions:
- Hash-Specific: Only created for LM hashes (type 3000), zero impact on other types
- Separate Password Storage: Stores 7-char fragments, not full password (assembled on demand)
- Composite Index: (first_half_cracked, second_half_cracked) enables fast partial crack queries
- VARCHAR(7) Limit: Enforces LM's 7-character half constraint at database level
Use Cases:
- Track partial crack status: "First half cracked, second half pending"
- Analytics: "X LM hashes are partially cracked"
- Strategic intelligence: "Known half reduces keyspace by factor of 68 trillion"
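The three states a metadata row can be in (uncracked, partially cracked, fully cracked) can be sketched in Go as follows. This is a minimal illustration, not the project's code: the LMHalfState struct and its methods are hypothetical names that simply mirror the lm_hash_metadata columns.

```go
package main

import "fmt"

// LMHalfState mirrors the lm_hash_metadata columns.
// The type and method names are assumptions made for this sketch.
type LMHalfState struct {
	FirstCracked, SecondCracked   bool
	FirstPassword, SecondPassword string
}

// Status classifies a row as "uncracked", "partial", or "cracked".
func (s LMHalfState) Status() string {
	switch {
	case s.FirstCracked && s.SecondCracked:
		return "cracked"
	case s.FirstCracked || s.SecondCracked:
		return "partial"
	default:
		return "uncracked"
	}
}

// FullPassword joins the two fragments; ok is false until both halves are known.
func (s LMHalfState) FullPassword() (password string, ok bool) {
	if !s.FirstCracked || !s.SecondCracked {
		return "", false
	}
	return s.FirstPassword + s.SecondPassword, true
}

func main() {
	partial := LMHalfState{FirstCracked: true, FirstPassword: "PASSWOR"}
	fmt.Println(partial.Status())

	done := LMHalfState{
		FirstCracked: true, FirstPassword: "PASSWOR",
		SecondCracked: true, SecondPassword: "D1",
	}
	pw, ok := done.FullPassword()
	fmt.Println(done.Status(), pw, ok)
}
```

Keeping the fragments separate, as the table does, means the full password is only ever derived from the two stored halves rather than stored a second time.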
Upload Flow¶
Pwdump Format Detection¶
When a user uploads a hashlist file:
- File Selection: User selects file via upload dialog
- Automatic Detection: Frontend calls the /api/hashlists/detect-linked endpoint
- Backend Analysis:
  - Reads first 1000 lines (sample)
  - Checks for pwdump format: DOMAIN\user:RID:LM:NTLM:::
  - Counts LM hashes, NTLM hashes, blank LM hashes
- User Dialog: If both types found, present options:
  - "Upload as Single List"
  - "Create Linked Lists"
Detection Endpoint (POST /api/hashlists/detect-linked):
Request (multipart/form-data):
{
"file": <uploaded file>
}
Response (if both types found):
{
"has_both_types": true,
"lm_count": 1428,
"ntlm_count": 1500,
"blank_lm_count": 72
}
Response (if only one type):
{
"has_both_types": false
}
Design Decision: Detection is client-side initiated to provide immediate feedback without committing to upload.
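The backend analysis step can be sketched roughly as below. This is an illustrative approximation under stated assumptions, not the actual handler: detectLinked, isHex32, and DetectResult are hypothetical names, and the sketch assumes that lm_count excludes blank LM hashes (consistent with the example response, where 1428 + 72 = 1500).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// blankLM is the LM hash of an empty password.
const blankLM = "aad3b435b51404eeaad3b435b51404ee"

// DetectResult mirrors the JSON fields of the detection response.
type DetectResult struct {
	HasBothTypes bool `json:"has_both_types"`
	LMCount      int  `json:"lm_count"`
	NTLMCount    int  `json:"ntlm_count"`
	BlankLMCount int  `json:"blank_lm_count"`
}

// isHex32 reports whether s is exactly 32 hexadecimal characters.
func isHex32(s string) bool {
	if len(s) != 32 {
		return false
	}
	for _, c := range strings.ToLower(s) {
		if !strings.ContainsRune("0123456789abcdef", c) {
			return false
		}
	}
	return true
}

// detectLinked scans at most maxLines lines of pwdump-style input
// (DOMAIN\user:RID:LM:NTLM:::) and counts the hash types it finds.
func detectLinked(input string, maxLines int) DetectResult {
	var res DetectResult
	sc := bufio.NewScanner(strings.NewReader(input))
	for n := 0; n < maxLines && sc.Scan(); n++ {
		fields := strings.Split(strings.TrimSpace(sc.Text()), ":")
		if len(fields) < 4 || !isHex32(fields[2]) || !isHex32(fields[3]) {
			continue // not a pwdump line
		}
		if strings.EqualFold(fields[2], blankLM) {
			res.BlankLMCount++
		} else {
			res.LMCount++
		}
		res.NTLMCount++
	}
	res.HasBothTypes = res.LMCount > 0 && res.NTLMCount > 0
	return res
}

func main() {
	sample := `CORP\alice:1001:01fc5a6be7bc6929aad3b435b51404ee:5f4dcc3b5cd84097a65d1633f5c74f5e:::
CORP\bob:1002:aad3b435b51404eeaad3b435b51404ee:098f6bcd4621d373cade4e832627b4f6:::
`
	fmt.Printf("%+v\n", detectLinked(sample, 1000))
}
```

Sampling only the first lines keeps detection cheap, at the cost of missing LM hashes that appear only later in a very large file.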
Linked Hashlist Creation¶
When user chooses "Create Linked Lists":
- Upload Request: Frontend sends the create_linked=true parameter
- Hashlist Creation:
  - Create LM hashlist: {original_name}-LM (hash_type_id: 3000)
  - Create NTLM hashlist: {original_name}-NTLM (hash_type_id: 1000)
- Hashlist Link: Insert record into the linked_hashlists table
- Processing: Both hashlists enter processing queue independently
- Hash Linking: After processing completes, create individual hash-to-hash links
API Endpoint (POST /api/hashlists):
Parameters:
- name: Original hashlist name
- hash_type_id: (ignored if create_linked=true)
- client_id: Optional client association
- file: Pwdump format file
- create_linked: "true" to enable linked creation
Processing Flow:
1. Create LM hashlist record
2. Create NTLM hashlist record
3. Create linked_hashlists entry (lm_id, ntlm_id, 'lm_ntlm')
4. Enqueue LM hashlist for processing
5. Enqueue NTLM hashlist for processing
6. (Background) Process LM hashes
7. (Background) Process NTLM hashes
8. (Background) Create hash-to-hash links
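Steps 1 and 2, together with the per-line split that feeds both lists, might look like the sketch below. It is an assumption-laden illustration: HashlistSpec, linkedSpecs, and splitPwdumpLine are hypothetical helpers, and only the naming scheme and hash type IDs come from the flow described above.

```go
package main

import (
	"fmt"
	"strings"
)

// HashlistSpec is a hypothetical description of a hashlist to create.
type HashlistSpec struct {
	Name       string
	HashTypeID int
}

// linkedSpecs derives the LM and NTLM hashlist specs from the original name.
func linkedSpecs(originalName string) (lm, ntlm HashlistSpec) {
	lm = HashlistSpec{Name: originalName + "-LM", HashTypeID: 3000}
	ntlm = HashlistSpec{Name: originalName + "-NTLM", HashTypeID: 1000}
	return lm, ntlm
}

// splitPwdumpLine extracts domain, username, LM hash, and NTLM hash from
// one DOMAIN\user:RID:LM:NTLM::: line. The domain is empty if none is given.
func splitPwdumpLine(line string) (domain, user, lmHash, ntlmHash string, ok bool) {
	fields := strings.Split(strings.TrimSpace(line), ":")
	if len(fields) < 4 {
		return "", "", "", "", false
	}
	account := fields[0]
	if i := strings.Index(account, `\`); i >= 0 {
		domain, user = account[:i], account[i+1:]
	} else {
		user = account
	}
	return domain, user, fields[2], fields[3], true
}

func main() {
	lm, ntlm := linkedSpecs("corp-dump")
	fmt.Println(lm.Name, lm.HashTypeID, ntlm.Name, ntlm.HashTypeID)

	line := `CORP\alice:1001:01fc5a6be7bc6929aad3b435b51404ee:5f4dcc3b5cd84097a65d1633f5c74f5e:::`
	domain, user, lmHash, ntlmHash, ok := splitPwdumpLine(line)
	fmt.Println(domain, user, lmHash, ntlmHash, ok)
}
```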
Processing Pipeline¶
Hashlist Processing¶
Standard Processing (non-LM):
1. Read file line by line
2. Extract hash values and metadata
3. Batch insert into hashes table
4. Create hashlist_hashes join entries
LM-Specific Processing:
1. Read file line by line
2. Extract LM hash (32 hex chars)
3. Skip blank LM constant: If hash equals aad3b435b51404eeaad3b435b51404ee, skip line
4. Store full 32-char hash in hashes.hash_value
5. Create lm_hash_metadata entry (all fields FALSE/NULL initially)
6. Create hashlist_hashes join entry
Code Location: backend/internal/processor/hashlist_processor.go
Blank LM Filtering Logic:
if hashType.ID == 3000 {
	// Compare case-insensitively by normalizing to upper case
	upperHashValue := strings.ToUpper(hashValue)
	if upperHashValue == "AAD3B435B51404EEAAD3B435B51404EE" {
		debug.Debug("[Processor:%d] Line %d: Skipping blank LM hash", hashlistID, lineNumber)
		totalHashes-- // Don't count blank LM hashes
		continue
	}
}
Hash-to-Hash Linking¶
After both linked hashlists complete processing:
- Retrieve Hashes: Get all hashes from both hashlists with username/domain
- Build NTLM Map: map[string]*models.Hash keyed by {domain}\{username}
- Match LM to NTLM: For each LM hash, look up the NTLM hash by username/domain
- Batch Insert Links: Create linked_hashes entries for all matches
Matching Logic:
func makeUserDomainKey(username, domain *string) string {
	user := ""
	if username != nil {
		user = *username
	}
	dom := ""
	if domain != nil {
		dom = *domain
	}
	// Prefix the domain only when one is present: DOMAIN\user
	if dom != "" {
		return fmt.Sprintf("%s\\%s", dom, user)
	}
	return user
}
Batch Linking:
INSERT INTO linked_hashes (hash_id_1, hash_id_2, link_type)
VALUES
($1, $2, 'lm_ntlm'),
($3, $4, 'lm_ntlm'),
...
ON CONFLICT (hash_id_1, hash_id_2) DO NOTHING;
Design Decision: Links created by username/domain match, not by RID, to handle domain migrations and account renames.
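The map-based matching described in this section can be sketched as follows. The Hash struct here is a simplified stand-in for models.Hash, and linkHashes, Link, and ptr are hypothetical names; skipping entries with an empty key is an assumption of this sketch, made so that hashes without a username are not all linked to each other.

```go
package main

import "fmt"

// Hash is a simplified stand-in for models.Hash (fields are assumptions).
type Hash struct {
	ID       string
	Username *string
	Domain   *string
}

// Link pairs an LM hash ID with the NTLM hash ID of the same account.
type Link struct{ LMID, NTLMID string }

// makeUserDomainKey builds "DOMAIN\user", or just "user" without a domain.
func makeUserDomainKey(username, domain *string) string {
	user, dom := "", ""
	if username != nil {
		user = *username
	}
	if domain != nil {
		dom = *domain
	}
	if dom != "" {
		return fmt.Sprintf("%s\\%s", dom, user)
	}
	return user
}

// linkHashes indexes NTLM hashes by account key, then looks up each LM hash.
func linkHashes(lm, ntlm []Hash) []Link {
	byKey := make(map[string]*Hash, len(ntlm))
	for i := range ntlm {
		if key := makeUserDomainKey(ntlm[i].Username, ntlm[i].Domain); key != "" {
			byKey[key] = &ntlm[i]
		}
	}
	var links []Link
	for _, h := range lm {
		key := makeUserDomainKey(h.Username, h.Domain)
		if match, ok := byKey[key]; ok && key != "" {
			links = append(links, Link{LMID: h.ID, NTLMID: match.ID})
		}
	}
	return links
}

// ptr returns a pointer to s (helper for building test data).
func ptr(s string) *string { return &s }

func main() {
	lm := []Hash{{ID: "lm-1", Username: ptr("alice"), Domain: ptr("CORP")}}
	ntlm := []Hash{{ID: "nt-1", Username: ptr("alice"), Domain: ptr("CORP")}}
	fmt.Println(linkHashes(lm, ntlm))
}
```

Building the map once makes the match linear in the number of hashes instead of comparing every LM hash against every NTLM hash.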
Agent Download¶
Standard Hash Download¶
For most hash types, agents download via GET /api/hashlists/{id}/uncracked:
Response (text/plain):
5f4dcc3b5cd84097a65d1633f5c74f5e
098f6bcd4621d373cade4e832627b4f6
1a1dc91c907325c69271ddf0c944bc72
...
LM Hash Half Streaming¶
For LM hashlists (hash_type_id 3000), special processing occurs:
Backend Processing (routes/hashlist.go):
if hashlist.HashTypeID == 3000 {
	// Stream unique 16-char halves instead of full 32-char hashes
	err = h.hashRepo.StreamUncrackedLMHashHalvesForHashlist(ctx, hashlist.ID, func(hashHalf string) error {
		// Write one 16-char half per line; return write errors so
		// streaming stops if the client disconnects
		_, writeErr := fmt.Fprintln(w, hashHalf)
		return writeErr
	})
}
SQL Query (repository/hash_repository.go):
SELECT DISTINCT half
FROM (
    SELECT SUBSTRING(h.hash_value, 1, 16) AS half
    FROM hashes h
    INNER JOIN hashlist_hashes hh ON h.id = hh.hash_id
    WHERE hh.hashlist_id = $1 AND h.is_cracked = FALSE
    UNION
    SELECT SUBSTRING(h.hash_value, 17, 16) AS half
    FROM hashes h
    INNER JOIN hashlist_hashes hh ON h.id = hh.hash_id
    WHERE hh.hashlist_id = $1 AND h.is_cracked = FALSE
) AS halves
ORDER BY half
Example Output:
01FC5A6BE7BC6929 ← First half of hash 1
5F4DCC3B5CD84097 ← First half of hash 2
AAD3B435B51404EE ← Blank constant (appears once despite multiple occurrences)
C3B435B51404EE89 ← Second half of hash 1
...
Why This Approach:
- Hashcat Requirement: Mode 3000 expects 16-char halves, not 32-char full hashes
- Deduplication: DISTINCT ensures common halves appear only once
- Efficiency: Blank constant aad3b435b51404ee sent once instead of hundreds of times
- Parallel Capability: Agents can crack different halves simultaneously
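For illustration, the same split-and-deduplicate step performed by the SQL query can be written in memory in Go. This is only a sketch of the idea, not how the backend streams results; uniqueLMHalves is a hypothetical function, and normalizing to upper case is an assumption made here so that mixed-case input deduplicates correctly.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// uniqueLMHalves splits each 32-char LM hash into two 16-char halves and
// returns the distinct halves in sorted order.
func uniqueLMHalves(hashes []string) []string {
	seen := make(map[string]struct{})
	for _, h := range hashes {
		if len(h) != 32 {
			continue // ignore malformed entries
		}
		h = strings.ToUpper(h)
		seen[h[:16]] = struct{}{}
		seen[h[16:]] = struct{}{}
	}
	halves := make([]string, 0, len(seen))
	for half := range seen {
		halves = append(halves, half)
	}
	sort.Strings(halves)
	return halves
}

func main() {
	hashes := []string{
		"01FC5A6BE7BC6929AAD3B435B51404EE",
		"5F4DCC3B5CD84097AAD3B435B51404EE",
	}
	for _, half := range uniqueLMHalves(hashes) {
		fmt.Println(half)
	}
}
```

Two short passwords share the blank second half, so four halves collapse to three lines of output.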
Crack Handling¶
LM Partial Crack Flow¶
When an agent reports a cracked LM hash half:
- Agent Reports Crack: Sends 16-char hash half + password to backend
- Identify Full Hashes: Find all 32-char LM hashes containing this 16-char half
- Determine Position: Check if half matches LEFT(hash, 16) or RIGHT(hash, 16)
- Update Metadata:
  - If first half: Set first_half_cracked = TRUE, first_half_password = <password>
  - If second half: Set second_half_cracked = TRUE, second_half_password = <password>
- Check Completion: If both halves now cracked, assemble full password
- Mark Complete: If both halves cracked, update hashes.is_cracked = TRUE
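The position check in this flow can be sketched as a small Go helper. halfPositions is a hypothetical name, and the case-insensitive comparison is an assumption; note that a half can match both positions when the two halves of a hash are identical.

```go
package main

import (
	"fmt"
	"strings"
)

// halfPositions reports whether a 16-char half equals the first and/or
// second half of a 32-char LM hash (compared case-insensitively).
func halfPositions(fullHash, half string) (first, second bool) {
	if len(fullHash) != 32 || len(half) != 16 {
		return false, false
	}
	first = strings.EqualFold(fullHash[:16], half)
	second = strings.EqualFold(fullHash[16:], half)
	return first, second
}

func main() {
	full := "01FC5A6BE7BC6929AAD3B435B51404EE"
	first, second := halfPositions(full, "aad3b435b51404ee")
	fmt.Println(first, second)
}
```

A caller would update the first-half columns, the second-half columns, or both, depending on the two booleans returned.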
Repository Method (repository/lm_hash_repository.go):
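func (r *LMHashRepository) UpdateLMHalfCrack(ctx context.Context, tx *sql.Tx, hashID uuid.UUID, halfPosition string, password string) error {
	// halfPosition must be "first" or "second". Validate it before building
	// column names so that no arbitrary text reaches the SQL string.
	if halfPosition != "first" && halfPosition != "second" {
		return fmt.Errorf("invalid half position %q", halfPosition)
	}
	crackedCol := halfPosition + "_half_cracked"   // e.g. first_half_cracked
	passwordCol := halfPosition + "_half_password" // e.g. first_half_password
	query := fmt.Sprintf(`
		INSERT INTO lm_hash_metadata (hash_id, %[1]s, %[2]s, updated_at)
		VALUES ($1, TRUE, $2, $3)
		ON CONFLICT (hash_id) DO UPDATE
		SET %[1]s = TRUE, %[2]s = $2, updated_at = $3
	`, crackedCol, passwordCol)
	// ...
}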
Full Password Assembly:
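func (r *LMHashRepository) CheckAndFinalizeLMCrack(ctx context.Context, tx *sql.Tx, hashID uuid.UUID) (bool, string, error) {
	// Check if both halves are cracked
	query := `
		SELECT (first_half_cracked AND second_half_cracked) AS both_cracked,
		       first_half_password, second_half_password
		FROM lm_hash_metadata
		WHERE hash_id = $1
	`
	var bothCracked bool
	var firstHalfPwd, secondHalfPwd sql.NullString // password columns are nullable
	err := tx.QueryRowContext(ctx, query, hashID).Scan(&bothCracked, &firstHalfPwd, &secondHalfPwd)
	if err == sql.ErrNoRows {
		return false, "", nil // no metadata row, so nothing to finalize
	}
	if err != nil {
		return false, "", err
	}
	if bothCracked {
		// Join the two 7-character fragments into the full password
		fullPassword := firstHalfPwd.String + secondHalfPwd.String
		return true, fullPassword, nil
	}
	return false, "", nil
}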
Cross-Hashlist Propagation¶
LM hash cracks propagate across all hashlists (standard behavior):
- Crack Reported: Agent cracks 16-char LM half
- Find All Matching: Identify all 32-char LM hashes containing this half
- Update All: Update metadata for every matching hash
- Regenerate Files: Regenerate all affected hashlist files
- Notify Agents: Mark agent copies as outdated
This ensures that cracking one LM half benefits all hashlists containing hashes with that half.
Analytics Integration¶
Windows Hash Statistics¶
Overview Count Calculation:
-- Get effective count (linked pairs count as ONE)
SELECT
    COUNT(DISTINCT CASE
        WHEN lh.id IS NOT NULL THEN
            CASE WHEN h.hash_type_id = 3000 THEN lh.id ELSE NULL END
        ELSE h.id
    END) AS total_windows,
    COUNT(DISTINCT CASE
        WHEN h.is_cracked AND lh.id IS NOT NULL THEN
            CASE WHEN h.hash_type_id = 3000 THEN lh.id ELSE NULL END
        ELSE CASE WHEN h.is_cracked THEN h.id ELSE NULL END
    END) AS cracked_windows
FROM hashes h
LEFT JOIN linked_hashes lh ON (h.id = lh.hash_id_1 OR h.id = lh.hash_id_2)
    AND lh.link_type = 'lm_ntlm'
WHERE ...
Individual Hash Type Counts:
- Use raw counts (don't adjust for linking) to show actual hash quantities
- Example: 1500 NTLM hashes and 1428 LM hashes displayed separately
Linked Pair Count:
Linked Hash Correlation¶
Query Structure:
SELECT
COUNT(*) AS total_pairs,
COUNT(CASE WHEN lm.is_cracked AND ntlm.is_cracked THEN 1 END) AS both_cracked,
COUNT(CASE WHEN NOT lm.is_cracked AND ntlm.is_cracked THEN 1 END) AS only_ntlm,
COUNT(CASE WHEN lm.is_cracked AND NOT ntlm.is_cracked THEN 1 END) AS only_lm,
COUNT(CASE WHEN NOT lm.is_cracked AND NOT ntlm.is_cracked THEN 1 END) AS neither
FROM linked_hashes lh
INNER JOIN hashes lm ON lh.hash_id_1 = lm.id
INNER JOIN hashes ntlm ON lh.hash_id_2 = ntlm.id
WHERE lh.link_type = 'lm_ntlm' AND ...
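The four buckets produced by this query can be mirrored in Go to make the classification explicit. This is a sketch for illustration only; Pair, Correlation, and correlate are hypothetical names, and the real statistics come from the SQL above.

```go
package main

import "fmt"

// Pair holds the crack status of one linked LM/NTLM pair.
type Pair struct{ LMCracked, NTLMCracked bool }

// Correlation mirrors the columns returned by the correlation query.
type Correlation struct {
	Total, Both, OnlyNTLM, OnlyLM, Neither int
}

// correlate places every pair into exactly one of the four buckets.
func correlate(pairs []Pair) Correlation {
	c := Correlation{Total: len(pairs)}
	for _, p := range pairs {
		switch {
		case p.LMCracked && p.NTLMCracked:
			c.Both++
		case p.NTLMCracked:
			c.OnlyNTLM++
		case p.LMCracked:
			c.OnlyLM++
		default:
			c.Neither++
		}
	}
	return c
}

func main() {
	pairs := []Pair{{true, true}, {false, true}, {true, false}, {false, false}}
	fmt.Printf("%+v\n", correlate(pairs))
}
```

Because the buckets are mutually exclusive, both_cracked + only_ntlm + only_lm + neither always equals total_pairs, which is a useful sanity check on the query result.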
LM Partial Crack Query¶
Find Partially Cracked LM Hashes:
SELECT
h.id, h.username, h.domain,
lm.first_half_cracked, lm.first_half_password,
lm.second_half_cracked, lm.second_half_password,
hl.name AS hashlist_name
FROM lm_hash_metadata lm
INNER JOIN hashes h ON lm.hash_id = h.id
INNER JOIN hashlist_hashes hlh ON h.id = hlh.hash_id
INNER JOIN hashlists hl ON hlh.hashlist_id = hl.id
WHERE (lm.first_half_cracked OR lm.second_half_cracked)
AND NOT (lm.first_half_cracked AND lm.second_half_cracked)
AND hlh.hashlist_id = ANY($1)
ORDER BY h.username
LIMIT 50;
Performance Considerations¶
Index Strategy¶
Critical Indexes:
1. idx_linked_hashlists_id2: Enables bidirectional hashlist lookup
2. idx_linked_hashes_id2: Enables bidirectional hash lookup
3. idx_lm_metadata_crack_status: Fast partial crack queries
4. idx_lm_metadata_hash_id: Foreign key lookup (hash_id is also the primary key, which PostgreSQL already indexes, so this index duplicates the primary key index)
Query Optimization:
- Composite index on (first_half_cracked, second_half_cracked) enables single-scan partial crack detection
- The UNION in the LM half streaming query already removes duplicate halves across both positions, so the outer DISTINCT adds no extra filtering
Memory Usage¶
LM Half Streaming:
- No full dataset loaded into memory
- Cursor-based streaming from database
- Backpressure via HTTP chunked transfer encoding
- Typical memory: <100MB for 1M+ hashes
Hash Linking:
- In-memory map: map[string]*models.Hash for NTLM hashes
- Typical size: ~200 bytes per hash × count
- Example: 100K hashes = ~20MB
- Batch insert: 1000 links at a time to limit transaction size
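Batching the link inserts could be done along the lines of the sketch below, which chunks the pairs and builds one multi-row INSERT per chunk in the same shape as the batch linking statement shown earlier. The helper names batches and buildLinkInsert are hypothetical, and hash IDs are plain strings here for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// batches splits pairs into chunks of at most size elements.
func batches(pairs [][2]string, size int) [][][2]string {
	var out [][][2]string
	for size > 0 && len(pairs) > 0 {
		n := size
		if len(pairs) < n {
			n = len(pairs)
		}
		out = append(out, pairs[:n])
		pairs = pairs[n:]
	}
	return out
}

// buildLinkInsert builds one multi-row INSERT with numbered placeholders
// and returns the statement together with its flattened arguments.
func buildLinkInsert(pairs [][2]string) (string, []any) {
	var sb strings.Builder
	sb.WriteString("INSERT INTO linked_hashes (hash_id_1, hash_id_2, link_type) VALUES ")
	args := make([]any, 0, len(pairs)*2)
	for i, p := range pairs {
		if i > 0 {
			sb.WriteString(", ")
		}
		fmt.Fprintf(&sb, "($%d, $%d, 'lm_ntlm')", 2*i+1, 2*i+2)
		args = append(args, p[0], p[1])
	}
	sb.WriteString(" ON CONFLICT (hash_id_1, hash_id_2) DO NOTHING")
	return sb.String(), args
}

func main() {
	pairs := [][2]string{{"lm-1", "nt-1"}, {"lm-2", "nt-2"}, {"lm-3", "nt-3"}}
	for _, batch := range batches(pairs, 2) {
		query, args := buildLinkInsert(batch)
		fmt.Println(query, args)
	}
}
```

With a batch size of 1000, each statement carries 2000 bound parameters, which stays well under PostgreSQL's limit of 65535 parameters per statement.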
Scalability¶
Tested Performance:
- Pwdump files up to 1M lines: <30 seconds processing
- Hash linking 100K pairs: <5 seconds
- Analytics with linked pairs: <10 seconds for 1M+ hashes
- LM half streaming: Line-speed (network bound, not CPU/DB bound)
Future Extensibility¶
The generic design enables future enhancements:
Potential Link Types:
- sha1_ntlm: Link SHA1 and NTLM hashes for same user (multi-platform analysis)
- old_new: Link old and new password hashes for password change analysis
- service_user: Link service account hashes across systems
Metadata Tables (similar to lm_hash_metadata, could add):
- kerberos_metadata: etype information, ticket details
- netntlm_metadata: challenge/response pair tracking
- custom_metadata: User-defined fields for special analyses
Analytics Extensions:
- Password aging analysis (old_new links)
- Cross-platform password reuse (sha1_ntlm links)
- Service account proliferation tracking
Troubleshooting¶
Common Issues¶
Issue: Hash links not created after upload
- Cause: Username/domain mismatch between LM and NTLM entries
- Solution: Verify username extraction logic handles special characters
- Check: SELECT h.username, h.domain FROM hashes h INNER JOIN hashlist_hashes hh ON h.id = hh.hash_id WHERE hh.hashlist_id IN (...)
Issue: Partial cracks not appearing in analytics
- Cause: lm_hash_metadata entries not created during processing
- Solution: Verify LM hashlist has hash_type_id = 3000
- Check: SELECT COUNT(*) FROM lm_hash_metadata WHERE hash_id IN (...)
Issue: Duplicate links created
- Cause: unique_hash_link rejects an exact repeat of (hash_id_1, hash_id_2), but a reversed pair inserted through manual SQL is not rejected
- Solution: Exact duplicates are prevented automatically; for reversed pairs, find and delete rows where the LM hash is not in hash_id_1 (the correlation query expects it there)
Issue: Analytics show wrong linked pair count
- Cause: May be counting hashlist links instead of hash links
- Solution: Verify query uses linked_hashes, not linked_hashlists
Debugging Queries¶
Check Hashlist Linkage:
Check Hash Linkage:
Find Orphaned Metadata:
SELECT lm.* FROM lm_hash_metadata lm
LEFT JOIN hashes h ON lm.hash_id = h.id
WHERE h.id IS NULL;
-- Should return 0 rows (CASCADE DELETE should prevent orphans)
Verify LM Half Streaming:
# Download LM hashlist, count unique halves
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:8080/api/hashlists/{id}/uncracked | sort -u | wc -l