Harness Trace Learning - Technical Reference

Deep dive: Architecture, algorithms, and implementation details.

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                     Harness Evolution Run                        │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Context Capture (Phase 0)                                       │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ build_evolution_context()                                 │   │
│  │ - Infer workflow type                                     │   │
│  │ - Aggregate gap categories                                │   │
│  │ - Extract changed paths                                   │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ record_evolution_outcome()                                │   │
│  │ - Append to history.jsonl                                 │   │
│  │ - Include full context + provenance                       │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Pattern Extraction (Phase 1)                                    │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ load_evolution_history()                                  │   │
│  │ - Parse history.jsonl                                     │   │
│  │ - Filter by success_rate >= 0.8                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ detect_common_patterns()                                  │   │
│  │ - Group by gap category combinations                      │   │
│  │ - Find patterns with 3+ occurrences                       │   │
│  │ - Extract patch order consensus                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ generate_playbook_candidates()                            │   │
│  │ - Build playbook with strategy                            │   │
│  │ - Add full provenance                                     │   │
│  │ - Calculate confidence score                              │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ save_playbook()                                           │   │
│  │ - Write to docs/fitness/playbooks/*.json                  │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│  Runtime Integration (Phase 2)                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ load_playbooks_for_task()                                 │   │
│  │ - Read docs/fitness/playbooks/*.json                      │   │
│  │ - Filter by taskType                                      │   │
│  │ - Sort by confidence                                      │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ find_matching_playbook()                                  │   │
│  │ - Try exact match first                                   │   │
│  │ - Fallback to fuzzy match (>= 50% overlap)                │   │
│  │ - Select highest weighted_score                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ display_preflight_guidance()                              │   │
│  │ - Show match type (exact/partial)                         │   │
│  │ - Display recommended patch order                         │   │
│  │ - Show anti-patterns                                      │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              ↓                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ reorder_patches_by_playbook()                             │   │
│  │ - Sort patches by learned order                           │   │
│  │ - Preserve unmatched patches at end                       │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ↓
                      Apply Patches (optimized)
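
The first Phase 2 box (filter by taskType, sort by confidence) can be sketched without any file I/O. This is only an illustration: PlaybookSummary and select_for_task are hypothetical names, and highest-confidence-first ordering is an assumption, since the diagram does not state the sort direction.

```rust
/// Hypothetical, trimmed-down stand-in for a loaded playbook.
#[derive(Debug, Clone, PartialEq)]
struct PlaybookSummary {
    id: String,
    task_type: String,
    confidence: f64,
}

/// Keeps playbooks for `task_type` and orders them by confidence.
/// Assumption: highest confidence first.
fn select_for_task(mut playbooks: Vec<PlaybookSummary>, task_type: &str) -> Vec<PlaybookSummary> {
    playbooks.retain(|pb| pb.task_type == task_type);
    // total_cmp gives a total order on f64, so no unwrap is needed.
    playbooks.sort_by(|a, b| b.confidence.total_cmp(&a.confidence));
    playbooks
}
```

With three playbooks where one has a different task type, only the two matching ones are returned, most confident first.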

Core Algorithms

1. Pattern Detection

File: crates/routa-cli/src/commands/harness/engineering/learning.rs

Function: detect_common_patterns()

pub fn detect_common_patterns(
    entries: &[EvolutionHistory],
    min_success_rate: f64,
) -> Vec<CommonPattern> {
    // Step 1: Filter successful runs
    let successful: Vec<&EvolutionHistory> = entries
        .iter()
        .filter(|e| e.success_rate >= min_success_rate)  // Default: 0.8
        .collect();
    
    if successful.len() < 3 {
        return Vec::new();  // Need at least 3 runs
    }
    
    // Step 2: Group by gap patterns
    let mut gap_pattern_groups: HashMap<String, Vec<&EvolutionHistory>> = HashMap::new();
    
    for entry in successful.iter() {
        if let Some(ref categories) = entry.gap_categories {
            let mut sorted_categories = categories.clone();
            sorted_categories.sort();
            let pattern_key = sorted_categories.join(",");  // e.g., "gap_a,gap_b"
            
            gap_pattern_groups
                .entry(pattern_key)
                .or_default()
                .push(entry);
        }
    }
    
    // Step 3: Find patterns appearing 3+ times
    gap_pattern_groups
        .into_iter()
        .filter(|(_, group)| group.len() >= 3)  // Minimum 3 occurrences
        .map(|(pattern, group)| CommonPattern {
            gap_categories: pattern.split(',').map(|s| s.to_string()).collect(),
            occurrence_count: group.len(),
            avg_success_rate: group.iter().map(|e| e.success_rate).sum::<f64>() 
                              / group.len() as f64,
            preferred_patch_order: extract_patch_order_consensus(&group),
        })
        .collect()
}

Complexity: O(n) in the number of history entries n, treating the number of gap categories and applied patches per entry as a small bounded constant

Thresholds:

Minimum success rate: 0.8 (80%)
Minimum occurrences: 3 runs
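
The grouping key and the occurrence threshold can be tried in isolation. The sketch below works on plain string slices instead of EvolutionHistory; pattern_key and frequent_patterns are hypothetical helper names used only for illustration.

```rust
use std::collections::HashMap;

/// Builds an order-insensitive key from a run's gap categories.
fn pattern_key(categories: &[&str]) -> String {
    let mut sorted: Vec<&str> = categories.to_vec();
    sorted.sort();
    sorted.join(",")
}

/// Counts runs per normalized key and keeps keys seen at least
/// `min_occurrences` times.
fn frequent_patterns(runs: &[Vec<&str>], min_occurrences: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for run in runs {
        *counts.entry(pattern_key(run)).or_default() += 1;
    }
    let mut frequent: Vec<(String, usize)> = counts
        .into_iter()
        .filter(|(_, n)| *n >= min_occurrences)
        .collect();
    frequent.sort(); // HashMap order is arbitrary; sort for stable output
    frequent
}
```

Because the categories are sorted before joining, ["gap_b", "gap_a"] and ["gap_a", "gap_b"] land in the same group. A category name that itself contains a comma would not round-trip through this key.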

2. Patch Order Consensus

Function: extract_patch_order_consensus()

fn extract_patch_order_consensus(entries: &[&EvolutionHistory]) -> Vec<String> {
    // Count patch positions across all runs
    let mut patch_positions: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    
    for entry in entries {
        for (idx, patch_id) in entry.patches_applied.iter().enumerate() {
            patch_positions
                .entry(patch_id.clone())
                .or_default()
                .push(idx);
        }
    }
    
    // Calculate average position for each patch
    let mut patches_with_avg: Vec<(String, f64)> = patch_positions
        .into_iter()
        .map(|(patch, positions)| {
            let avg = positions.iter().sum::<usize>() as f64 / positions.len() as f64;
            (patch, avg)
        })
        .collect();
    
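    // Sort by average position. The sort is stable and BTreeMap iterates in key
    // order, so patches with equal averages stay in alphabetical order.
    patches_with_avg.sort_by(|a, b| a.1.total_cmp(&b.1));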
    
    patches_with_avg.into_iter().map(|(patch, _)| patch).collect()
}

Example:

Run 1: [patch.A (pos 0), patch.B (pos 1), patch.C (pos 2)]
Run 2: [patch.A (pos 0), patch.C (pos 1), patch.B (pos 2)]
Run 3: [patch.A (pos 0), patch.B (pos 1), patch.C (pos 2)]

Average positions:
- patch.A: (0+0+0)/3 = 0.0
- patch.B: (1+2+1)/3 = 1.33
- patch.C: (2+1+2)/3 = 1.67

Result: [patch.A, patch.B, patch.C]
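
The worked example can be reproduced with a std-only sketch that takes plain patch ID lists instead of EvolutionHistory entries. consensus_order is a hypothetical name; the logic follows the averaging approach described above.

```rust
use std::collections::BTreeMap;

/// Orders patch IDs by their average position across runs.
/// Ties resolve alphabetically: BTreeMap iterates by key and the sort is stable.
fn consensus_order(runs: &[Vec<&str>]) -> Vec<String> {
    let mut positions: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    for run in runs {
        for (idx, patch) in run.iter().enumerate() {
            positions.entry(*patch).or_default().push(idx);
        }
    }
    let mut averaged: Vec<(&str, f64)> = positions
        .into_iter()
        .map(|(patch, pos)| {
            let avg = pos.iter().sum::<usize>() as f64 / pos.len() as f64;
            (patch, avg)
        })
        .collect();
    averaged.sort_by(|a, b| a.1.total_cmp(&b.1));
    averaged.into_iter().map(|(patch, _)| patch.to_string()).collect()
}
```

One property worth knowing: the average ignores how many runs contain a patch, so a patch seen once at position 0 ranks level with a patch seen at position 0 in every run.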

3. Fuzzy Matching

Function: find_matching_playbook()

pub fn find_matching_playbook<'a>(
    playbooks: &'a [PlaybookCandidate],
    gaps: &[HarnessEngineeringGap],
) -> Option<&'a PlaybookCandidate> {
    // Extract current gap categories
    let mut current_categories: Vec<String> = gaps
        .iter()
        .map(|g| g.category.clone())
        .collect();
    current_categories.sort();
    current_categories.dedup();
    
    // Step 1: Try exact match
    if let Some(exact) = playbooks.iter().find(|pb| {
        let mut playbook_pattern = pb.strategy.gap_patterns.clone();
        playbook_pattern.sort();
        playbook_pattern == current_categories
    }) {
        return Some(exact);
    }
    
    // Step 2: Fuzzy matching
    let mut best_match: Option<(&PlaybookCandidate, f64)> = None;
    
    for playbook in playbooks {
        // Calculate overlap
        let overlap_count = current_categories
            .iter()
            .filter(|cat| playbook.strategy.gap_patterns.contains(cat))
            .count();
        
        if overlap_count == 0 {
            continue;
        }
        
        // Overlap ratio
        let total_unique = current_categories.len().max(playbook.strategy.gap_patterns.len());
        let overlap_ratio = overlap_count as f64 / total_unique as f64;
        
        // Weighted score
        let weighted_score = overlap_ratio * playbook.confidence;
        
        // Threshold: >= 50% overlap; keep the highest weighted score seen so far
        if overlap_ratio >= 0.5
            && best_match.map_or(true, |(_, best)| weighted_score > best)
        {
            best_match = Some((playbook, weighted_score));
        }
    }
    
    best_match.map(|(pb, _)| pb)
}

Scoring:

weighted_score = (overlap_count / total_unique) * confidence

Example:
Playbook: ["gap_a", "gap_b"], confidence: 0.95
Current:  ["gap_a", "gap_b", "gap_c"]
Overlap:  2
Total:    max(2, 3) = 3
Ratio:    2/3 = 0.667
Score:    (2/3) * 0.95 ≈ 0.633
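
The scoring arithmetic can be checked with a small standalone function. This is a sketch on plain string slices, not the function from learning.rs; score_playbook is a hypothetical name and it returns None wherever the fuzzy step would skip a playbook.

```rust
/// Returns (overlap_ratio, weighted_score) for one playbook, or None when
/// there is no overlap or the ratio is below the 50% threshold.
fn score_playbook(
    playbook_patterns: &[&str],
    current: &[&str],
    confidence: f64,
) -> Option<(f64, f64)> {
    let overlap = current
        .iter()
        .filter(|cat| playbook_patterns.contains(*cat))
        .count();
    // Denominator is the larger of the two set sizes, as in the formula above.
    let denom = current.len().max(playbook_patterns.len());
    if overlap == 0 || denom == 0 {
        return None;
    }
    let ratio = overlap as f64 / denom as f64;
    if ratio < 0.5 {
        return None;
    }
    Some((ratio, ratio * confidence))
}
```

Because the score multiplies ratio by confidence, a playbook with less overlap can still win if its confidence is sufficiently higher.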

Data Schemas

EvolutionHistory

File: crates/routa-cli/src/commands/harness/engineering/types.rs

#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct EvolutionHistory {
    // Core metadata
    pub timestamp: String,              // RFC3339 format
    pub repo_root: String,              // Absolute path
    pub mode: String,                   // "auto-apply" | "evaluation"

    // Trace linking (Phase 0)
    pub session_id: Option<String>,     // Links to .routa/traces/

    // Task fingerprint
    pub task_type: Option<String>,      // "harness_evolution"
    pub workflow: Option<String>,       // "bootstrap" | "auto-apply" | "evaluation"
    pub trigger: Option<String>,        // "manual" | "automation" | "ci"

    // Evidence bundle
    pub gaps_detected: Option<usize>,
    pub gap_categories: Option<Vec<String>>,
    pub changed_paths: Option<Vec<String>>,

    // Outcome
    pub patches_applied: Vec<String>,
    pub patches_failed: Vec<String>,
    pub success_rate: f64,              // 0.0 to 1.0

    // Failure context
    pub rollback_reason: Option<String>,
    pub error_messages: Option<Vec<String>>,
}

Serialization: JSON with camelCase field names

PlaybookCandidate

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct PlaybookCandidate {
    pub id: String,                     // Unique identifier
    pub task_type: String,              // "harness_evolution"
    pub confidence: f64,                // 0.0 to 1.0
    pub strategy: PlaybookStrategy,
    pub provenance: PlaybookProvenance,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct PlaybookStrategy {
    pub preferred_patch_order: Vec<String>,
    pub gap_patterns: Vec<String>,
    pub anti_patterns: Vec<AntiPattern>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct PlaybookProvenance {
    pub source_runs: Vec<String>,       // Timestamps
    pub success_rate: f64,              // Average across source runs
    pub evidence_count: usize,          // Number of source runs
}

File Formats

history.jsonl

Format: JSONL (JSON Lines)

One JSON object per line
Append-only (preserves history)
No enclosing array and no commas between records

Example:

{"timestamp":"2026-04-06T01:29:43Z","repoRoot":"/path/to/repo","mode":"auto-apply","taskType":"harness_evolution","workflow":"bootstrap","trigger":"manual","gapsDetected":2,"gapCategories":["missing_governance_gate","missing_execution_surface"],"changedPaths":[".github/CODEOWNERS","docs/harness/build.yml"],"patchesApplied":["patch.create_codeowners","bootstrap.synthesize_build_yml"],"patchesFailed":[],"successRate":1.0}
{"timestamp":"2026-04-06T02:15:22Z","repoRoot":"/path/to/repo","mode":"auto-apply","taskType":"harness_evolution","workflow":"bootstrap","trigger":"manual","gapsDetected":2,"gapCategories":["missing_governance_gate","missing_execution_surface"],"changedPaths":[".github/CODEOWNERS","docs/harness/build.yml"],"patchesApplied":["patch.create_codeowners","bootstrap.synthesize_build_yml"],"patchesFailed":[],"successRate":1.0}

Parsing:

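use std::fs;

// Runs inside a function whose error type can hold both io and serde_json errors
let content = fs::read_to_string("history.jsonl")?;
for line in content.lines() {
    if line.trim().is_empty() { continue; }  // Skip blank lines
    let entry: EvolutionHistory = serde_json::from_str(line)?;
    // Process entry
}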
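
The append-only property can be kept with std-only file handling. A minimal sketch, assuming the record has already been serialized to a single-line JSON string; append_jsonl_line is a hypothetical helper, not a function from learning.rs.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends one already-serialized JSON object as a single line.
fn append_jsonl_line(path: &Path, json_line: &str) -> io::Result<()> {
    // A JSONL record must not contain raw newlines; reject it rather than
    // corrupt the file.
    if json_line.contains('\n') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "record spans multiple lines",
        ));
    }
    // create(true) makes the file on first use; append(true) never truncates.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", json_line)
}
```

Opening in append mode means existing history is never rewritten, which matches the "preserves history" note above.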

playbook-*.json

Format: JSON (pretty-printed)

Single JSON object per file
One file per playbook
Named: {task-type}-{gap-pattern}.json

Example:

{
  "id": "harness-evolution-missing-governance",
  "taskType": "harness_evolution",
  "confidence": 0.95,
  "strategy": {
    "preferredPatchOrder": [
      "patch.create_codeowners",
      "patch.create_dependabot"
    ],
    "gapPatterns": [
      "missing_governance_gate"
    ],
    "antiPatterns": []
  },
  "provenance": {
    "sourceRuns": [
      "2026-04-06T01:29:43Z",
      "2026-04-06T02:15:22Z",
      "2026-04-07T10:30:15Z"
    ],
    "successRate": 0.95,
    "evidenceCount": 3
  }
}

Performance Characteristics

Pattern Detection

Time Complexity: O(n + m log m)

n = number of history entries
m = number of unique gap patterns

Space Complexity: O(n)

Stores all successful runs in memory

Bottleneck: JSONL parsing (I/O bound)

Optimization: Could add indexing for large history files (>10k entries)

Playbook Matching

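Time Complexity: O(p * g * k)

p = number of playbooks (typically < 10)
g = number of distinct current gap categories (typically < 50)
k = gap patterns per playbook (each overlap check is a linear contains() scan)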

Space Complexity: O(p)

Optimization: Already optimal for expected data sizes

Patch Reordering

Time Complexity: O(n log n)

n = number of patches (typically < 20)
Standard sorting algorithm

Space Complexity: O(n)

Optimization: Already optimal
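
A stable sort keyed on the learned position is enough to get "learned order first, unmatched patches at the end". The sketch below assumes patches are identified by plain string IDs; reorder_by_learned_order is a hypothetical name and may differ from how reorder_patches_by_playbook is written.

```rust
use std::collections::HashMap;

/// Sorts `patches` by their index in `learned_order`. Patches the playbook
/// does not mention keep their original relative order and go last.
fn reorder_by_learned_order(patches: &[String], learned_order: &[String]) -> Vec<String> {
    let rank: HashMap<&str, usize> = learned_order
        .iter()
        .enumerate()
        .map(|(i, id)| (id.as_str(), i))
        .collect();
    let mut reordered = patches.to_vec();
    // sort_by_key is stable: all unmatched patches share usize::MAX and so
    // keep their input order.
    reordered.sort_by_key(|id| rank.get(id.as_str()).copied().unwrap_or(usize::MAX));
    reordered
}
```

Building the rank map first keeps each comparison key lookup O(1), so the sort itself dominates at O(n log n).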

Configuration

Thresholds (Hardcoded)

Pattern Detection:

Minimum success rate: 0.8 (80%)
Minimum occurrences: 3 runs

Fuzzy Matching:

Minimum overlap: 0.5 (50%)

Future: These could be made configurable via CLI flags or config file.
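
One possible shape for that change is a small settings struct whose Default reproduces the current hardcoded values. LearningThresholds and its field names are hypothetical; nothing like it exists in the code described here.

```rust
/// Hypothetical container for the three hardcoded thresholds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LearningThresholds {
    pub min_success_rate: f64,
    pub min_occurrences: usize,
    pub min_overlap: f64,
}

impl Default for LearningThresholds {
    /// Mirrors the values documented above: 0.8, 3 runs, 0.5.
    fn default() -> Self {
        Self { min_success_rate: 0.8, min_occurrences: 3, min_overlap: 0.5 }
    }
}

impl LearningThresholds {
    /// Rejects values outside their meaningful ranges (NaN is rejected too).
    pub fn validate(&self) -> Result<(), String> {
        if !(0.0..=1.0).contains(&self.min_success_rate) {
            return Err("min_success_rate must be between 0.0 and 1.0".to_string());
        }
        if self.min_occurrences == 0 {
            return Err("min_occurrences must be at least 1".to_string());
        }
        if !(0.0..=1.0).contains(&self.min_overlap) {
            return Err("min_overlap must be between 0.0 and 1.0".to_string());
        }
        Ok(())
    }
}
```

CLI flags or a config file would then only need to override individual fields of LearningThresholds::default().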

File Paths

const HISTORY_FILE: &str = "docs/fitness/evolution/history.jsonl";
const PLAYBOOKS_DIR: &str = "docs/fitness/playbooks";

Extension Points

Custom Playbook Generators

Current: Automatic generation from history

Extension: Allow custom playbook generators

pub trait PlaybookGenerator {
    fn generate(&self, history: &[EvolutionHistory]) -> Vec<PlaybookCandidate>;
}

// Example: AI-powered generator
struct AIPlaybookGenerator {
    model: String,
}

impl PlaybookGenerator for AIPlaybookGenerator {
    fn generate(&self, history: &[EvolutionHistory]) -> Vec<PlaybookCandidate> {
        // Call AI model to analyze patterns
        // Return AI-suggested playbooks
        todo!("query `self.model` and convert its suggestions into playbooks")
    }
}

Custom Matching Strategies

Current: Exact + fuzzy matching

Extension: Pluggable matchers

pub trait PlaybookMatcher {
    fn find_match<'a>(
        &self,
        playbooks: &'a [PlaybookCandidate],
        gaps: &[HarnessEngineeringGap],
    ) -> Option<&'a PlaybookCandidate>;
}

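// Example: ML-based matcher
use std::path::PathBuf;

struct MLMatcher {
    model_path: PathBuf,
}

impl PlaybookMatcher for MLMatcher {
    fn find_match<'a>(
        &self,
        playbooks: &'a [PlaybookCandidate],
        gaps: &[HarnessEngineeringGap],
    ) -> Option<&'a PlaybookCandidate> {
        // Use ML model to predict best playbook
        todo!("load the model at `self.model_path` and rank `playbooks`")
    }
}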

Playbook Validation

Future: Validate playbooks before use

pub trait PlaybookValidator {
    fn validate(&self, playbook: &PlaybookCandidate) -> Result<(), String>;
}

// Example: Schema validator
struct SchemaValidator;

impl PlaybookValidator for SchemaValidator {
    fn validate(&self, playbook: &PlaybookCandidate) -> Result<(), String> {
        // Range check also rejects NaN, which the two-sided comparison lets through
        if !(0.0..=1.0).contains(&playbook.confidence) {
            return Err("Confidence must be between 0.0 and 1.0".to_string());
        }
        // ... more validation
        Ok(())
    }
}

Testing Strategy

Unit Tests (9 tests)

Pattern Detection:

test_load_evolution_history - Parse JSONL
test_detect_common_patterns - Group & filter
test_generate_playbook_candidates - Build playbooks

Runtime Integration:

test_load_playbooks_for_task - Deserialize JSON
test_find_matching_playbook - Exact match
test_fuzzy_matching_playbook - Partial match
test_no_match_when_overlap_too_low - Threshold

Utilities:

test_save_playbook - Serialize & write
test_reorder_patches_by_playbook - Sorting

Integration Tests

Manual Validation:

Run harness evolve 3 times
Generate playbook with --learn
Verify playbook file exists
Run harness evolve again
Verify playbook loaded and guidance displayed

Property-Based Tests (Future)

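#[quickcheck]
fn prop_fuzzy_match_is_reflexive(playbook: PlaybookCandidate) -> bool {
    // A playbook should always match itself. This assumes gap_patterns has no
    // duplicates (current gaps are deduplicated before matching, patterns are
    // not) and that PlaybookCandidate implements quickcheck::Arbitrary.
    let gaps: Vec<HarnessEngineeringGap> = playbook.strategy.gap_patterns
        .iter()
        .map(|cat| create_gap(cat))  // create_gap: test helper, not shown
        .collect();

    find_matching_playbook(&[playbook.clone()], &gaps).is_some()
}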

Related Code

Core Modules

crates/routa-cli/src/commands/harness/engineering/learning.rs (310 lines)
- Pattern detection
- Playbook generation
- Runtime loading
- Fuzzy matching

crates/routa-cli/src/commands/harness/engineering/mod.rs (+190 lines)
- Integration with harness evolve
- Context capture
- History recording

crates/routa-cli/src/commands/harness/engineering/types.rs (+20 lines)
- EvolutionHistory schema
- EvolutionContext
Tests

crates/routa-cli/src/commands/harness/engineering/tests_learning.rs (300 lines)
- 9 comprehensive tests
- Covers all major code paths
