DSPEX GAP ANALYSIS CORE 01

Documentation for DSPEX_GAP_ANALYSIS_CORE_01 from the ds_ex repository.

DSPEx Gap Analysis: Core Teleprompter Components

Overview

This document analyzes the gaps in DSPEx’s teleprompter/optimizer implementations compared to the original DSPy Python library. The analysis focuses on core algorithmic components that are either missing or incomplete.

🎯 SIMBA Teleprompter: Critical Algorithmic Gaps

Status: 60% Complete - Major Algorithmic Components Missing

Based on analysis of dspy/teleprompt/simba.py, DSPEx’s SIMBA implementation has excellent infrastructure but critical algorithmic gaps.

1. Program Selection Algorithm (CRITICAL BLOCKING ISSUE)

Python DSPy Implementation:

import random

import numpy as np

# Excerpt: program_scores is defined in the enclosing scope and maps a
# program index to the list of scores that program has received so far.
def calc_average_score(prog_idx: int) -> float:
    scores = program_scores.get(prog_idx, [])
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

def softmax_sample(rng_obj: random.Random, program_idxs: list[int], temperature: float) -> int:
    # Unnormalized weights
    scores = [calc_average_score(idx) for idx in program_idxs]  # ✅ REAL SCORES
    exps = [np.exp(s / temperature) for s in scores]
    sum_exps = sum(exps)
    if sum_exps <= 0:
        # Degenerate weights: fall back to a uniform choice
        return rng_obj.choice(program_idxs)

    # Weighted random choice
    probs = [val / sum_exps for val in exps]
    return rng_obj.choices(program_idxs, weights=probs, k=1)[0]

DSPEx Current (BROKEN):

defp softmax_sample(program_indices, _all_programs, temperature) do
  if is_list(program_indices) and length(program_indices) > 0 do
    scores = Enum.map(program_indices, fn _idx -> 0.5 end)  # ❌ FIXED SCORES!
    # This completely breaks the optimization algorithm
  end
end
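
To see why constant scores break selection, the following minimal Python sketch computes the softmax probabilities that `softmax_sample` relies on. The helper name `softmax_probs`, the temperature of 0.2, and the sample score values are illustrative assumptions, not taken from either codebase.

```python
import math

def softmax_probs(scores, temperature):
    """Return softmax probabilities for a list of average scores."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Placeholder scores, as in the current DSPEx code: every program looks identical.
fixed = softmax_probs([0.5, 0.5, 0.5], temperature=0.2)

# Hypothetical real average scores for three programs.
real = softmax_probs([0.2, 0.5, 0.8], temperature=0.2)

print(fixed)  # all three probabilities are equal
print(real)   # probabilities increase with the average score
```

With constant scores every program is equally likely at any temperature, so selection degenerates to uniform random sampling and the optimizer never favors better programs.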

Required Fix:

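defp calculate_average_score(program_scores, prog_idx) do
  scores = Map.get(program_scores, prog_idx, [])

  if Enum.empty?(scores) do
    0.0
  else
    Enum.sum(scores) / length(scores)
  end
end

defp softmax_sample(program_indices, program_scores, temperature) do
  # Use the real average score of each candidate program
  scores =
    Enum.map(program_indices, fn idx ->
      calculate_average_score(program_scores, idx)
    end)

  # Unnormalized softmax weights
  exps = Enum.map(scores, fn score -> :math.exp(score / temperature) end)
  sum_exps = Enum.sum(exps)

  if sum_exps <= 0 do
    # Degenerate weights: fall back to a uniform choice, as the Python version does
    Enum.random(program_indices)
  else
    # Weighted random selection
    probs = Enum.map(exps, fn e -> e / sum_exps end)
    weighted_random_choice(program_indices, probs)
  end
end

# Assumed helper (not part of the DSPy excerpt): picks one item by walking
# the cumulative probabilities until they exceed a uniform random draw.
defp weighted_random_choice(items, probs) do
  target = :rand.uniform()
  cumulative = Enum.scan(probs, &(&1 + &2))

  items
  |> Enum.zip(cumulative)
  # Default to the last item in case rounding leaves the final bound below target
  |> Enum.find_value(List.last(items), fn {item, bound} ->
    if target < bound, do: item
  end)
end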
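
The weighted selection step in `softmax_sample` can be implemented with inverse-CDF sampling over cumulative probabilities. The Python sketch below illustrates that technique as a reference when porting; the function signature, the explicit `rng` argument, and the sample probabilities are assumptions made for the example.

```python
import random
from itertools import accumulate

def weighted_random_choice(items, probs, rng):
    """Pick one item by inverse-CDF sampling; probs need not sum exactly to 1."""
    cumulative = list(accumulate(probs))
    # Scale the draw by the total so small rounding errors in probs do not matter.
    target = rng.random() * cumulative[-1]
    for item, bound in zip(items, cumulative):
        if target < bound:
            return item
    # Only reachable through floating-point rounding at the upper edge.
    return items[-1]

rng = random.Random(0)  # seeded so the run is reproducible
draws = [weighted_random_choice([0, 1, 2], [0.1, 0.2, 0.7], rng) for _ in range(10_000)]
counts = {idx: draws.count(idx) for idx in (0, 1, 2)}
print(counts)
```

Over many draws the counts should roughly follow the 10/20/70 split, which is a quick sanity check that a ported implementation respects the weights.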

2. Program Pool Management (CRITICAL)

Python DSPy Implementation:

def top_k_plus_baseline(k: int) -> list[int]:
    # Sort all programs by descending average score
    scored_programs = sorted(programs, key=lambda p: calc_average_score(p.simba_idx), reverse=True)
    top_k = [p.simba_idx for p in scored_programs[:k]]
    # Ensure baseline=0 is in there:
    if 0 not in top_k and len(top_k) > 0:
        top_k[-1] = 0
    return list(dict.fromkeys(top_k))

DSPEx Status: ❌ MISSING ENTIRELY

Required Implementation:

defp select_top_programs_with_baseline(programs, program_scores, k) do
  # Rank program indices by descending average score.
  # Assumes a program's position in `programs` is its key in `program_scores`.
  top_k_indices =
    programs
    |> Enum.with_index()
    |> Enum.map(fn {_program, idx} ->
      {idx, calculate_average_score(program_scores, idx)}
    end)
    |> Enum.sort_by(fn {_idx, score} -> score end, :desc)
    |> Enum.take(k)
    |> Enum.map(fn {idx, _score} -> idx end)

  # Ensure the baseline (index 0) is included by replacing the last entry if needed
  with_baseline =
    if top_k_indices != [] and 0 not in top_k_indices do
      List.replace_at(top_k_indices, -1, 0)
    else
      top_k_indices
    end

  Enum.uniq(with_baseline)
end
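
As a worked example of the baseline-retention rule, the Python sketch below runs the same logic on a small set of made-up scores. Unlike the DSPy excerpt, this variant takes the score dictionary as an argument instead of reading program objects from an enclosing scope; that signature and the score values are assumptions made to keep the example self-contained.

```python
def calc_average_score(program_scores, prog_idx):
    scores = program_scores.get(prog_idx, [])
    return sum(scores) / len(scores) if scores else 0.0

def top_k_plus_baseline(program_scores, k):
    """Return the k best program indices, forcing baseline index 0 into the result."""
    ranked = sorted(
        program_scores,
        key=lambda idx: calc_average_score(program_scores, idx),
        reverse=True,
    )
    top_k = ranked[:k]
    # If the baseline did not make the cut, it replaces the weakest selected program.
    if 0 not in top_k and top_k:
        top_k[-1] = 0
    return list(dict.fromkeys(top_k))

# Hypothetical scores: averages are 0.3, 0.8, 0.6 and 0.1 for programs 0..3.
program_scores = {0: [0.2, 0.4], 1: [0.9, 0.7], 2: [0.5, 0.7], 3: [0.1]}

print(top_k_plus_baseline(program_scores, 2))  # [1, 0]
print(top_k_plus_baseline(program_scores, 3))  # [1, 2, 0]
```

With `k = 2` the baseline displaces program 2 even though program 2 scores higher, which keeps the unmodified program available as a starting point in every round.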

3. Main Optimization Loop Logic (INCOMPLETE)

Python DSPy Key Steps:

1. Mini-batch Selection: ✅ DSPEx Complete
2. Model Preparation: ✅ DSPEx Complete
3. Program Selection: ❌ DSPEx Broken (fixed scores)
4. Trajectory Sampling: ⚠️ DSPEx Over-complex but functional
5. Bucket Analysis: ✅ DSPEx Complete and Superior
6. Strategy Application: ⚠️ DSPEx Partial (only AppendDemo)
7. Candidate Evaluation: ✅ DSPEx Complete
8. Program Registration: ❌ DSPEx Missing logic
9. Winning Program Selection: ⚠️ DSPEx Simplified

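Missing Integration: The DSPEx loop exists, but its steps do not feed into one another. Program selection runs on placeholder scores, and newly evaluated candidates are never registered in the program pool, so later iterations cannot select on real performance.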
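
One way to read the integration gap: each evaluated candidate needs a fresh index and a stored score list, so that the next iteration's selection step sees real averages. The Python sketch below illustrates that idea only; the `ProgramPool` class, its method names, and the sample scores are hypothetical and are not DSPy or DSPEx APIs.

```python
class ProgramPool:
    """Hypothetical pool that ties program registration to score-based selection."""

    def __init__(self, baseline, baseline_scores):
        # The baseline program always lives at index 0.
        self.programs = {0: baseline}
        self.scores = {0: list(baseline_scores)}
        self.next_idx = 1

    def register(self, program, score_list):
        """Store a new candidate with its evaluation scores and return its index."""
        idx = self.next_idx
        self.next_idx += 1
        self.programs[idx] = program
        self.scores[idx] = list(score_list)
        return idx

    def average(self, idx):
        scores = self.scores.get(idx, [])
        return sum(scores) / len(scores) if scores else 0.0

    def best(self):
        """Return the index of the program with the highest average score."""
        return max(self.programs, key=self.average)

pool = ProgramPool("baseline", [0.4, 0.5])
candidate_idx = pool.register("candidate-with-demo", [0.7, 0.9])
print(candidate_idx, pool.best())
```

Keeping scores in the same structure that selection reads from is what lets real averages replace the fixed `0.5` placeholder.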

4. Strategy System Gaps

Python DSPy Strategies:

if self.max_demos > 0:
    self.strategies = [append_a_demo(demo_input_field_maxlen), append_a_rule]
else:
    self.strategies = [append_a_rule]

# Random strategy selection
strategy = rng.choice(self.strategies)

DSPEx Status:

✅ AppendDemo strategy complete with Poisson sampling
❌ AppendRule strategy missing entirely
❌ Strategy selection logic incomplete

Required AppendRule Implementation:

defmodule DSPEx.Teleprompter.SIMBA.Strategy.AppendRule do
  @behaviour DSPEx.Teleprompter.SIMBA.Strategy
  
  @impl true
  def apply(bucket, program, _config, _context) do
    # Extract high-performing trajectories
    high_performance_trajectories = bucket.trajectories
    |> Enum.filter(&(&1.score > bucket.avg_score))
    |> Enum.take(3)
    
    if Enum.empty?(high_performance_trajectories) do
      {:ok, program}
    else
      # Generate rule from patterns in high-performing trajectories
      rule = generate_rule_from_trajectories(high_performance_trajectories)
      enhanced_program = add_rule_to_program(program, rule)
      {:ok, enhanced_program}
    end
  end
  
  defp generate_rule_from_trajectories(_trajectories) do
    # Placeholder: analyze common patterns in the successful trajectories.
    # A real implementation would use an LLM to generate rules from them.
    "Based on successful examples, ensure you follow this pattern: ..."
  end

  defp add_rule_to_program(program, _rule) do
    # Placeholder: add the generated rule to the program's instructions.
    # The implementation depends on the program structure, so for now the
    # program is returned unchanged.
    program
  end
end
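
The `add_rule_to_program` step above is left as a stub. As a rough illustration of what "adding a rule to the program's instructions" could look like, the Python sketch below appends a rule to an instruction string without mutating the original. The `Predictor` dataclass and the `add_rule` function are hypothetical stand-ins, not DSPy or DSPEx types.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Predictor:
    """Hypothetical stand-in for a program step that carries instructions."""
    instructions: str

def add_rule(predictor, rule):
    """Return a copy of the predictor with the rule appended to its instructions."""
    rule = rule.strip()
    if not rule:
        # Nothing to add: hand back the original predictor unchanged.
        return predictor
    return replace(predictor, instructions=f"{predictor.instructions}\n\n{rule}")

base = Predictor("Answer the question concisely.")
updated = add_rule(base, "Cite the passage you relied on.")
print(updated.instructions)
```

Returning a new value rather than mutating in place matches how the Elixir strategy returns `{:ok, enhanced_program}` and leaves the source program available to other candidates.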