โ† Back to Docs integration

PLAN OF PLANS

Documentation for PLAN_OF_PLANS from the Ds ex repository.

DSPEx Master Development Plan of Plans

Strategic 50,000-foot Roadmap for DSPEx Integration & Variable System Implementation

Executive Summary

DSPEx is positioned to become the most advanced prompt engineering framework by combining:

  1. 11 Comprehensive Integration Documents (65% complete foundation)
  2. Revolutionary Variable System (adaptive optimization across adapters/modules/parameters)
  3. Elixir-Native Excellence (BEAM VM strengths: concurrency, fault tolerance, hot reloading)

This plan orchestrates the careful, layered development of these interdependent systems with strategic re-visits as we “can’t know till we get there.”

Current State Assessment

โœ… Solid Foundation (Completed)

  • Core SIMBA Algorithm: Functional teleprompter optimization
  • Basic Program Behavior: Working forward/3 callback pattern
  • Signature System: Functional with enhanced parsing
  • 11 Integration Documents: Comprehensive blueprints (1,400+ pages)
  • Testing Infrastructure: Mox-based with MockHelpers

๐Ÿ”„ Variable System Status (Preliminary/Iterative)

  • Two Competing Approaches: Cursor (practical) vs Claude Code (architectural)
  • Core Innovation: Universal variable abstraction for any optimizer
  • Target Problem: Auto-discovery of optimal adapter/module/parameter combinations
  • Implementation Status: Design-complete, implementation-pending

๐Ÿ“Š Gap Analysis

  • Missing Critical Integrations: 7 documents (RETRIEVE, TELEPROMPT, STREAMING, etc.)
  • Behavior Definitions: Missing Client, Adapter, Evaluator behaviors
  • Variable Integration Points: Need careful threading through existing systems
  • Testing Coverage: Need comprehensive validation across all systems

Strategic Architecture Overview

DSPEx Master Architecture (Post-Integration)
โ”œโ”€โ”€ ๐ŸŽฏ Foundation Layer
โ”‚   โ”œโ”€โ”€ Core Primitives (โœ… Enhanced Multi-Language Execution)
โ”‚   โ”œโ”€โ”€ Behaviors (๐Ÿ”„ Missing Client/Adapter/Evaluator behaviors)
โ”‚   โ””โ”€โ”€ Program Interface (โœ… Working, needs streaming/concurrency)
โ”œโ”€โ”€ ๐Ÿ”Œ Adapter & Client Layer  
โ”‚   โ”œโ”€โ”€ Adapters (โœ… Pattern Matching + Protocols)
โ”‚   โ”œโ”€โ”€ Clients (โœ… ExLLM Migration Strategy)
โ”‚   โ””โ”€โ”€ Variable Integration (๐Ÿšง Automatic adapter selection)
โ”œโ”€โ”€ ๐Ÿ”ฎ Prediction & Processing Layer
โ”‚   โ”œโ”€โ”€ Predict (โœ… Streaming + Concurrency design)
โ”‚   โ”œโ”€โ”€ Streaming (โŒ Missing GenStage integration)
โ”‚   โ””โ”€โ”€ Variable Integration (๐Ÿšง Module selection automation)
โ”œโ”€โ”€ ๐Ÿ“Š Evaluation & Optimization Layer
โ”‚   โ”œโ”€โ”€ Evaluation (โœ… Nx-Powered Analytics)
โ”‚   โ”œโ”€โ”€ Teleprompters (๐Ÿ”„ SIMBA working, others missing)
โ”‚   โ”œโ”€โ”€ Variable System (๐Ÿšง Universal optimization framework)
โ”‚   โ””โ”€โ”€ Adaptive Selection (๐Ÿšง Multi-objective optimization)
โ”œโ”€โ”€ ๐Ÿ” Data & Retrieval Layer
โ”‚   โ”œโ”€โ”€ Retrieval (โŒ Missing 20+ backend integration)
โ”‚   โ”œโ”€โ”€ Datasets (โŒ Missing integration)
โ”‚   โ””โ”€โ”€ Utils (โŒ Missing cross-cutting utilities)
โ””โ”€โ”€ ๐Ÿงช Advanced & Experimental Layer
    โ”œโ”€โ”€ Experimental (โŒ Missing research features)
    โ”œโ”€โ”€ Propose (โŒ Missing proposal generation)
    โ””โ”€โ”€ Variable Extensions (๐Ÿšง Future optimization strategies)

Master Development Strategy

๐ŸŽฏ Phase 1: Foundation Consolidation (4-6 weeks)

“Stabilize the base before building the tower”

1.1 Critical Behavior Definitions (Week 1)

Priority: ๐Ÿ”ด Blocking - Required for all subsequent work

Scope: Define missing behaviors that enable contract-first development

  • DSPEx.Client behavior (CRITICAL - blocks client layer testing)
  • DSPEx.Adapter behavior (CRITICAL - blocks adapter selection)
  • DSPEx.Evaluator behavior (HIGH - blocks evaluation system)

Why Now: Without these behaviors, we cannot mock or test the variable system integration points.

Deliverables:

# New behavior definitions
lib/dspex/client.ex     # @callback request/2, request/3
lib/dspex/adapter.ex    # @callback format_messages/2, parse_response/2  
lib/dspex/evaluate.ex   # @callback run/4, run_local/4, run_distributed/4

Variable System Impact: Enables mocking of adapter selection and evaluation during variable optimization.

1.2 Critical Missing Integrations (Weeks 2-4)

Priority: ๐Ÿ”ด Tier 1 - Foundation dependencies

RETRIEVE_INTEGRATION.md (Week 2)

  • Scope: 20+ retrieval backends, vector databases, embedding systems
  • Variable Integration: Retrieval method selection as discrete variable
  • Dependencies: ExLLM client integration, Nx embedding operations
  • Impact: Enables RAG systems, knowledge base integration

TELEPROMPT_INTEGRATION.md (Week 3)

  • Scope: 10+ optimization algorithms beyond SIMBA
  • Variable Integration: Optimization strategy selection as discrete variable
  • Dependencies: Existing SIMBA foundation, evaluation system
  • Impact: Completes teleprompter ecosystem for variable system

STREAMING_INTEGRATION.md (Week 4)

  • Scope: GenStage integration, real-time processing, reactive streams
  • Variable Integration: Streaming strategy selection, backpressure configuration
  • Dependencies: Enhanced Program behavior, GenStage/Flow
  • Impact: Enables real-time variable optimization feedback

1.3 Foundation Testing & Validation (Weeks 5-6)

Scope: Comprehensive testing of integrated foundation

  • Integration tests across all behaviors
  • Performance benchmarking vs. DSPy
  • Contract validation with Mox
  • Foundation stress testing

๐ŸŽฏ Phase 2: Variable System Integration (6-8 weeks)

“Carefully threading the variable system through existing architecture”

2.1 Variable System Architecture Decision (Week 7)

Critical Decision Point: Choose between Cursor vs Claude Code approaches

Evaluation Criteria:

  • Implementation Speed: Cursor approach more direct
  • Long-term Extensibility: Claude Code approach more modular
  • Integration Complexity: How well each threads through existing systems
  • Testing Requirements: Which approach enables better testing

Recommended Hybrid Strategy:

# Combine strengths of both approaches
โ”œโ”€โ”€ Variable Abstraction (Cursor: Simple, practical)
โ”œโ”€โ”€ Multi-Objective Evaluation (Claude Code: Sophisticated)
โ”œโ”€โ”€ Universal Optimizer (Cursor: Direct implementation)
โ””โ”€โ”€ Continuous Learning (Claude Code: Advanced architecture)

2.2 Core Variable System Implementation (Weeks 8-10)

Scope: Implement chosen variable system architecture

Week 8: Variable abstraction and basic optimization

# Core variable system
lib/dspex/variables/variable.ex
lib/dspex/variables/variable_space.ex
lib/dspex/variables/configuration_manager.ex

Week 9: Multi-objective evaluation integration

# Evaluation integration
lib/dspex/variables/evaluators/
โ”œโ”€โ”€ accuracy_evaluator.ex
โ”œโ”€โ”€ cost_evaluator.ex
โ”œโ”€โ”€ latency_evaluator.ex
โ””โ”€โ”€ reliability_evaluator.ex

Week 10: Universal optimizer implementation

# Optimization strategies
lib/dspex/variables/optimizers/
โ”œโ”€โ”€ universal_optimizer.ex
โ”œโ”€โ”€ bayesian_optimizer.ex
โ”œโ”€โ”€ genetic_optimizer.ex
โ””โ”€โ”€ grid_search_optimizer.ex

2.3 Integration Point Threading (Weeks 11-12)

Scope: Carefully integrate variable system with existing components

Adapter Integration:

  • Thread adapter selection through variable system
  • Enable automatic JSON vs Markdown vs Chat selection
  • Maintain backward compatibility

Module Integration:

  • Enable Predict vs CoT vs PoT automatic selection
  • Thread through program behavior system
  • Preserve existing module interfaces

Evaluation Integration:

  • Connect variable evaluation to existing Nx-powered metrics
  • Enable multi-objective optimization
  • Maintain evaluation system performance

2.4 Variable System Testing & Validation (Weeks 13-14)

Scope: Comprehensive testing of variable integration

  • Unit tests for all variable components
  • Integration tests with existing systems
  • Property-based testing with StreamData
  • Performance benchmarking of optimization

๐ŸŽฏ Phase 3: Advanced Integration Completion (4-6 weeks)

“Complete the ecosystem with advanced features”

3.1 Essential Infrastructure (Weeks 15-16)

DATASETS_INTEGRATION.md

  • Built-in datasets, preprocessing pipelines
  • Variable integration: Dataset selection and preprocessing strategies

UTILS_INTEGRATION.md

  • Cross-cutting utilities, telemetry integration
  • Variable integration: Utility configuration optimization

3.2 Advanced Features (Weeks 17-18)

EXPERIMENTAL_INTEGRATION.md

  • Cutting-edge research features, experimental algorithms
  • Variable integration: Experimental strategy selection

PROPOSE_INTEGRATION.md

  • Proposal generation systems, advanced reasoning
  • Variable integration: Proposal strategy optimization

3.3 System Optimization & Polish (Weeks 19-20)

Scope: Performance optimization and production readiness

  • Performance tuning across all integrated systems
  • Memory optimization for variable space exploration
  • Production deployment strategies
  • Documentation completion

Critical Interdependencies & Re-visit Strategy

๐Ÿ”„ “Can’t Know Till We Get There” Points

1. Variable System Architecture Impact

Unknown: How variable abstraction affects existing program performance Re-visit Strategy: Implement basic version first, then optimize based on real performance data Triggers: Performance degradation > 10%, memory usage > 2x baseline

2. Multi-Objective Optimization Complexity

Unknown: Real-world performance of Pareto analysis with 4+ objectives Re-visit Strategy: Start with 2 objectives (accuracy, cost), add more based on results Triggers: Optimization time > 5 minutes, convergence issues

3. Elixir Concurrency vs. Python DSPy Performance

Unknown: How concurrent variable optimization compares to Python sequential Re-visit Strategy: Benchmark early and often, adjust concurrency strategies Triggers: Performance not meeting 2x Python DSPy baseline

4. Integration Point Friction

Unknown: How much existing code needs modification for variable integration Re-visit Strategy: Implement adapters/wrappers first, refactor core later if needed Triggers: Breaking changes > 20% of existing API surface

๐Ÿ”— Critical Dependency Chains

Chain 1: Behavior โ†’ Variable Testing

Behavior Definitions โ†’ Mock Implementation โ†’ Variable System Testing โ†’ Integration Validation

Risk: Cannot test variable system without proper behavior mocking Mitigation: Prioritize behavior definitions in Phase 1

Chain 2: Evaluation โ†’ Variable Optimization

Enhanced Evaluation โ†’ Multi-Objective Metrics โ†’ Variable Optimization โ†’ Automatic Selection

Risk: Variable optimization quality depends on evaluation sophistication Mitigation: Ensure evaluation system is robust before variable integration

Chain 3: Streaming โ†’ Real-time Variable Optimization

GenStage Integration โ†’ Streaming Evaluation โ†’ Real-time Variable Adjustment โ†’ Adaptive Systems

Risk: Real-time optimization requires streaming infrastructure Mitigation: Implement streaming early in Phase 1

Quality Assurance Strategy

๐Ÿงช Multi-Modal Testing Approach

Testing Per Phase

Phase 1: Foundation testing with comprehensive behavior validation Phase 2: Variable system testing with property-based validation Phase 3: End-to-end integration testing with performance validation

Continuous Validation

# Required checks for each phase
mix test --include group_1 --include group_2  # Comprehensive test suite
mix dialyzer                                  # Zero warnings required
mix credo --strict                           # Code quality validation
mix format --check-formatted                 # Code formatting

Performance Benchmarking

  • Baseline: Current DSPEx performance metrics
  • Target: 2x Python DSPy performance for equivalent operations
  • Variable System: Optimization time < 5 minutes for standard problems

๐Ÿ“Š Progress Tracking & Success Metrics

Phase 1 Success Criteria

  • All critical behaviors defined and implemented
  • 3 critical integration documents completed (RETRIEVE, TELEPROMPT, STREAMING)
  • Foundation performance maintains baseline
  • Comprehensive test coverage > 90%

Phase 2 Success Criteria

  • Variable system successfully integrated with existing architecture
  • Automatic adapter/module selection working
  • Multi-objective optimization functional
  • No breaking changes to existing API > 20%

Phase 3 Success Criteria

  • All 18 integration components completed
  • Variable system optimization performance meets targets
  • Production deployment ready
  • Documentation complete

Risk Management & Mitigation

๐Ÿšจ High-Risk Areas

1. Variable System Complexity

Risk: Variable abstraction too complex, impacts performance Mitigation: Start simple, iterate based on real usage Fallback: Implement simpler grid-search-only version first

2. Integration Point Conflicts

Risk: Variable system conflicts with existing architecture Mitigation: Use adapter pattern, maintain backward compatibility Fallback: Implement variable system as optional add-on

3. Performance Degradation

Risk: Comprehensive integration slows down basic operations Mitigation: Profile continuously, optimize hot paths Fallback: Make advanced features opt-in

4. Scope Creep

Risk: Variable system requirements grow beyond original scope Mitigation: Strict scope control, defer advanced features to Phase 4 Fallback: Ship basic variable system, iterate in subsequent releases

Resource Allocation & Timeline

๐Ÿ“… Realistic Timeline (16-20 weeks total)

Phase 1: 4-6 weeks (Foundation) Phase 2: 6-8 weeks (Variable System)
Phase 3: 4-6 weeks (Advanced Features) Buffer: 2-4 weeks (Unknown unknowns)

๐Ÿ‘ฅ Skill Requirements

Elixir Expertise: Advanced (GenServer, GenStage, OTP patterns) Machine Learning: Intermediate (Optimization algorithms, evaluation metrics) System Architecture: Advanced (Integration patterns, performance optimization) Testing: Advanced (Property-based testing, performance testing)

Success Vision

Upon completion, DSPEx will be:

  1. Most Advanced Optimization Framework: Automatic discovery of optimal configurations
  2. Elixir-Native Excellence: Leveraging BEAM VM strengths for superior performance
  3. Production Ready: Comprehensive testing, documentation, deployment strategies
  4. Community Leading: Setting new standards for prompt engineering frameworks

The variable system will solve the exact problem identified by the DSPy community while positioning DSPEx as the cutting-edge framework for automated prompt engineering optimization.


This plan of plans serves as our strategic roadmap, designed for careful iteration and re-visiting as we discover the realities of implementation. The layered approach ensures we build on solid foundations while remaining flexible enough to adapt as we learn.