DSPEx Master Development Plan of Plans
Strategic 50,000-foot Roadmap for DSPEx Integration & Variable System Implementation
Executive Summary
DSPEx is positioned to become the most advanced prompt engineering framework by combining:
- 11 Comprehensive Integration Documents (65% complete foundation)
- Revolutionary Variable System (adaptive optimization across adapters/modules/parameters)
- Elixir-Native Excellence (BEAM VM strengths: concurrency, fault tolerance, hot reloading)
This plan orchestrates the careful, layered development of these interdependent systems with strategic re-visits as we “can’t know till we get there.”
Current State Assessment
โ Solid Foundation (Completed)
- Core SIMBA Algorithm: Functional teleprompter optimization
- Basic Program Behavior: Working forward/3 callback pattern
- Signature System: Functional with enhanced parsing
- 11 Integration Documents: Comprehensive blueprints (1,400+ pages)
- Testing Infrastructure: Mox-based with MockHelpers
๐ Variable System Status (Preliminary/Iterative)
- Two Competing Approaches: Cursor (practical) vs Claude Code (architectural)
- Core Innovation: Universal variable abstraction for any optimizer
- Target Problem: Auto-discovery of optimal adapter/module/parameter combinations
- Implementation Status: Design-complete, implementation-pending
๐ Gap Analysis
- Missing Critical Integrations: 7 documents (RETRIEVE, TELEPROMPT, STREAMING, etc.)
- Behavior Definitions: Missing Client, Adapter, Evaluator behaviors
- Variable Integration Points: Need careful threading through existing systems
- Testing Coverage: Need comprehensive validation across all systems
Strategic Architecture Overview
DSPEx Master Architecture (Post-Integration)
โโโ ๐ฏ Foundation Layer
โ โโโ Core Primitives (โ
Enhanced Multi-Language Execution)
โ โโโ Behaviors (๐ Missing Client/Adapter/Evaluator behaviors)
โ โโโ Program Interface (โ
Working, needs streaming/concurrency)
โโโ ๐ Adapter & Client Layer
โ โโโ Adapters (โ
Pattern Matching + Protocols)
โ โโโ Clients (โ
ExLLM Migration Strategy)
โ โโโ Variable Integration (๐ง Automatic adapter selection)
โโโ ๐ฎ Prediction & Processing Layer
โ โโโ Predict (โ
Streaming + Concurrency design)
โ โโโ Streaming (โ Missing GenStage integration)
โ โโโ Variable Integration (๐ง Module selection automation)
โโโ ๐ Evaluation & Optimization Layer
โ โโโ Evaluation (โ
Nx-Powered Analytics)
โ โโโ Teleprompters (๐ SIMBA working, others missing)
โ โโโ Variable System (๐ง Universal optimization framework)
โ โโโ Adaptive Selection (๐ง Multi-objective optimization)
โโโ ๐ Data & Retrieval Layer
โ โโโ Retrieval (โ Missing 20+ backend integration)
โ โโโ Datasets (โ Missing integration)
โ โโโ Utils (โ Missing cross-cutting utilities)
โโโ ๐งช Advanced & Experimental Layer
โโโ Experimental (โ Missing research features)
โโโ Propose (โ Missing proposal generation)
โโโ Variable Extensions (๐ง Future optimization strategies)
Master Development Strategy
๐ฏ Phase 1: Foundation Consolidation (4-6 weeks)
“Stabilize the base before building the tower”
1.1 Critical Behavior Definitions (Week 1)
Priority: ๐ด Blocking - Required for all subsequent work
Scope: Define missing behaviors that enable contract-first development
DSPEx.Client
behavior (CRITICAL - blocks client layer testing)DSPEx.Adapter
behavior (CRITICAL - blocks adapter selection)DSPEx.Evaluator
behavior (HIGH - blocks evaluation system)
Why Now: Without these behaviors, we cannot mock or test the variable system integration points.
Deliverables:
# New behavior definitions
lib/dspex/client.ex # @callback request/2, request/3
lib/dspex/adapter.ex # @callback format_messages/2, parse_response/2
lib/dspex/evaluate.ex # @callback run/4, run_local/4, run_distributed/4
Variable System Impact: Enables mocking of adapter selection and evaluation during variable optimization.
1.2 Critical Missing Integrations (Weeks 2-4)
Priority: ๐ด Tier 1 - Foundation dependencies
RETRIEVE_INTEGRATION.md (Week 2)
- Scope: 20+ retrieval backends, vector databases, embedding systems
- Variable Integration: Retrieval method selection as discrete variable
- Dependencies: ExLLM client integration, Nx embedding operations
- Impact: Enables RAG systems, knowledge base integration
TELEPROMPT_INTEGRATION.md (Week 3)
- Scope: 10+ optimization algorithms beyond SIMBA
- Variable Integration: Optimization strategy selection as discrete variable
- Dependencies: Existing SIMBA foundation, evaluation system
- Impact: Completes teleprompter ecosystem for variable system
STREAMING_INTEGRATION.md (Week 4)
- Scope: GenStage integration, real-time processing, reactive streams
- Variable Integration: Streaming strategy selection, backpressure configuration
- Dependencies: Enhanced Program behavior, GenStage/Flow
- Impact: Enables real-time variable optimization feedback
1.3 Foundation Testing & Validation (Weeks 5-6)
Scope: Comprehensive testing of integrated foundation
- Integration tests across all behaviors
- Performance benchmarking vs. DSPy
- Contract validation with Mox
- Foundation stress testing
๐ฏ Phase 2: Variable System Integration (6-8 weeks)
“Carefully threading the variable system through existing architecture”
2.1 Variable System Architecture Decision (Week 7)
Critical Decision Point: Choose between Cursor vs Claude Code approaches
Evaluation Criteria:
- Implementation Speed: Cursor approach more direct
- Long-term Extensibility: Claude Code approach more modular
- Integration Complexity: How well each threads through existing systems
- Testing Requirements: Which approach enables better testing
Recommended Hybrid Strategy:
# Combine strengths of both approaches
โโโ Variable Abstraction (Cursor: Simple, practical)
โโโ Multi-Objective Evaluation (Claude Code: Sophisticated)
โโโ Universal Optimizer (Cursor: Direct implementation)
โโโ Continuous Learning (Claude Code: Advanced architecture)
2.2 Core Variable System Implementation (Weeks 8-10)
Scope: Implement chosen variable system architecture
Week 8: Variable abstraction and basic optimization
# Core variable system
lib/dspex/variables/variable.ex
lib/dspex/variables/variable_space.ex
lib/dspex/variables/configuration_manager.ex
Week 9: Multi-objective evaluation integration
# Evaluation integration
lib/dspex/variables/evaluators/
โโโ accuracy_evaluator.ex
โโโ cost_evaluator.ex
โโโ latency_evaluator.ex
โโโ reliability_evaluator.ex
Week 10: Universal optimizer implementation
# Optimization strategies
lib/dspex/variables/optimizers/
โโโ universal_optimizer.ex
โโโ bayesian_optimizer.ex
โโโ genetic_optimizer.ex
โโโ grid_search_optimizer.ex
2.3 Integration Point Threading (Weeks 11-12)
Scope: Carefully integrate variable system with existing components
Adapter Integration:
- Thread adapter selection through variable system
- Enable automatic JSON vs Markdown vs Chat selection
- Maintain backward compatibility
Module Integration:
- Enable Predict vs CoT vs PoT automatic selection
- Thread through program behavior system
- Preserve existing module interfaces
Evaluation Integration:
- Connect variable evaluation to existing Nx-powered metrics
- Enable multi-objective optimization
- Maintain evaluation system performance
2.4 Variable System Testing & Validation (Weeks 13-14)
Scope: Comprehensive testing of variable integration
- Unit tests for all variable components
- Integration tests with existing systems
- Property-based testing with StreamData
- Performance benchmarking of optimization
๐ฏ Phase 3: Advanced Integration Completion (4-6 weeks)
“Complete the ecosystem with advanced features”
3.1 Essential Infrastructure (Weeks 15-16)
DATASETS_INTEGRATION.md
- Built-in datasets, preprocessing pipelines
- Variable integration: Dataset selection and preprocessing strategies
UTILS_INTEGRATION.md
- Cross-cutting utilities, telemetry integration
- Variable integration: Utility configuration optimization
3.2 Advanced Features (Weeks 17-18)
EXPERIMENTAL_INTEGRATION.md
- Cutting-edge research features, experimental algorithms
- Variable integration: Experimental strategy selection
PROPOSE_INTEGRATION.md
- Proposal generation systems, advanced reasoning
- Variable integration: Proposal strategy optimization
3.3 System Optimization & Polish (Weeks 19-20)
Scope: Performance optimization and production readiness
- Performance tuning across all integrated systems
- Memory optimization for variable space exploration
- Production deployment strategies
- Documentation completion
Critical Interdependencies & Re-visit Strategy
๐ “Can’t Know Till We Get There” Points
1. Variable System Architecture Impact
Unknown: How variable abstraction affects existing program performance Re-visit Strategy: Implement basic version first, then optimize based on real performance data Triggers: Performance degradation > 10%, memory usage > 2x baseline
2. Multi-Objective Optimization Complexity
Unknown: Real-world performance of Pareto analysis with 4+ objectives Re-visit Strategy: Start with 2 objectives (accuracy, cost), add more based on results Triggers: Optimization time > 5 minutes, convergence issues
3. Elixir Concurrency vs. Python DSPy Performance
Unknown: How concurrent variable optimization compares to Python sequential Re-visit Strategy: Benchmark early and often, adjust concurrency strategies Triggers: Performance not meeting 2x Python DSPy baseline
4. Integration Point Friction
Unknown: How much existing code needs modification for variable integration Re-visit Strategy: Implement adapters/wrappers first, refactor core later if needed Triggers: Breaking changes > 20% of existing API surface
๐ Critical Dependency Chains
Chain 1: Behavior โ Variable Testing
Behavior Definitions โ Mock Implementation โ Variable System Testing โ Integration Validation
Risk: Cannot test variable system without proper behavior mocking Mitigation: Prioritize behavior definitions in Phase 1
Chain 2: Evaluation โ Variable Optimization
Enhanced Evaluation โ Multi-Objective Metrics โ Variable Optimization โ Automatic Selection
Risk: Variable optimization quality depends on evaluation sophistication Mitigation: Ensure evaluation system is robust before variable integration
Chain 3: Streaming โ Real-time Variable Optimization
GenStage Integration โ Streaming Evaluation โ Real-time Variable Adjustment โ Adaptive Systems
Risk: Real-time optimization requires streaming infrastructure Mitigation: Implement streaming early in Phase 1
Quality Assurance Strategy
๐งช Multi-Modal Testing Approach
Testing Per Phase
Phase 1: Foundation testing with comprehensive behavior validation Phase 2: Variable system testing with property-based validation Phase 3: End-to-end integration testing with performance validation
Continuous Validation
# Required checks for each phase
mix test --include group_1 --include group_2 # Comprehensive test suite
mix dialyzer # Zero warnings required
mix credo --strict # Code quality validation
mix format --check-formatted # Code formatting
Performance Benchmarking
- Baseline: Current DSPEx performance metrics
- Target: 2x Python DSPy performance for equivalent operations
- Variable System: Optimization time < 5 minutes for standard problems
๐ Progress Tracking & Success Metrics
Phase 1 Success Criteria
- All critical behaviors defined and implemented
- 3 critical integration documents completed (RETRIEVE, TELEPROMPT, STREAMING)
- Foundation performance maintains baseline
- Comprehensive test coverage > 90%
Phase 2 Success Criteria
- Variable system successfully integrated with existing architecture
- Automatic adapter/module selection working
- Multi-objective optimization functional
- No breaking changes to existing API > 20%
Phase 3 Success Criteria
- All 18 integration components completed
- Variable system optimization performance meets targets
- Production deployment ready
- Documentation complete
Risk Management & Mitigation
๐จ High-Risk Areas
1. Variable System Complexity
Risk: Variable abstraction too complex, impacts performance Mitigation: Start simple, iterate based on real usage Fallback: Implement simpler grid-search-only version first
2. Integration Point Conflicts
Risk: Variable system conflicts with existing architecture Mitigation: Use adapter pattern, maintain backward compatibility Fallback: Implement variable system as optional add-on
3. Performance Degradation
Risk: Comprehensive integration slows down basic operations Mitigation: Profile continuously, optimize hot paths Fallback: Make advanced features opt-in
4. Scope Creep
Risk: Variable system requirements grow beyond original scope Mitigation: Strict scope control, defer advanced features to Phase 4 Fallback: Ship basic variable system, iterate in subsequent releases
Resource Allocation & Timeline
๐ Realistic Timeline (16-20 weeks total)
Phase 1: 4-6 weeks (Foundation)
Phase 2: 6-8 weeks (Variable System)
Phase 3: 4-6 weeks (Advanced Features)
Buffer: 2-4 weeks (Unknown unknowns)
๐ฅ Skill Requirements
Elixir Expertise: Advanced (GenServer, GenStage, OTP patterns) Machine Learning: Intermediate (Optimization algorithms, evaluation metrics) System Architecture: Advanced (Integration patterns, performance optimization) Testing: Advanced (Property-based testing, performance testing)
Success Vision
Upon completion, DSPEx will be:
- Most Advanced Optimization Framework: Automatic discovery of optimal configurations
- Elixir-Native Excellence: Leveraging BEAM VM strengths for superior performance
- Production Ready: Comprehensive testing, documentation, deployment strategies
- Community Leading: Setting new standards for prompt engineering frameworks
The variable system will solve the exact problem identified by the DSPy community while positioning DSPEx as the cutting-edge framework for automated prompt engineering optimization.
This plan of plans serves as our strategic roadmap, designed for careful iteration and re-visiting as we discover the realities of implementation. The layered approach ensures we build on solid foundations while remaining flexible enough to adapt as we learn.