← Back to Gap analysis

DOCUMENTATION MASTER DEVELOPMENT ROADMAP

Documentation for DOCUMENTATION_MASTER_DEVELOPMENT_ROADMAP from the Ds ex repository.

DSPEx Master Development Roadmap

Executive Summary

This document provides a comprehensive roadmap for DSPEx development based on analysis of 38+ documentation files. The roadmap prioritizes critical fixes, establishes clear development phases, and provides a structured approach to achieving DSPy feature parity while leveraging Elixir’s strengths.

Document Classification & Table of Contents

🚨 Critical Implementation Documents (Primary References)

DocumentPurposePriorityStatus
DSPEX_MISSING_COMPONENTS_MASTER_LIST.mdComplete gap analysis vs DSPyCRITICAL✅ Current
SIMBA_STATUS.mdCurrent SIMBA implementation statusCRITICAL✅ Current
ELIXACT_TODO.mdElixact enhancement requirementsHIGH✅ Current
DSPEX_GAP_ANALYSIS_CORE_*.mdDetailed component analysisHIGH✅ Current
153_copilot_adjusted.mdComprehensive integration planHIGH✅ Current

📊 Analysis & Research Documents (Reference Materials)

DocumentPurposeRelevanceUse Case
145_PYDANTIC_USE_IN_DSPY.mdPydantic usage patterns in DSPyHIGHElixact integration
210_PYDANTIC_INTEG.mdComplete Pydantic integration analysisHIGHElixact requirements
200_dspy_dependency_review.mdDSPy dependency analysisMEDIUMEcosystem planning
DSPEX_LIBS.mdElixir ecosystem analysisMEDIUMLibrary selection

🔧 Technical Implementation Guides

DocumentPurposeStatusApplication
154_testing_integration_validation.mdTesting strategy✅ ReadyImplementation phases
155_implementation_roadmap.mdDetailed implementation guide✅ ReadyAll phases
212_ELIXACT_typeadapter_etc.mdElixact enhancement specs⚠️ Needs updatePhase 2

📋 Foundation & Infrastructure Documents

DocumentPurposeCurrent RelevanceDecision
101_gemini_plan.mdFoundation integration planLOWDeprioritized
111_structured_vs_simba.mdType system analysisMEDIUMReference only
112_foundation_tests.mdFoundation test analysisLOWNot current priority

📚 Historical & Context Documents (Archive)

DocumentStatusReason
100_grok_plan.md - 152_copilot.mdARCHIVEDSuperseded by newer analysis
120_pydantic_diags.mdREFERENCETechnical background only
140_using_elixact.mdREFERENCEEarly exploration

Strategic Assessment: Current State

✅ What’s Working Well

  • Core Infrastructure: DSPEx has solid foundation with Program, Predict, Example
  • Teleprompter Framework: BootstrapFewShot is complete and functional
  • Test Architecture: Comprehensive 3-mode testing (mock/fallback/live)
  • Client System: Multi-provider support with good error handling

🚨 Critical Blocking Issues

  1. SIMBA Algorithm Failure (BLOCKING): Program selection uses fixed scores instead of real performance
  2. Missing Core Modules: ChainOfThought, ReAct, retrieval system completely absent
  3. Type Safety Gap: No structured validation for LLM inputs/outputs
  4. Elixact Integration: Required for Pydantic-style validation and schema generation

Master Implementation Plan

Phase 1: Critical Fixes (Week 1-2) - BLOCKING ISSUES

Priority: CRITICAL - Nothing else works until these are fixed

1.1 Fix SIMBA Algorithm (CRITICAL)

  • Reference: SIMBA_STATUS.md
  • Issue: Program selection algorithm broken (uses 0.5 fixed scores)
  • Files to Fix:
    • lib/dspex/teleprompter/simba.ex - Fix softmax_sample/3
    • Add real score calculation from program performance
    • Implement top_k_plus_baseline/2 program pool management
  • Success Criteria: SIMBA optimization actually improves program performance
  • Timeline: 2-3 days

1.2 Implement Missing Core Predict Modules

  • Reference: DSPEX_MISSING_COMPONENTS_MASTER_LIST.md Section 🧠
  • Priority Order:
    1. ChainOfThought (Most critical - 40% of DSPy usage)
    2. ReAct (Essential for agent applications)
    3. MultiChainComparison (Quality improvement)
  • Timeline: 1 week for all three

Phase 2: Type Safety & Structured Outputs (Week 3-4)

Priority: HIGH - Enables robust LLM integration

2.1 Elixact Integration

  • Reference: ELIXACT_TODO.md, 212_ELIXACT_typeadapter_etc.md
  • Requirements:
    • Runtime schema generation (critical for DSPy patterns)
    • TypeAdapter functionality
    • Enhanced reference resolution
  • Implementation: Either enhance Elixact OR build DSPEx.Schema wrapper
  • Decision Point: Evaluate Elixact enhancement effort vs. custom solution

2.2 Structured Output Pipeline

  • Reference: 153_copilot_adjusted.md Phase 3
  • Components:
    • Enhanced signature system with type validation
    • Automatic JSON schema generation for LLM APIs
    • Structured response parsing and validation
  • Integration: Use Elixact schemas for signature definitions

Phase 3: Ecosystem Expansion (Week 5-8)

Priority: MEDIUM - Builds toward DSPy parity

3.1 Retrieval System

  • Reference: DSPEX_GAP_ANALYSIS_CORE_03.md
  • Status: 0% implemented - largest functional gap
  • Components:
    • Base DSPEx.Retrieve behavior
    • ChromaDB, Pinecone integrations
    • Embeddings support
  • Impact: Enables RAG applications (major DSPy use case)

3.2 Additional Teleprompters

  • Reference: DSPEX_MISSING_COMPONENTS_MASTER_LIST.md
  • Priority:
    1. MIPROv2 (widely used optimizer)
    2. Ensemble (multiple model combination)
    3. COPRO (curriculum learning)

3.3 Evaluation Framework Enhancement

  • Current: Basic evaluation only
  • Add: Semantic similarity, BLEU/ROUGE scores, statistical analysis

Phase 4: Advanced Features (Week 9-12)

Priority: POLISH - Production readiness

4.1 Advanced Reasoning Patterns

  • Components:
    • ProgramOfThought (code generation)
    • Retry and error handling modules
    • Parallel execution patterns

4.2 Production Features

  • Observability: Enhanced telemetry and monitoring
  • Caching: Sophisticated caching strategies
  • Streaming: Real-time response processing

Implementation Strategy

Development Principles

  1. Fix Before Build: Resolve SIMBA before adding new features
  2. Test-Driven: Use comprehensive test suite from 154_testing_integration_validation.md
  3. Incremental: Each phase delivers working functionality
  4. Performance First: Maintain current performance benchmarks

Resource Allocation

  • 60% Effort: Phases 1-2 (critical foundation)
  • 30% Effort: Phase 3 (ecosystem expansion)
  • 10% Effort: Phase 4 (advanced features)

Risk Mitigation

  • Elixact Decision: Evaluate enhancement effort in first week
  • SIMBA Fix: Highest priority - blocks optimization pipeline
  • Backward Compatibility: Maintain existing API throughout

Success Metrics

  • SIMBA demonstrates measurable program improvement
  • ChainOfThought and ReAct modules functional
  • Type-safe signature system operational
  • Basic retrieval system enables RAG workflows
  • 60%+ DSPy component parity achieved

Next Steps (Immediate Actions)

Week 1 Priorities

  1. Day 1-2: Fix SIMBA program selection algorithm
  2. Day 3-4: Implement ChainOfThought module
  3. Day 5: Evaluate Elixact enhancement requirements vs. custom schema solution

Decision Points

  • Elixact vs. Custom: Based on enhancement complexity analysis
  • Retrieval Priority: ChromaDB vs. Pinecone for first implementation
  • Teleprompter Order: Based on SIMBA fix success

Success Criteria for Week 1

  • SIMBA optimizer shows measurable performance improvement
  • ChainOfThought module passes basic reasoning tests
  • Clear decision on Elixact integration approach
  • Realistic timeline for Phase 2 established

Long-Term Vision (Months 2-3)

Component Parity Target

  • Goal: 80% of DSPy functionality
  • Current: 11% (per gap analysis)
  • Trajectory: 15% improvement per month with focused effort

Unique Elixir Advantages

  • Concurrency: Massive parallel evaluation (10,000+ concurrent)
  • Fault Tolerance: OTP supervision for long-running optimization
  • Hot Code Reloading: Runtime algorithm updates
  • Distribution: Multi-node optimization scaling

Community Impact

  • Documentation: Comprehensive migration guides
  • Examples: Real-world use cases and tutorials
  • Ecosystem: Integration with broader Elixir AI libraries

This roadmap transforms the scattered documentation into a focused, executable plan that addresses critical issues first while building toward comprehensive DSPy parity.