← Back to Gap analysis

DSPEX MISSING COMPONENTS MASTER LIST

Documentation for DSPEX_MISSING_COMPONENTS_MASTER_LIST from the Ds ex repository.

DSPEx Missing Components: Master List

Overview

This document provides a comprehensive catalog of ALL DSPy components that are missing or incomplete in DSPEx. It serves as the definitive reference for understanding the scope of work needed to achieve feature parity with the Python DSPy library.


🚨 CRITICAL BLOCKING ISSUES (Must Fix Immediately)

1. SIMBA Teleprompter - Algorithmic Failures

  • Program Selection Algorithm: Uses fixed scores (0.5) instead of real performance
  • Program Pool Management: Missing top_k_plus_baseline() logic
  • Score Calculation: Missing calc_average_score() function
  • Main Loop Integration: Placeholder logic instead of real algorithm

Impact: SIMBA optimizer completely non-functional Priority: CRITICAL - Fix before any other work


📚 TELEPROMPTERS/OPTIMIZERS

Implemented (3/10)

  • ✅ BootstrapFewShot (Complete)
  • ⚠️ BEACON (Infrastructure only, missing Bayesian optimization)
  • ⚠️ SIMBA (Broken - see critical issues above)

Missing Teleprompters (7/10)

1. MIPROv2 - Multi-prompt Instruction Proposal

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/mipro_optimizer_v2.py Features:

  • Multi-step instruction optimization
  • Automatic prompt proposal and refinement
  • Bayesian optimization for hyperparameters
  • Support for multiple signature optimization
  • Meta-learning components

2. COPRO - Curriculum-based Prompt Optimizer

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/copro_optimizer.py Features:

  • Curriculum learning approach
  • Progressive difficulty increase
  • Adaptive example selection
  • Multi-stage optimization

3. Ensemble

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/ensemble.py Features:

  • Multiple model combination
  • Voting mechanisms
  • Confidence-weighted averaging
  • Diversity-based selection

4. BootstrapFinetune

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/bootstrap_finetune.py Features:

  • Model fine-tuning on generated data
  • Bootstrap data generation
  • Training pipeline integration
  • Model adapter support

5. BootstrapFewShotWithRandomSearch

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/random_search.py Features:

  • Random search optimization
  • Hyperparameter exploration
  • Bootstrap with random sampling

6. BootstrapFewShotWithOptuna

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/teleprompt_optuna.py Features:

  • Optuna-based hyperparameter optimization
  • Advanced search strategies
  • Multi-objective optimization

7. BetterTogether

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/bettertogether.py Features:

  • Multi-agent collaboration
  • Program composition optimization
  • Joint training strategies

8. AvatarOptimizer

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/avatar_optimizer.py Features:

  • Avatar-based optimization
  • Persona-driven prompting
  • Character consistency

9. InferRules

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/infer_rules.py Features:

  • Automatic rule inference
  • Pattern extraction from examples
  • Rule-based program enhancement

10. LabeledFewShot (Vanilla)

Status: ❌ Missing entirely DSPy File: dspy/teleprompt/vanilla.py Features:

  • Simple labeled few-shot learning
  • No optimization, just demonstration
  • Baseline comparison method

🧠 PREDICT MODULES

Implemented (2/15)

  • ✅ Predict (Complete)
  • ⚠️ PredictStructured (Bypasses client system)

Missing Predict Modules (13/15)

1. ChainOfThought

Status: ❌ Missing entirely DSPy File: dspy/predict/chain_of_thought.py Features:

  • Step-by-step reasoning
  • Rationale generation
  • Intermediate thought tracking

2. ChainOfThoughtWithHint

Status: ❌ Missing entirely DSPy File: dspy/predict/chain_of_thought_with_hint.py Features:

  • CoT with external hints
  • Guided reasoning
  • Hint integration

3. ReAct (Reason + Act)

Status: ❌ Missing entirely DSPy File: dspy/predict/react.py Features:

  • Thought-action-observation loops
  • Tool integration
  • Multi-step reasoning
  • Action space definition

4. ProgramOfThought

Status: ❌ Missing entirely DSPy File: dspy/predict/program_of_thought.py Features:

  • Code generation capabilities
  • Code execution environment
  • Result integration
  • Mathematical reasoning

5. MultiChainComparison

Status: ❌ Missing entirely DSPy File: dspy/predict/multi_chain_comparison.py Features:

  • Multiple reasoning chains
  • Comparison mechanisms
  • Best chain selection
  • Confidence scoring

6. BestOfN

Status: ❌ Missing entirely DSPy File: dspy/predict/best_of_n.py Features:

  • Multiple generation sampling
  • Best response selection
  • Quality-based filtering

7. Refine

Status: ❌ Missing entirely DSPy File: dspy/predict/refine.py Features:

  • Iterative refinement
  • Draft-and-revise pattern
  • Quality improvement loops

8. Retry

Status: ❌ Missing entirely DSPy File: dspy/predict/retry.py Features:

  • Automatic retry on failure
  • Backoff strategies
  • Error-specific retry logic
  • Max attempt limits

9. Parallel

Status: ⚠️ Infrastructure exists but missing DSPy interface DSPy File: dspy/predict/parallel.py Features:

  • Standardized parallel execution
  • Result aggregation patterns
  • Error handling across tasks

10. Aggregation

Status: ❌ Missing entirely DSPy File: dspy/predict/aggregation.py Features:

  • Response aggregation strategies
  • Majority voting
  • Weighted combinations

11. CodeAct

Status: ❌ Missing entirely DSPy File: dspy/predict/code_act.py Features:

  • Code-based action execution
  • Programming environment integration
  • Interactive code generation

12. KNN (K-Nearest Neighbors)

Status: ❌ Missing entirely DSPy File: dspy/predict/knn.py Features:

  • K-nearest neighbor retrieval
  • Example-based prediction
  • Similarity-based reasoning

13. Parameter

Status: ❌ Missing entirely DSPy File: dspy/predict/parameter.py Features:

  • Parameter management
  • Learnable parameters
  • Optimization state tracking

🔍 RETRIEVAL SYSTEM

Implemented (0/25)

Status:COMPLETELY MISSING - This is the largest functional gap

Missing Retrieval Components (25/25)

Core Retrieval Framework

  1. DSPEx.Retrieve - Base retrieval behavior
  2. DSPEx.Retrieve.Embeddings - Basic embeddings retrieval

Vector Database Integrations

  1. ChromaDB (dspy/retrieve/chromadb_rm.py)
  2. Pinecone (dspy/retrieve/pinecone_rm.py)
  3. Weaviate (dspy/retrieve/weaviate_rm.py)
  4. Qdrant (dspy/retrieve/qdrant_rm.py)
  5. Milvus (dspy/retrieve/milvus_rm.py)
  6. FAISS (dspy/retrieve/faiss_rm.py)
  7. LanceDB (dspy/retrieve/lancedb_rm.py)
  8. Deeplake (dspy/retrieve/deeplake_rm.py)
  9. Epsilla (dspy/retrieve/epsilla_rm.py)
  10. MyScale (dspy/retrieve/my_scale_rm.py)
  11. MongoDB Atlas (dspy/retrieve/mongodb_atlas_rm.py)
  12. PGVector (dspy/retrieve/pgvector_rm.py)
  13. Neo4j (dspy/retrieve/neo4j_rm.py)
  14. FalkorDB (dspy/retrieve/falkordb_rm.py)

Specialized Retrievers

  1. ColBERTv2 (dspy/dsp/colbertv2.py) - Dense retrieval with late interaction
  2. Azure AI Search (dspy/retrieve/azureaisearch_rm.py)
  3. Databricks (dspy/retrieve/databricks_rm.py)
  4. Clarifai (dspy/retrieve/clarifai_rm.py)
  5. Marqo (dspy/retrieve/marqo_rm.py)
  6. Snowflake (dspy/retrieve/snowflake_rm.py)
  7. Vectara (dspy/retrieve/vectara_rm.py)
  8. Watson Discovery (dspy/retrieve/watson_discovery_rm.py)
  9. You.com (dspy/retrieve/you_rm.py)

Additional Retrieval Features

  • LlamaIndex Integration (dspy/retrieve/llama_index_rm.py)
  • RAGatouille (dspy/retrieve/ragatouille_rm.py)
  • Hybrid Search capabilities
  • Reranking mechanisms

📊 EVALUATION SYSTEM

Implemented (1/10)

  • ⚠️ Basic evaluation framework (limited)

Missing Evaluation Components (9/10)

1. Advanced Metrics

DSPy File: dspy/evaluate/metrics.py

  • Answer Exact Match improvements
  • Answer Passage Match
  • Semantic F1 Score
  • BLEU Score
  • ROUGE Score
  • BERTScore

2. Evaluation Framework Enhancements

DSPy File: dspy/evaluate/evaluate.py

  • Multi-threaded evaluation
  • Progress display
  • Result tables
  • Statistical analysis
  • Error breakdown

3. Auto-Evaluation

DSPy File: dspy/evaluate/auto_evaluation.py

  • LLM-based evaluation
  • Reference-free metrics
  • Quality assessment

4. Specialized Evaluators

  • CompleteAndGrounded
  • Faithfulness metrics
  • Relevance scoring
  • Hallucination detection

🛡️ ASSERTIONS AND CONSTRAINTS

Implemented (0/5)

Status:COMPLETELY MISSING - Core DSPy feature

Missing Assertion Components (5/5)

1. Assertion Framework

DSPy File: dspy/primitives/assertions.py

  • dspy.Assert() - Hard constraints with retry
  • dspy.Suggest() - Soft hints for improvement
  • Context management - Assertion integration
  • Backtracking - Retry with constraints
  • Constraint satisfaction - Runtime validation

🎛️ ADAPTERS AND TYPES

Implemented (2/8)

  • ✅ ChatAdapter (Basic)
  • ✅ JSONAdapter (Basic)

Missing Adapter Components (6/8)

1. Advanced Adapters

  • TwoStepAdapter (dspy/adapters/two_step_adapter.py)
  • Adapter utilities (dspy/adapters/utils.py)

2. Type System

  • Image types (dspy/adapters/types/image.py)
  • Audio types (dspy/adapters/types/audio.py)
  • Tool types (dspy/adapters/types/tool.py)
  • History types (dspy/adapters/types/history.py)

🏗️ PRIMITIVES AND CORE COMPONENTS

Implemented (4/8)

  • ✅ Example (Complete)
  • ✅ Module (Complete)
  • ✅ Program (Complete)
  • ✅ Prediction (Complete)

Missing Primitive Components (4/8)

1. Python Interpreter

DSPy File: dspy/primitives/python_interpreter.py

  • Safe code execution
  • Sandbox environment
  • Result handling

2. Enhanced Module Features

  • Module composition patterns
  • Parameter management
  • State serialization
  • Module introspection

🌐 CLIENT AND MODEL SYSTEM

Implemented (3/20)

  • ✅ Basic Client (Limited providers)
  • ✅ ClientManager (Good)
  • ✅ OpenAI integration (Basic)

Missing Client Components (17/20)

1. Model Providers (Missing 15/20)

DSPy integrates with 100+ models via LiteLLM

  • Anthropic Claude
  • Google Gemini (partial)
  • Cohere
  • Hugging Face
  • Azure OpenAI
  • AWS Bedrock
  • Google Vertex AI
  • Local model support
  • Ollama integration
  • vLLM integration
  • Together AI
  • Anyscale
  • Groq
  • Fireworks AI
  • And 50+ more providers

2. Embedding Models (Missing 5/5)

  • OpenAI embeddings
  • Cohere embeddings
  • Hugging Face embeddings
  • Local embedding models
  • Embedding caching

3. Advanced Client Features (Missing 5/5)

  • Rate limiting (has stub)
  • Circuit breakers (noted as bypassed)
  • Advanced caching
  • Request retries
  • Model fallbacks

🔧 TOOLS AND UTILITIES

Implemented (2/15)

  • ✅ Basic caching
  • ✅ Basic logging

Missing Utility Components (13/15)

1. Advanced Utilities

  • Streaming support (dspy/streaming/)
  • Asyncify utilities (dspy/utils/asyncify.py)
  • Usage tracking (dspy/utils/usage_tracker.py)
  • Saving/loading (dspy/utils/saving.py)
  • History inspection (dspy/utils/inspect_history.py)
  • Parallelizer (dspy/utils/parallelizer.py)
  • Unbatchify (dspy/utils/unbatchify.py)
  • Exception handling (dspy/utils/exceptions.py)

2. Tool Integration

  • LangChain tools (dspy/utils/langchain_tool.py)
  • MCP (Model Context Protocol) (dspy/utils/mcp.py)
  • Python interpreter (dspy/primitives/python_interpreter.py)

3. Observability

  • Advanced telemetry
  • Distributed tracing
  • Performance monitoring
  • Error analytics

🗂️ DATASETS AND DATA HANDLING

Implemented (0/10)

Status: ❌ No dataset utilities

Missing Dataset Components (10/10)

1. Built-in Datasets

DSPy File: dspy/datasets/

  • GSM8K (dspy/datasets/gsm8k.py)
  • HotpotQA (dspy/datasets/hotpotqa.py)
  • Math datasets (dspy/datasets/math.py)
  • Colors dataset (dspy/datasets/colors.py)

2. Data Loading

  • DataLoader (dspy/datasets/dataloader.py)
  • Dataset utilities (dspy/datasets/dataset.py)
  • Example loading patterns

3. Specialized Datasets

  • ALFWorld (dspy/datasets/alfworld/)
  • Custom dataset loaders

📈 EXPERIMENTAL FEATURES

Implemented (0/5)

Status: ❌ No experimental features

Missing Experimental Components (5/5)

1. Advanced Features

DSPy File: dspy/experimental/

  • Module graph analysis (module_graph.py)
  • Synthetic data generation (synthetic_data.py)
  • Synthesizer framework (synthesizer/)

📊 COMPREHENSIVE GAP SUMMARY

CategoryTotal ComponentsImplementedMissingCompletion %
Teleprompters101 (partial)910%
Predict Modules1521313%
Retrieval System250250%
Evaluation101 (partial)910%
Assertions5050%
Adapters/Types82625%
Primitives84450%
Client/Models2031715%
Tools/Utilities1521313%
Datasets100100%
Experimental5050%
TOTAL1311511611%

🎯 CRITICAL PATH FOR FUNCTIONALITY

Phase 1: Fix SIMBA (BLOCKING)

  1. ✅ Program selection algorithm
  2. ✅ Program pool management
  3. ✅ Score calculation logic
  4. ✅ Main loop integration

Phase 2: Core Functionality (HIGH IMPACT)

  1. ✅ ChainOfThought module
  2. ✅ Basic retrieval system
  3. ✅ Assertions framework
  4. ✅ ReAct module

Phase 3: Ecosystem Expansion (MEDIUM IMPACT)

  1. ✅ Vector database integrations (ChromaDB, Pinecone)
  2. ✅ Advanced evaluation metrics
  3. ✅ Additional teleprompters (MIPROv2)
  4. ✅ More predict modules

Phase 4: Production Features (POLISH)

  1. ✅ Advanced caching and observability
  2. ✅ Streaming support
  3. ✅ Additional model providers
  4. ✅ Dataset utilities

🏆 SUCCESS METRICS

  1. SIMBA Works: Fix blocking algorithmic issues
  2. RAG Capability: End-to-end retrieval-augmented generation
  3. Advanced Reasoning: ChainOfThought, ReAct, MultiChain
  4. Production Ready: Robust error handling, monitoring, caching
  5. Ecosystem Parity: 80%+ component coverage compared to DSPy

Current Status: 11% component parity Target Status: 80%+ component parity

This master list shows DSPEx has a solid foundation (excellent infrastructure) but needs substantial work to match DSPy’s comprehensive ecosystem. The most critical path is fixing SIMBA’s algorithmic issues, then building core reasoning modules and the retrieval system.