DSPex V2 Architecture
Overview
DSPex V2 is a complete rewrite: it builds a native Elixir implementation of DSPy and uses Snakepit to integrate with Python DSPy. The architecture enables:
- Gradual Native Implementation: Start with Python DSPy via Snakepit, then add native Elixir implementations over time
- Mixed Execution: Mix and match native/Python implementations in the same pipeline
- First-Class Python Processes: Python processes are integrated as equals, not second-class citizens
- Smart Routing: Automatically choose the best implementation based on availability and performance
Key Design Decisions
1. Snakepit Integration
- Snakepit manages all Python process pooling
- DSPex focuses on the DSPy API and routing logic
- Clean separation of concerns
2. Native-First Where It Makes Sense
- Signatures: Always native (fast parsing, no Python overhead)
- Templates: Native EEx implementation
- Validators: Native for simple validations
- LLM Clients: Adapter pattern with InstructorLite, HTTP, or Python backends
- Complex ML: Delegate to Python (e.g., ColBERTv2, MIPROv2)
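To illustrate why signature parsing is a good fit for native code, here is a minimal sketch of a parser for signature strings such as "query -> keywords: list[str]". The module name `SignatureSketch`, the returned map shape, and the default type of "str" for untyped fields are assumptions made for illustration; they are not the DSPex.Native.Signature API.

```elixir
defmodule SignatureSketch do
  @moduledoc "Hypothetical parser for DSPy-style signature strings (illustration only)."

  # Splits "inputs -> outputs" and parses each side into a list of fields.
  def parse(spec) when is_binary(spec) do
    case String.split(spec, "->", parts: 2) do
      [inputs, outputs] ->
        {:ok, %{inputs: parse_fields(inputs), outputs: parse_fields(outputs)}}

      _ ->
        {:error, :missing_arrow}
    end
  end

  # Fields are comma-separated; blank entries are dropped.
  defp parse_fields(text) do
    text
    |> String.split(",")
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.map(&parse_field/1)
  end

  # "name: type" keeps the given type; a bare "name" defaults to "str"
  # (an assumption of this sketch).
  defp parse_field(field) do
    case String.split(field, ":", parts: 2) do
      [name, type] -> %{name: String.trim(name), type: String.trim(type)}
      [name] -> %{name: String.trim(name), type: "str"}
    end
  end
end

IO.inspect(SignatureSketch.parse("query -> keywords: list[str]"))
```

This sketch splits fields on commas, so a type that itself contains a comma (for example a two-parameter generic) would need a real tokenizer.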
3. Protocol-Agnostic Bridge
- Support multiple serialization formats (JSON, MessagePack, Arrow)
- Extensible for future protocols
- Efficient data transfer
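One way to keep the bridge protocol-agnostic is to put serialization behind a behaviour so formats can be swapped without touching callers. The sketch below assumes hypothetical names (`CodecSketch`, `BridgeSketch`) and uses the built-in Erlang term format purely as a dependency-free stand-in; a JSON, MessagePack, or Arrow codec would implement the same two callbacks.

```elixir
defmodule CodecSketch do
  @moduledoc "Hypothetical codec contract (illustration only)."
  @callback encode(term()) :: {:ok, binary()} | {:error, term()}
  @callback decode(binary()) :: {:ok, term()} | {:error, term()}
end

defmodule CodecSketch.ErlangTerm do
  @moduledoc "Stand-in codec using the Erlang external term format."
  @behaviour CodecSketch

  @impl true
  def encode(term), do: {:ok, :erlang.term_to_binary(term)}

  @impl true
  def decode(binary) when is_binary(binary) do
    # :safe refuses to create new atoms from untrusted input.
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> {:error, :invalid_payload}
  end
end

defmodule BridgeSketch do
  # The caller only knows the codec module, not the wire format.
  def roundtrip(codec, payload) do
    with {:ok, bin} <- codec.encode(payload), do: codec.decode(bin)
  end
end

IO.inspect(BridgeSketch.roundtrip(CodecSketch.ErlangTerm, %{"query" => "explain DSPy"}))
```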
Architecture Components
Core Modules
DSPex - Public API module
- Clean, intuitive interface
- Hides implementation details
- Delegates to appropriate subsystems
DSPex.Router - Smart routing engine
- Tracks available implementations
- Routes to native or Python based on capability
- Collects performance metrics for optimization
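The routing decision can be sketched as a lookup against a registry of available implementations. Everything here is hypothetical: the `RouterSketch` module, the registry shape (operation name mapped to a `MapSet` of `:native` and `:python`), and the option semantics are assumptions for illustration, not the DSPex.Router API. Metric-based selection is left out.

```elixir
defmodule RouterSketch do
  @moduledoc "Hypothetical routing decision (illustration only)."

  # registry: %{operation_name => MapSet of :native | :python}
  def route(registry, operation, opts \\ []) do
    prefer_native = Keyword.get(opts, :prefer_native, true)
    fallback = Keyword.get(opts, :fallback_to_python, true)
    available = Map.get(registry, operation, MapSet.new())

    # Build the ordered list of implementations we are willing to use.
    candidates =
      cond do
        prefer_native and fallback -> [:native, :python]
        prefer_native -> [:native]
        true -> [:python, :native]
      end

    case Enum.find(candidates, &MapSet.member?(available, &1)) do
      nil -> {:error, {:no_implementation, operation}}
      impl -> {:ok, impl}
    end
  end
end

registry = %{
  "signature" => MapSet.new([:native]),
  "dspy.ChainOfThought" => MapSet.new([:python]),
  "template" => MapSet.new([:native, :python])
}

IO.inspect(RouterSketch.route(registry, "template"))
IO.inspect(RouterSketch.route(registry, "dspy.ChainOfThought"))
```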
DSPex.Pipeline - Workflow orchestration
- Sequential, parallel, conditional execution
- Mix native and Python steps seamlessly
- Streaming support (when available)
DSPex.Native.* - Native implementations
- Signature - DSPy signature parsing
- Template - EEx-based templating
- Validator - Data validation
- Metrics - Performance tracking
- LMClient - Adapter-based LLM integration
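The adapter-based LMClient could be modeled as a behaviour with one module per backend. The sketch below is an assumption about shape only: `LMAdapterSketch`, its `generate/2` callback, and the `Echo` adapter are hypothetical names, and no real InstructorLite or HTTP calls are made.

```elixir
defmodule LMAdapterSketch do
  @moduledoc "Hypothetical adapter contract for LLM backends (illustration only)."
  @callback generate(prompt :: String.t(), opts :: keyword()) ::
              {:ok, String.t()} | {:error, term()}
end

defmodule LMAdapterSketch.Echo do
  @moduledoc "Toy adapter that echoes the prompt; stands in for a real backend."
  @behaviour LMAdapterSketch

  @impl true
  def generate(prompt, opts) do
    prefix = Keyword.get(opts, :prefix, "echo: ")
    {:ok, prefix <> prompt}
  end
end

defmodule LMClientSketch do
  # Picks the adapter from opts and forwards the remaining options to it.
  def generate(prompt, opts \\ []) do
    {adapter, adapter_opts} = Keyword.pop(opts, :adapter, LMAdapterSketch.Echo)
    adapter.generate(prompt, adapter_opts)
  end
end

IO.inspect(LMClientSketch.generate("hi"))
```

Switching backends then means passing a different `:adapter` module, which also makes it easy to substitute a canned adapter in fast tests.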
DSPex.Python.* - Python bridge
- Bridge - Snakepit integration
- Registry - Track Python modules
- PoolManager - Lifecycle management
Data Flow
User Request
↓
DSPex API
↓
Router (decides native vs Python)
↓
Native Module ←→ Python Bridge
↓
Snakepit
↓
Python DSPy
Pipeline Example
pipeline = DSPex.pipeline([
  {:native, Signature, spec: "query -> keywords: list[str]"},
  {:python, "dspy.ChainOfThought", signature: "keywords -> analysis"},
  {:parallel, [
    {:native, Search, index: "docs"},
    {:python, "dspy.ColBERTv2", k: 10}
  ]},
  {:native, Template, template: "Results: <%= @results %>"}
])

{:ok, result} = DSPex.run_pipeline(pipeline, %{query: "explain DSPy"})
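To show how sequential and parallel steps might compose, here is a small interpreter sketch. It is not how DSPex.Pipeline is implemented: the `PipelineSketch` module and its step shapes (`{:step, fun}` and `{:parallel, [fun]}`, where each function takes the accumulated map and returns `{:ok, map}` or `{:error, reason}`) are assumptions for illustration. Parallel branches run with `Task.async_stream/3` and their results are merged into the accumulator.

```elixir
defmodule PipelineSketch do
  @moduledoc "Hypothetical pipeline interpreter (illustration only)."

  # Runs steps in order, stopping at the first error.
  def run(steps, input) when is_map(input) do
    Enum.reduce_while(steps, {:ok, input}, fn step, {:ok, acc} ->
      case run_step(step, acc) do
        {:ok, next} -> {:cont, {:ok, next}}
        {:error, _} = err -> {:halt, err}
      end
    end)
  end

  defp run_step({:step, fun}, acc), do: fun.(acc)

  # Every branch receives the same input; results are merged in order.
  defp run_step({:parallel, funs}, acc) do
    funs
    |> Task.async_stream(fn fun -> fun.(acc) end, ordered: true)
    |> Enum.reduce_while({:ok, acc}, fn
      {:ok, {:ok, result}}, {:ok, merged} -> {:cont, {:ok, Map.merge(merged, result)}}
      {:ok, {:error, _} = err}, _ -> {:halt, err}
      {:exit, reason}, _ -> {:halt, {:error, {:task_exit, reason}}}
    end)
  end
end

steps = [
  {:step, fn %{query: q} = acc -> {:ok, Map.put(acc, :keywords, String.split(q))} end},
  {:parallel, [
    fn acc -> {:ok, %{native_hits: length(acc.keywords)}} end,
    fn _acc -> {:ok, %{python_hits: 10}} end
  ]}
]

IO.inspect(PipelineSketch.run(steps, %{query: "explain DSPy"}))
```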
Testing Strategy
Three-layer testing architecture:
Layer 1: Mock Adapter (~70ms)
- Unit tests with mocked Python responses
- Fast feedback during development
Layer 2: Bridge Mock
- Protocol testing without full Python
- Validates serialization/deserialization
Layer 3: Full Integration
- Complete end-to-end testing
- Requires Python environment
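As a rough picture of Layer 1, a mock can stand in for the Python side by returning canned responses keyed by module name. The `MockBridgeSketch` module and its `call/3` function are hypothetical and only illustrate the idea; the actual mock adapter may look quite different.

```elixir
defmodule MockBridgeSketch do
  @moduledoc "Hypothetical stand-in for the Python bridge in fast unit tests."

  # canned: %{python_module_name => fixed value | 1-arity function of args}
  def call(canned, module, args) do
    case Map.fetch(canned, module) do
      {:ok, fun} when is_function(fun, 1) -> {:ok, fun.(args)}
      {:ok, value} -> {:ok, value}
      :error -> {:error, {:unmocked, module}}
    end
  end
end

canned = %{
  "dspy.ChainOfThought" => %{"analysis" => "canned analysis"},
  "dspy.ColBERTv2" => fn %{"k" => k} -> Enum.map(1..k, &"doc-#{&1}") end
}

IO.inspect(MockBridgeSketch.call(canned, "dspy.ColBERTv2", %{"k" => 3}))
```

Returning an error for unmocked modules makes a missing fixture fail loudly instead of silently reaching for a real Python process.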
Next Steps
Python Environment Setup
- Install DSPy in Python environment
- Create bridge scripts for DSPy modules
- Test end-to-end integration
LLM Adapter Implementation
- Add InstructorLite to dependencies
- Implement InstructorLite adapter
- Create HTTP adapter for direct API calls
- Test adapter switching and configuration
Additional Native Modules
- Implement more DSPy modules natively
- Focus on performance-critical operations
- Maintain API compatibility
Performance Optimization
- Profile router decisions
- Optimize serialization
- Add caching where beneficial
Advanced Features
- Streaming support (pending Snakepit implementation)
- Distributed execution
- Model management
Development Commands
# Run tests by layer
mix test.fast # Layer 1: Mock adapter
mix test.protocol # Layer 2: Bridge mock
mix test.integration # Layer 3: Full integration
mix test.all # All layers sequentially
# Check code quality
mix dialyzer # Type checking
mix format # Code formatting
mix credo # Static analysis
# Development
iex -S mix # Interactive shell
Configuration
# config/config.exs
config :dspex,
  router: [
    prefer_native: true,
    fallback_to_python: true
  ]

config :snakepit,
  python_path: "python3",
  pool_size: 4
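For reference, router settings like the ones above can be read at runtime with `Application.get_env/3` and merged over defaults. The `RouterConfigSketch` module and its default values are assumptions for illustration; the `Application.put_env/3` call only simulates what config/config.exs would set.

```elixir
defmodule RouterConfigSketch do
  @moduledoc "Hypothetical reader for the :dspex router settings (illustration only)."

  # Defaults used when a key is absent from the application environment.
  @defaults [prefer_native: true, fallback_to_python: true]

  def load(app \\ :dspex) do
    configured = Application.get_env(app, :router, [])
    Keyword.merge(@defaults, configured)
  end
end

# Simulate a config file that overrides a single key.
Application.put_env(:dspex, :router, [prefer_native: false])

opts = RouterConfigSketch.load()
IO.inspect(Keyword.get(opts, :prefer_native))
IO.inspect(Keyword.get(opts, :fallback_to_python))
```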
Status
- ✅ Core architecture implemented
- ✅ Native signature parsing
- ✅ Router with smart delegation
- ✅ Pipeline orchestration
- ✅ Snakepit integration
- ✅ Clean compilation
- ✅ Dialyzer passing
- 🚧 Python DSPy scripts
- 🚧 End-to-end testing
- 🚧 Performance benchmarks
- 🚧 Documentation