← Back to Python adapter generalization

Generalization requirements

Documentation for generalization-requirements from the Dspex repository.

Generalization Requirements

Overview

To support multiple ML frameworks beyond DSPy, the DSPex Python adapter needs specific generalizations while maintaining its robust infrastructure. This document outlines the requirements for creating a framework-agnostic bridge system.

Core Requirements

1. Framework Independence

Current State: Tightly coupled to DSPy concepts (signatures, programs, LM configuration)

Required Changes:

  • Abstract framework-specific operations into pluggable modules
  • Define generic resource management interface (create, execute, delete)
  • Support different initialization patterns per framework

Success Criteria:

  • Can add new ML framework without modifying core infrastructure
  • Each framework maintains its native concepts and APIs
  • No DSPy dependencies in core bridge code

2. Pluggable Architecture

Python Side Requirements:

  • Base bridge class handling protocol and communication
  • Framework-specific bridges inherit base functionality
  • Dynamic command handler registration
  • Framework capability discovery

Elixir Side Requirements:

  • Generic adapter behaviour for common operations
  • Framework-specific adapters for specialized features
  • Bridge registry for managing multiple frameworks
  • Runtime bridge selection

3. Configuration Management

Requirements:

  • Centralized bridge configuration
  • Per-bridge Python script paths
  • Required package validation
  • Pool configuration per bridge
  • Environment variable support

Example Configuration:

config :dspex, :ml_bridges,
  default: :dspy,
  bridges: [
    dspy: %{
      adapter: DSPex.Adapters.DSPyAdapter,
      python_script: "priv/python/dspy_bridge.py",
      pool_size: 4,
      required_env: ["GEMINI_API_KEY"]
    },
    langchain: %{
      adapter: DSPex.Adapters.LangChainAdapter,
      python_script: "priv/python/langchain_bridge.py",
      pool_size: 2,
      required_env: ["OPENAI_API_KEY"]
    }
  ]

4. Unified Interface

Requirements:

  • Common interface for basic ML operations
  • Framework-specific extensions when needed
  • Type safety for framework-specific features
  • Backward compatibility with existing DSPy code

Interface Design:

# Generic operations
MLBridge.create_resource(bridge, type, config)
MLBridge.execute_resource(bridge, resource_id, inputs)
MLBridge.list_resources(bridge, type)
MLBridge.delete_resource(bridge, resource_id)

# Framework-specific access
{:ok, adapter} = MLBridge.get_adapter(:langchain)
LangChainAdapter.create_chain(adapter, chain_config)

5. Protocol Extensions

Requirements:

  • Maintain existing wire protocol
  • Support framework-specific command namespacing
  • Resource type abstraction
  • Streaming response support (for LLMs)

Command Namespacing:

{
  "command": "framework:operation",
  "examples": [
    "common:ping",
    "common:get_stats",
    "dspy:create_program",
    "langchain:create_chain",
    "custom:train_model"
  ]
}

Use Case Requirements

1. LangChain Integration

Specific Needs:

  • Chain and agent management
  • Multiple LLM provider support
  • Tool/function calling
  • Memory and context management
  • Streaming responses

Resource Types:

  • Chains
  • Agents
  • Tools
  • Memory stores
  • Prompts

2. Hugging Face Transformers

Specific Needs:

  • Model loading and caching
  • Tokenizer management
  • Batch processing
  • GPU resource management
  • Model fine-tuning

Resource Types:

  • Models
  • Tokenizers
  • Pipelines
  • Datasets
  • Training runs

3. Custom ML Frameworks

Specific Needs:

  • Arbitrary Python code execution
  • Custom resource definitions
  • State management
  • Long-running operations
  • Progress callbacks

Resource Types:

  • User-defined resources
  • Custom operations
  • State objects

Technical Requirements

1. Performance

  • Maintain current performance characteristics
  • Pool startup time < 5 seconds
  • Request latency < 10ms overhead
  • Support concurrent operations per framework
  • Efficient resource cleanup

2. Error Handling

  • Framework-specific error translation
  • Graceful degradation on framework errors
  • Clear error messages for missing dependencies
  • Recovery strategies per framework

3. Monitoring

  • Per-framework metrics
  • Resource usage tracking
  • Framework-specific health checks
  • Performance profiling hooks

4. Testing

  • Framework mock capabilities
  • Integration test helpers per framework
  • Performance benchmarks
  • Chaos engineering support

Migration Requirements

1. Backward Compatibility

  • Existing DSPy code must continue working
  • Gradual migration path
  • Deprecation warnings for old APIs
  • Documentation for migration

2. Incremental Adoption

  • Can use new architecture alongside old
  • Per-module migration
  • Feature flags for new functionality
  • Rollback capabilities

3. Documentation

  • Migration guide
  • Framework integration guide
  • API reference
  • Example implementations

Security Requirements

1. Code Execution

  • Sandboxed Python execution options
  • Resource limits per framework
  • Timeout enforcement
  • Memory limits

2. Dependency Management

  • Package verification
  • Version pinning
  • Security scanning
  • Isolated environments

3. Access Control

  • Per-framework permissions
  • Resource access control
  • API key management
  • Audit logging

Success Metrics

1. Extensibility

  • Time to add new framework < 1 day
  • No core code changes for new frameworks
  • Clear integration points

2. Performance

  • No regression in existing benchmarks
  • Pool overhead < 5% for any framework
  • Startup time comparable across frameworks

3. Adoption

  • Easy migration from current code
  • Clear documentation and examples
  • Community contributions

4. Reliability

  • Maintain current error rates
  • Framework isolation (one framework’s errors don’t affect others)
  • Graceful degradation