← Back to Gem

20250713 p1of4

Documentation for 20250713_p1of4 from the Dspex repository.

Of course. This is a complex but critical task. Synthesizing all the architectural documents, implementation plans, and strategic analyses into a single, coherent roadmap is the key to successfully executing this vision.

Given the scope, we will build this roadmap incrementally. This first prompt will focus on establishing the High-Level Vision, Phasing, and the detailed plan for Phase 1.


ElixirML/DSPex: The Complete Implementation Roadmap

1. Vision & Strategic Goal

Vision: To create a revolutionary AI/ML development platform in Elixir that combines the declarative programming model of dspy with the concurrency, fault tolerance, and real-time capabilities of the BEAM. The platform will be production-ready, featuring a native Elixir core for orchestration and performance, while strategically bridging to Python for access to its mature data science ecosystem.

End Goal: ElixirML will not be a simple port of dspy. It will be a superior platform for building, optimizing, and deploying complex, multi-agent AI systems, establishing Elixir as a first-class language for AI orchestration. The DSPex library will be the user-facing interface to this platform.

Core Differentiators:

  1. Universal Variable System: A powerful, DSPex-only innovation for optimizing any parameter across the entire system.
  2. OTP-Native Reliability: Leveraging supervisors and GenServers for fault-tolerant, concurrent modules and pipelines.
  3. Hybrid Execution Model: The strategic use of both native Elixir modules and a Python bridge for the best of both worlds.
  4. Integrated MLOps: Built-in support for evaluation, deployment, and monitoring via the Ash framework.

2. High-Level Phasing

The project will be executed in four major phases, each delivering a significant and usable milestone.

  • Phase 1: The Ash-Bridged Foundation (Months 1-2)

    • Goal: Establish a working system that uses Ash resources to manage dspy programs executed via a robust Python bridge. Deliver a functional, if not fully native, experience.
    • Outcome: A deployable application where users can define programs in Elixir, execute them against a Python backend, and manage the lifecycle through a GraphQL API.
  • Phase 2: The Native Execution Core (Months 3-4)

    • Goal: Begin the strategic, incremental port of dspy to native Elixir, focusing on the most performance-critical and concurrency-friendly components.
    • Outcome: A hybrid system where core prediction and reasoning modules run natively in Elixir, providing significant performance and reliability gains for common workloads.
  • Phase 3: Native Optimization & Tooling (Months 5-6)

    • Goal: Port key optimizers and build out the native tool-use and RAG capabilities, further reducing reliance on Python for core workflows.
    • Outcome: A mostly-native platform that can handle complex RAG and ReAct patterns. The Python bridge transitions from a primary backend to a specialized tool for advanced optimizers.
  • Phase 4: Enterprise-Ready Platform (Months 7-9)

    • Goal: Build out the advanced MLOps, multi-model orchestration, and deployment automation features.
    • Outcome: A complete, enterprise-grade platform for building, deploying, and managing sophisticated AI systems.

3. Detailed Roadmap: Phase 1 - The Ash-Bridged Foundation

Objective: To create a robust foundation that delivers immediate value by wrapping the Python dspy library within a production-ready Elixir/Ash framework. This phase prioritizes a stable bridge, a powerful management interface (Ash), and a great developer experience (native signatures).

Month 1: The Core Bridge and Signature System

WeekEpicKey Tasks & Implementation DetailsSuccess Criteria
Week 1(1.1) Solidify Testing & Bridge InfrastructureBased on TESTING_INFRASTRUCTURE_FIX_PLAN.md:
• Implement all event-driven test helpers (SupervisionTestHelpers, BridgeTestHelpers).
• Refactor all Process.sleep calls in the existing codebase to use the new helpers.
• Implement the 3-Layer Testing configuration (mix test.fast, etc.).
Finalize dspy_bridge.py: Add robust error handling, command dispatching for create/execute/delete, and a graceful shutdown command.
✅ All 220+ tests pass reliably.
mix test.all executes successfully across all 3 layers.
✅ Zero Process.sleep calls remain in the codebase.
Week 2(1.2) Native Signature SystemBased on stage1_01_signature_system.md:
DSPex.Signature: Implement the use DSPex.Signature macro and signature DSL.
Compiler: Write the @before_compile hook to parse the AST.
TypeParser: Implement parsing for all basic, ML, and composite types.
Validator: Create the runtime validation logic.
✅ The native DSL (signature ... -> ...) compiles.
validate_inputs/1 and validate_outputs/1 functions are generated and work correctly for all types.
Week 3(1.3) Exdantic Integration & Schema GenerationBased on stage1_05_type_system.md:
• Integrate Exdantic into the DSPex.Signature.Validator for more robust validation and error messages.
• Implement the DSPex.Signature.JsonSchema module to generate OpenAI and Anthropic-compatible schemas from native signatures.
Create the TypeConverter: Centralize logic for converting between Elixir types, Python type strings, and JSON Schema.
MySignature.validate_inputs(%{...}) returns Pydantic-style detailed errors.
MySignature.to_json_schema(:openai) produces a valid function-calling schema.
Week 4(1.4) Adapter Pattern & Core Ash ResourcesBased on stage1_03b_adapter_infrastructure.md & stage1_04_ash_resources.md:
Define Adapter behaviour.
Implement PythonPort adapter: This adapter will use the TypeConverter and call the PythonBridge.
Implement Mock adapter: A simple, in-memory GenServer for fast testing.
Ash Domain and Signature Resource: Create the DSPex.ML.Domain and the DSPex.ML.Signature Ash resource for persisting compiled signature definitions.
DSPex.Adapters.PythonPort.create_program/1 successfully calls the Python bridge.
DSPex.ML.Signature.create!(...) persists a signature to the database.

Month 2: The Ash Application Layer

WeekEpicKey Tasks & Implementation DetailsSuccess Criteria
Week 5(1.5) Program & Execution ResourcesBased on stage1_04_ash_resources.md:
Implement Ash Program Resource: Include belongs_to :signature, status tracking, and configuration attributes.
Implement Ash Execution Resource: Include belongs_to :program, inputs, outputs, status, and performance metrics. Use AshStateMachine.
✅ A Program can be created in Ash, linked to a Signature.
✅ An Execution record can be created, linked to a Program.
Week 6(1.6) Custom Ash Data LayerThis is the most complex task of Phase 1.
Based on STAGE_2_CORE_OPERATIONS.md (pulled forward):
• Create DSPex.DataLayer module.
• Implement the run_query/3 callback.
• For the :execute action on the Program resource, the data layer should:
1. Load the program’s signature.
2. Validate the action’s inputs against the signature.
3. Call the PythonPort adapter to execute the program.
4. Create an Execution record with the result.
DSPex.ML.Program.execute(program, %{...}) successfully runs a program in Python and persists the Execution record in Postgres, all in one call. This is the MVP end-to-end flow.
Week 7(1.7) APIs & Background JobsBased on STAGE_3_PRODUCTION_FEATURES.md (pulled forward):
• Add the AshGraphQL extension to the DSPex.ML.Domain.
• Expose create_program and execute_program mutations.
• Add AshOban extension.
Implement OptimizationJob Resource: A simple resource to track long-running optimizations.
Create OptimizationWorker: An Oban worker that calls an :optimize action on the Python bridge.
✅ Can execute a program via a GraphQL mutation.
✅ Can trigger a (mocked) optimization task that runs as a background job.
Week 8(1.8) Final Integration & PolishWrite Comprehensive Integration Tests: Create tests that cover the full GraphQL -> Ash -> Data Layer -> Adapter -> Bridge -> Python -> Bridge -> Ash -> GraphQL round trip.
Documentation: Write “Getting Started” guides and tutorials for the Phase 1 feature set.
Refactor & Cleanup: Address any tech debt accumulated during the phase.
✅ A complete end-to-end integration test passes reliably.
✅ A developer can follow a README to set up the project and run their first program in under 15 minutes.

This completes the detailed plan for Phase 1. The outcome will be a highly valuable and functional system that delivers on the core promise of a better developer experience for dspy in Elixir, all backed by a production-ready Ash framework. This sets a solid, stable foundation for the more ambitious native implementations in subsequent phases.

Shall we proceed to detail Phase 2?