Of course. This is a complex but critical task. Synthesizing all the architectural documents, implementation plans, and strategic analyses into a single, coherent roadmap is the key to successfully executing this vision.
Given the scope, we will build this roadmap incrementally. This first installment establishes the High-Level Vision, the Phasing, and the detailed plan for Phase 1.
ElixirML/DSPex: The Complete Implementation Roadmap
1. Vision & Strategic Goal
Vision: To create a revolutionary AI/ML development platform in Elixir that combines the declarative programming model of dspy with the concurrency, fault tolerance, and real-time capabilities of the BEAM. The platform will be production-ready, featuring a native Elixir core for orchestration and performance, while strategically bridging to Python for access to its mature data science ecosystem.
End Goal: ElixirML will not be a simple port of dspy. It will be a superior platform for building, optimizing, and deploying complex, multi-agent AI systems, establishing Elixir as a first-class language for AI orchestration. The DSPex library will be the user-facing interface to this platform.
Core Differentiators:
- Universal Variable System: A powerful, DSPex-only innovation for optimizing any parameter across the entire system.
- OTP-Native Reliability: Leveraging supervisors and GenServers for fault-tolerant, concurrent modules and pipelines.
- Hybrid Execution Model: The strategic use of both native Elixir modules and a Python bridge for the best of both worlds.
- Integrated MLOps: Built-in support for evaluation, deployment, and monitoring via the Ash framework.
2. High-Level Phasing
The project will be executed in four major phases, each delivering a significant and usable milestone.
Phase 1: The Ash-Bridged Foundation (Months 1-2)
- Goal: Establish a working system that uses Ash resources to manage dspy programs executed via a robust Python bridge. Deliver a functional, if not fully native, experience.
- Outcome: A deployable application where users can define programs in Elixir, execute them against a Python backend, and manage the lifecycle through a GraphQL API.
Phase 2: The Native Execution Core (Months 3-4)
- Goal: Begin the strategic, incremental port of dspy to native Elixir, focusing on the most performance-critical and concurrency-friendly components.
- Outcome: A hybrid system where core prediction and reasoning modules run natively in Elixir, providing significant performance and reliability gains for common workloads.
Phase 3: Native Optimization & Tooling (Months 5-6)
- Goal: Port key optimizers and build out the native tool-use and RAG capabilities, further reducing reliance on Python for core workflows.
- Outcome: A mostly-native platform that can handle complex RAG and ReAct patterns. The Python bridge transitions from a primary backend to a specialized tool for advanced optimizers.
Phase 4: Enterprise-Ready Platform (Months 7-9)
- Goal: Build out the advanced MLOps, multi-model orchestration, and deployment automation features.
- Outcome: A complete, enterprise-grade platform for building, deploying, and managing sophisticated AI systems.
3. Detailed Roadmap: Phase 1 - The Ash-Bridged Foundation
Objective: To create a robust foundation that delivers immediate value by wrapping the Python dspy library within a production-ready Elixir/Ash framework. This phase prioritizes a stable bridge, a powerful management interface (Ash), and a great developer experience (native signatures).
Month 1: The Core Bridge and Signature System
Week | Epic | Key Tasks & Implementation Details | Success Criteria |
---|---|---|---|
Week 1 | (1.1) Solidify Testing & Bridge Infrastructure | Based on TESTING_INFRASTRUCTURE_FIX_PLAN.md: • Implement all event-driven test helpers (SupervisionTestHelpers, BridgeTestHelpers). • Refactor all Process.sleep calls in the existing codebase to use the new helpers. • Implement the 3-Layer Testing configuration (mix test.fast, etc.). • Finalize dspy_bridge.py: Add robust error handling, command dispatching for create/execute/delete, and a graceful shutdown command. | ✅ All 220+ tests pass reliably. ✅ mix test.all executes successfully across all 3 layers. ✅ Zero Process.sleep calls remain in the codebase. |
Week 2 | (1.2) Native Signature System | Based on stage1_01_signature_system.md: • DSPex.Signature: Implement the use DSPex.Signature macro and signature DSL. • Compiler: Write the @before_compile hook to parse the AST. • TypeParser: Implement parsing for all basic, ML, and composite types. • Validator: Create the runtime validation logic. | ✅ The native DSL (signature ... -> ...) compiles. ✅ validate_inputs/1 and validate_outputs/1 functions are generated and work correctly for all types. |
Week 3 | (1.3) Exdantic Integration & Schema Generation | Based on stage1_05_type_system.md: • Integrate Exdantic into the DSPex.Signature.Validator for more robust validation and error messages. • Implement the DSPex.Signature.JsonSchema module to generate OpenAI- and Anthropic-compatible schemas from native signatures. • Create the TypeConverter: Centralize logic for converting between Elixir types, Python type strings, and JSON Schema. | ✅ MySignature.validate_inputs(%{...}) returns Pydantic-style detailed errors. ✅ MySignature.to_json_schema(:openai) produces a valid function-calling schema. |
Week 4 | (1.4) Adapter Pattern & Core Ash Resources | Based on stage1_03b_adapter_infrastructure.md & stage1_04_ash_resources.md: • Define the Adapter behaviour. • Implement the PythonPort adapter: This adapter will use the TypeConverter and call the PythonBridge. • Implement the Mock adapter: A simple, in-memory GenServer for fast testing. • Ash Domain and Signature Resource: Create the DSPex.ML.Domain and the DSPex.ML.Signature Ash resource for persisting compiled signature definitions. | ✅ DSPex.Adapters.PythonPort.create_program/1 successfully calls the Python bridge. ✅ DSPex.ML.Signature.create!(...) persists a signature to the database. |
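To make the Week 1 bridge work more concrete, here is a minimal sketch of what command dispatching with error handling and a graceful shutdown might look like on the Python side. Everything about the wire format is an assumption made for illustration: the roadmap does not specify how dspy_bridge.py frames messages, so this sketch uses one JSON object per line, and the field names (`id`, `command`, `args`, `program_id`) and the `Bridge` class are hypothetical. The `execute` handler only echoes its inputs; a real bridge would invoke the dspy program there.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator


class Bridge:
    """Hypothetical dispatcher; names and message shape are assumptions."""

    def __init__(self) -> None:
        self.programs: Dict[str, Dict[str, Any]] = {}
        self.running = True
        # Command name -> handler, so unknown commands fail in one place.
        self.handlers: Dict[str, Callable[[Dict[str, Any]], Any]] = {
            "create": self.create,
            "execute": self.execute,
            "delete": self.delete,
            "shutdown": self.shutdown,
        }

    def create(self, args: Dict[str, Any]) -> Dict[str, Any]:
        program_id = args["program_id"]
        self.programs[program_id] = {"signature": args.get("signature")}
        return {"program_id": program_id}

    def execute(self, args: Dict[str, Any]) -> Dict[str, Any]:
        program = self.programs[args["program_id"]]  # KeyError if unknown
        # Placeholder: a real bridge would run the dspy program here.
        return {"signature": program["signature"], "inputs": args.get("inputs", {})}

    def delete(self, args: Dict[str, Any]) -> Dict[str, Any]:
        del self.programs[args["program_id"]]
        return {"deleted": args["program_id"]}

    def shutdown(self, args: Dict[str, Any]) -> Dict[str, Any]:
        self.running = False  # the serve loop stops after replying
        return {"stopping": True}

    def handle(self, line: str) -> str:
        """Turn one request line into one response line; never raises."""
        request_id = None
        try:
            request = json.loads(line)
            request_id = request.get("id")
            handler = self.handlers.get(request.get("command"))
            if handler is None:
                raise ValueError(f"unknown command: {request.get('command')!r}")
            result = handler(request.get("args", {}))
            response = {"id": request_id, "status": "ok", "result": result}
        except Exception as exc:  # report the failure instead of crashing
            response = {
                "id": request_id,
                "status": "error",
                "error": f"{type(exc).__name__}: {exc}",
            }
        return json.dumps(response)


def serve(bridge: Bridge, lines: Iterable[str]) -> Iterator[str]:
    """Yield one response per non-empty request line until shutdown."""
    for line in lines:
        if not line.strip():
            continue
        yield bridge.handle(line)
        if not bridge.running:
            break
```

In a standalone process this could be driven with `for response in serve(Bridge(), sys.stdin): print(response, flush=True)`. Echoing the request `id` back in every response, including errors, is what would let the Elixir side match replies to pending calls.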
Month 2: The Ash Application Layer
Week | Epic | Key Tasks & Implementation Details | Success Criteria |
---|---|---|---|
Week 5 | (1.5) Program & Execution Resources | Based on stage1_04_ash_resources.md: • Implement Ash Program Resource: Include belongs_to :signature, status tracking, and configuration attributes. • Implement Ash Execution Resource: Include belongs_to :program, inputs, outputs, status, and performance metrics. Use AshStateMachine. | ✅ A Program can be created in Ash, linked to a Signature. ✅ An Execution record can be created, linked to a Program. |
Week 6 | (1.6) Custom Ash Data Layer | This is the most complex task of Phase 1. Based on STAGE_2_CORE_OPERATIONS.md (pulled forward): • Create DSPex.DataLayer module. • Implement the run_query/3 callback. • For the :execute action on the Program resource, the data layer should: 1. Load the program’s signature. 2. Validate the action’s inputs against the signature. 3. Call the PythonPort adapter to execute the program. 4. Create an Execution record with the result. | ✅ DSPex.ML.Program.execute(program, %{...}) successfully runs a program in Python and persists the Execution record in Postgres, all in one call. This is the MVP end-to-end flow. |
Week 7 | (1.7) APIs & Background Jobs | Based on STAGE_3_PRODUCTION_FEATURES.md (pulled forward): • Add the AshGraphQL extension to the DSPex.ML.Domain. • Expose create_program and execute_program mutations. • Add AshOban extension. • Implement OptimizationJob Resource: A simple resource to track long-running optimizations. • Create OptimizationWorker: An Oban worker that calls an :optimize action on the Python bridge. | ✅ Can execute a program via a GraphQL mutation. ✅ Can trigger a (mocked) optimization task that runs as a background job. |
Week 8 | (1.8) Final Integration & Polish | • Write Comprehensive Integration Tests: Create tests that cover the full GraphQL -> Ash -> Data Layer -> Adapter -> Bridge -> Python -> Bridge -> Ash -> GraphQL round trip. • Documentation: Write “Getting Started” guides and tutorials for the Phase 1 feature set. • Refactor & Cleanup: Address any tech debt accumulated during the phase. | ✅ A complete end-to-end integration test passes reliably. ✅ A developer can follow a README to set up the project and run their first program in under 15 minutes. |
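The four-step :execute flow from Week 6 (load the signature, validate inputs, call the adapter, record an Execution) can be sketched in a language-neutral way. The roadmap places this logic in Elixir inside DSPex.DataLayer; the Python below is only an illustration of the control flow, and every name in it (`Signature`, `Program`, `Execution`, `mock_adapter`, the in-memory `store` list) is a hypothetical stand-in rather than the planned API. Recording failed validations as Execution records is also an assumption; the roadmap only says a record is created with the result.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Signature:
    inputs: Dict[str, type]   # field name -> expected Python type
    outputs: Dict[str, type]

    def validate_inputs(self, values: Dict[str, Any]) -> List[str]:
        """Return a list of error strings; an empty list means valid."""
        errors = []
        for name, expected in self.inputs.items():
            if name not in values:
                errors.append(f"{name}: missing")
            elif not isinstance(values[name], expected):
                errors.append(f"{name}: expected {expected.__name__}")
        return errors


@dataclass
class Program:
    name: str
    signature: Signature


@dataclass
class Execution:
    program: str
    inputs: Dict[str, Any]
    status: str
    outputs: Dict[str, Any] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


def mock_adapter(program: Program, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the Mock adapter: fabricates one value per output field."""
    return {name: f"mock {name}" for name in program.signature.outputs}


def execute(
    program: Program,
    inputs: Dict[str, Any],
    adapter: Callable[[Program, Dict[str, Any]], Dict[str, Any]],
    store: List[Execution],
) -> Execution:
    signature = program.signature                # 1. load the signature
    errors = signature.validate_inputs(inputs)   # 2. validate the inputs
    if errors:
        # Assumption: failed validations are recorded too.
        execution = Execution(program.name, inputs, "failed", errors=errors)
    else:
        outputs = adapter(program, inputs)       # 3. call the adapter
        execution = Execution(program.name, inputs, "completed", outputs=outputs)
    store.append(execution)                      # 4. persist the record
    return execution
```

Passing the adapter as an argument mirrors the adapter pattern from Week 4: the same flow can run against a mock in fast tests and against the Python bridge in integration tests.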
This completes the detailed plan for Phase 1. The outcome will be a highly valuable and functional system that delivers on the core promise of a better developer experience for dspy in Elixir, all backed by a production-ready Ash framework. This sets a solid, stable foundation for the more ambitious native implementations in subsequent phases.
Shall we proceed to detail Phase 2?