Given the current codebase and the fact that Elixact has been significantly upgraded to meet the needs of a modern Pydantic-like library, this is the official technical integration plan. This document supersedes any previous integration plans and outlines a clear path to fully leverage Elixact within the DSPEx framework.
Elixact Integration: A Detailed Implementation Plan for DSPEx
1. Executive Summary
This document outlines the technical strategy and implementation plan for integrating the enhanced Elixact library into the DSPEx framework. The primary goal is to replace the existing string-based signature parsing and manual validation with Elixact’s robust, declarative schema system. This will transform DSPEx into a type-safe, self-documenting, and more powerful framework, mirroring the relationship between Python’s DSPy and Pydantic.
The integration is designed to be incremental, ensuring backward compatibility while progressively introducing the benefits of Elixact across the entire codebase, from core signatures to configuration and teleprompters.
2. Analysis of Elixact Integration Points
The DSPEx codebase contains several key areas where Elixact will provide immediate and significant value.
2.1. Core Signature System (DSPEx.Signature)
- Current State: The declaration use DSPEx.Signature, "question -> answer" parses a string at compile time. It’s functional but limited in expressiveness, metadata, and type safety. The EnhancedParser adds types and constraints but is a bespoke solution.
- Integration Point: The use DSPEx.Signature macro will be refactored to use Elixact as its foundation. This will allow for rich, declarative schema definitions.
- Benefit: Moves from a fragile string-based contract to a robust, type-safe, and self-documenting schema that supports complex types, constraints, and metadata (descriptions, examples).
2.2. Structured Prediction (DSPEx.PredictStructured & Adapters)
- Current State: The DSPEx.Adapters.InstructorLiteGemini module manually constructs a JSON schema for the LLM. This is error-prone, hard to maintain, and does not support complex types or constraints defined in the signature.
- Integration Point: The adapter will be modified to automatically generate a JSON schema from an Elixact-based signature module using Elixact.JsonSchema.from_schema/1.
- Benefit: Enables fully-featured structured outputs. Any type or constraint defined in the Elixact schema (nested objects, arrays, string patterns, numeric ranges) will be automatically reflected in the JSON schema passed to the LLM, dramatically increasing reliability and capability.
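To make the schema-generation idea concrete, here is a minimal, self-contained sketch of how typed fields and constraints could map onto JSON Schema keywords. It is not Elixact's implementation: the module name JsonSchemaSketch, the {name, type, constraints} tuple format, and the constraint names are all assumptions made for illustration. In the real integration this work would be done by Elixact.JsonSchema.from_schema/1.

```elixir
defmodule JsonSchemaSketch do
  @moduledoc """
  Illustration only: turns a hypothetical list of field specs into a
  JSON-Schema-shaped map. This is not Elixact's implementation.
  """

  # Each field is {name, type, constraints}; this tuple format is assumed.
  def from_fields(fields) do
    properties =
      Map.new(fields, fn {name, type, constraints} ->
        {Atom.to_string(name), Map.merge(type_schema(type), constraint_schema(constraints))}
      end)

    %{
      "type" => "object",
      "properties" => properties,
      # Every declared field is treated as required in this sketch.
      "required" => Enum.map(fields, fn {name, _type, _constraints} -> Atom.to_string(name) end)
    }
  end

  defp type_schema(:string), do: %{"type" => "string"}
  defp type_schema(:integer), do: %{"type" => "integer"}
  defp type_schema(:float), do: %{"type" => "number"}
  defp type_schema({:array, inner}), do: %{"type" => "array", "items" => type_schema(inner)}

  # Translate a keyword list of (assumed) constraint names into JSON Schema keywords.
  defp constraint_schema(constraints) do
    Enum.reduce(constraints, %{}, fn
      {:min_length, n}, acc -> Map.put(acc, "minLength", n)
      {:max_length, n}, acc -> Map.put(acc, "maxLength", n)
      {:gteq, n}, acc -> Map.put(acc, "minimum", n)
      {:lteq, n}, acc -> Map.put(acc, "maximum", n)
      {:pattern, p}, acc -> Map.put(acc, "pattern", p)
    end)
  end
end
```

Calling JsonSchemaSketch.from_fields([{:answer, :string, [min_length: 1]}]) yields an object schema whose "answer" property carries both the type and the minLength keyword, which is the kind of map an adapter would hand to the LLM provider.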
2.3. Program Input/Output Validation (DSPEx.Predict)
- Current State: Input validation is a basic check for key presence. Output validation is minimal, relying on the adapter’s parsing.
- Integration Point: DSPEx.Predict.forward/3 will use Elixact.Validator to validate incoming inputs and parsed outputs against the Elixact schema.
- Benefit: Guarantees that data flowing through a program adheres to the signature’s contract at every step, catching errors early and ensuring data integrity. Enables type coercion for LLM outputs (e.g., converting the string “42” to the integer 42).
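The coercion step mentioned above can be illustrated with a small standalone sketch. The module name CoercionSketch and its error tuples are hypothetical; Elixact's own coercion rules may differ. Only standard-library calls (Integer.parse/1, Float.parse/1) are used.

```elixir
defmodule CoercionSketch do
  # Coerce a raw LLM output value to the declared type, or report an error.
  def coerce(value, :integer) when is_integer(value), do: {:ok, value}

  def coerce(value, :integer) when is_binary(value) do
    # Accept the string only if the whole (trimmed) input is an integer.
    case Integer.parse(String.trim(value)) do
      {int, ""} -> {:ok, int}
      _ -> {:error, {:not_an_integer, value}}
    end
  end

  def coerce(value, :float) when is_float(value), do: {:ok, value}
  def coerce(value, :float) when is_integer(value), do: {:ok, value * 1.0}

  def coerce(value, :float) when is_binary(value) do
    case Float.parse(String.trim(value)) do
      {float, ""} -> {:ok, float}
      _ -> {:error, {:not_a_float, value}}
    end
  end

  def coerce(value, :string) when is_binary(value), do: {:ok, value}

  # Anything else is rejected rather than guessed at.
  def coerce(value, type), do: {:error, {:cannot_coerce, value, type}}
end
```

With this sketch, CoercionSketch.coerce("42", :integer) returns {:ok, 42}, while a value with trailing garbage such as "42abc" is rejected instead of being silently truncated.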
2.4. Data Representation (DSPEx.Example)
- Current State: The DSPEx.Example struct uses an explicit input_keys set to distinguish inputs from outputs.
- Integration Point: DSPEx.Example can be enhanced to be schema-aware. The distinction between inputs and outputs can be derived directly from the associated signature schema.
- Benefit: Simplifies the Example struct and makes data handling more robust. All examples used in teleprompters can be validated against the signature, preventing “garbage-in, garbage-out” optimization cycles.
2.5. Configuration System (DSPEx.Config)
- Current State: DSPEx has a sophisticated, independent configuration system that already uses Elixact schemas for validation (dspex/config/elixact_schemas.ex).
- Integration Point: This system serves as a model for the rest of the framework. The integration will unify the validation approach, using Elixact for both application configuration and program data contracts.
- Benefit: Creates a consistent, declarative validation strategy across the entire library.
3. The Elixact Integration Contract & Bridge
A new bridge module will formalize the interaction between DSPEx and Elixact.
File: dspex/signature/elixact.ex (This file will be enhanced)
DSPEx.Signature.Elixact Responsibilities:
- to_runtime_schema(signature_module, opts \\ []):
  - Input: A compiled DSPEx.Signature module.
  - Process: Reads the __enhanced_fields__ parsed by EnhancedParser, maps them to the format required by Elixact.Runtime.create_schema/2, and returns a dynamic, validatable Elixact schema. The result will be cached to prevent re-computation.
  - Output: {:ok, Elixact.Runtime.DynamicSchema.t()} | {:error, term()}.
- validate(signature_module, data, opts \\ []):
  - Input: A signature module, a data map, and options like :field_type (:inputs or :outputs) and :config (an Elixact.Config struct).
  - Process: Gets the runtime schema via to_runtime_schema/2. If :field_type is specified, it creates a partial schema on-the-fly. It then invokes Elixact.EnhancedValidator.validate/3 to perform the validation.
  - Output: {:ok, validated_data} | {:error, [Elixact.Error.t()]}.
- to_json_schema(signature_module, opts \\ []):
  - Input: A signature module and options (e.g., field_type: :outputs, provider: :openai).
  - Process: Gets the runtime schema, generates the base JSON schema, and then uses Elixact.JsonSchema.Resolver to optimize it for a specific LLM provider if requested.
  - Output: {:ok, json_schema_map} | {:error, term()}.
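The contract says the runtime schema will be cached but does not say how. One possible approach, sketched below under that caveat, is to memoize per signature module in :persistent_term, which is cheap to read and suits values that are written once. The module name SchemaCacheSketch and the build_fun argument (a stand-in for the field-mapping and Elixact.Runtime.create_schema/2 step) are hypothetical.

```elixir
defmodule SchemaCacheSketch do
  # Return the cached schema for `signature_module`, building it on first use.
  # `build_fun` is a hypothetical stand-in for the real schema construction
  # and must return {:ok, schema} | {:error, reason}.
  def fetch(signature_module, build_fun) when is_function(build_fun, 1) do
    key = {__MODULE__, signature_module}

    case :persistent_term.get(key, :missing) do
      :missing ->
        # Only successful builds are cached; errors pass through unchanged.
        with {:ok, schema} <- build_fun.(signature_module) do
          :persistent_term.put(key, schema)
          {:ok, schema}
        end

      schema ->
        {:ok, schema}
    end
  end
end
```

An ETS table or a module-level function generated at compile time would work as well; :persistent_term is chosen here only because signatures are compiled once and read many times.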
4. Detailed Implementation Plan
This plan is phased to ensure a smooth transition with full backward compatibility.
Phase 1: Foundation & Core Signature Refactoring (1-2 Weeks)
- Enhance DSPEx.Signature.Elixact Bridge:
  - Task: Implement the to_runtime_schema/2, validate/3, and to_json_schema/2 functions as defined in the contract above.
  - File: dspex/signature/elixact.ex.
- Refactor use DSPEx.Signature Macro:
  - Task: Modify the __using__/1 macro in dspex/signature.ex.
  - Logic: When use DSPEx.Signature is called, it will now:
    - Parse the signature string using EnhancedParser.
    - Store the parsed fields in a module attribute (@enhanced_fields).
    - Generate input_fields/0, output_fields/0, and instructions/0 functions for backward compatibility.
    - Generate validate/1 and json_schema/0 functions that delegate to the DSPEx.Signature.Elixact bridge.
  - File: dspex/signature.ex.
- Create Custom DSPEx Types:
  - Task: Create a new dspex/types.ex module to define common, reusable Elixact.Type definitions for DSPEx.
  - Initial Types:
    - DSPEx.Types.ReasoningChain: A string with min/max length and format constraints.
    - DSPEx.Types.ConfidenceScore: A float between 0.0 and 1.0.
  - File: dspex/types.ex (New).
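As a rough illustration of what the ConfidenceScore type needs to enforce, the sketch below checks the 0.0 to 1.0 range in plain Elixir. It deliberately does not implement the Elixact.Type behaviour, whose callbacks are not described in this plan; the module name ConfidenceScoreSketch and the choice to promote integers to floats are assumptions.

```elixir
defmodule ConfidenceScoreSketch do
  # Integers such as 0 or 1 are promoted to floats before the range check
  # (an assumed convenience, not a documented Elixact rule).
  def validate(value) when is_integer(value), do: validate(value * 1.0)

  # Accept floats within the closed interval 0.0..1.0.
  def validate(value) when is_float(value) and value >= 0.0 and value <= 1.0,
    do: {:ok, value}

  # Everything else, including numeric strings, is rejected.
  def validate(value), do: {:error, {:invalid_confidence_score, value}}
end
```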
Phase 2: Structured Prediction and Client Integration (1 Week)
- Refactor InstructorLiteGemini Adapter:
  - Task: Replace the manual build_json_schema/1 function with a call to DSPEx.Signature.Elixact.to_json_schema(signature, field_type: :outputs, provider: :gemini).
  - Task: In parse_response/2, use DSPEx.Signature.Elixact.validate/3 to validate the structured data returned from InstructorLite. This will provide coercion and constraint checking on the LLM’s output.
  - Files: dspex/adapters/instructor_lite_gemini.ex, dspex/predict_structured.ex.
- Integrate Validation into DSPEx.Predict:
  - Task: In DSPEx.Predict.forward/3, add calls to program.signature.validate(data) for both inputs (before the LLM call) and outputs (after the LLM call).
  - Configuration: Add an option to DSPEx.Predict.new/3, such as validate: :strict | :lenient | :none, to control this behavior. Default to :lenient.
  - File: dspex/predict.ex.
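The plan names the three validation modes but does not define their semantics. The sketch below shows one plausible reading: :none skips validation, :strict returns validation errors to the caller, and :lenient logs the errors and continues with the original data. That interpretation, the module name ValidationModeSketch, and the validator argument (any one-argument function returning {:ok, data} or {:error, errors}) are assumptions.

```elixir
defmodule ValidationModeSketch do
  require Logger

  # :none -> do not validate at all.
  def run(data, _validator, :none), do: {:ok, data}

  # :strict -> surface the validator's result unchanged, errors included.
  def run(data, validator, :strict), do: validator.(data)

  # :lenient -> assumed behaviour: log failures and carry on with the input.
  def run(data, validator, :lenient) do
    case validator.(data) do
      {:ok, validated} ->
        {:ok, validated}

      {:error, errors} ->
        Logger.warning("validation failed: #{inspect(errors)}")
        {:ok, data}
    end
  end
end
```

Inside forward/3 the same helper could wrap both the input check before the LLM call and the output check after it, so the mode is applied consistently at both points.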
Phase 3: Teleprompter & Data Integrity (1 Week)
- Make DSPEx.Example Schema-Aware:
  - Task: Add an optional :signature field to the DSPEx.Example struct. When present, inputs/1 and outputs/1 will derive their keys from the schema instead of input_keys.
  - Task: Add DSPEx.Example.validate/1, which validates the example’s data against its associated signature schema.
  - File: dspex/example.ex.
- Enhance Teleprompters with Validation:
  - Task: In BootstrapFewShot.compile/5, after generating demonstration candidates, use DSPEx.Example.validate/1 to ensure they conform to the teacher’s signature before evaluation.
  - Task: In SIMBA.apply_strategies_to_buckets/7, when AppendDemo creates a new Example, validate it against the source program’s signature.
  - Files: dspex/teleprompter/bootstrap_fewshot.ex, dspex/teleprompter/simba/strategy/append_demo.ex.
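The schema-aware Example described in this phase could look roughly like the sketch below. ExampleSketch and QASignatureSketch are hypothetical stand-ins, not the real DSPEx.Example or a real signature; the only thing taken from the plan is that a signature module exposes input_fields/0 and output_fields/0. The validate/1 shown here checks field presence only, whereas the planned version would delegate to the Elixact bridge.

```elixir
defmodule QASignatureSketch do
  # Stand-in for a compiled signature module.
  def input_fields, do: [:question]
  def output_fields, do: [:answer]
end

defmodule ExampleSketch do
  defstruct data: %{}, signature: nil, input_keys: MapSet.new()

  # Without a signature, fall back to the explicit input_keys set.
  def inputs(%__MODULE__{signature: nil, data: data, input_keys: keys}),
    do: Map.take(data, MapSet.to_list(keys))

  # With a signature, derive the input keys from it.
  def inputs(%__MODULE__{signature: sig, data: data}),
    do: Map.take(data, sig.input_fields())

  def outputs(%__MODULE__{signature: nil, data: data, input_keys: keys}),
    do: Map.drop(data, MapSet.to_list(keys))

  def outputs(%__MODULE__{signature: sig, data: data}),
    do: Map.take(data, sig.output_fields())

  # Presence-only check; real validation would also cover types and constraints.
  def validate(%__MODULE__{signature: sig, data: data}) when not is_nil(sig) do
    missing = (sig.input_fields() ++ sig.output_fields()) -- Map.keys(data)
    if missing == [], do: :ok, else: {:error, {:missing_fields, missing}}
  end
end
```

A teleprompter could then filter candidate demonstrations with Enum.filter(candidates, &(ExampleSketch.validate(&1) == :ok)) before evaluation, which is the "validate before use" step this phase describes.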
Phase 4: Documentation and Final Polish (1 Week)
- Update All Documentation:
  - Task: Update module and function documentation across the codebase to reflect the new Elixact-based signature definition and validation capabilities.
  - Task: Create a MIGRATING.md guide explaining how to upgrade from string-based signatures to Elixact schemas.
- Add New Examples:
  - Task: Create new examples in the dspex/teleprompter/beacon/examples.ex file (and others) that showcase defining and using complex, nested signatures with constraints.
- Final Review:
  - Task: Perform a full code review to ensure consistency, remove old/dead code related to manual validation, and confirm all tests are passing.
5. Risk Assessment and Mitigation
- Risk: Performance overhead from increased validation.
  - Mitigation: Elixact is designed to be performant. Validation can be made optional via configuration for performance-critical paths. We will benchmark key workflows.
- Risk: Complexity for existing users.
  - Mitigation: The phased rollout with a backward compatibility layer ensures existing code continues to work. The SignatureCompat module will handle old signatures automatically. Clear documentation and migration tools will ease the transition.
- Risk: Elixact library bugs.
  - Mitigation: The library is being developed in tandem and has a comprehensive test suite. Any issues discovered during integration will be fixed in elixact immediately.
6. Conclusion
This integration plan provides a clear, phased, and low-risk path to fundamentally upgrading the DSPEx framework. By replacing bespoke solutions with the robust, feature-rich Elixact library, DSPEx will become more powerful, reliable, and easier to maintain. This strategic move aligns DSPEx with the best practices of modern data-driven applications and solidifies its position as a premier framework for building self-improving AI systems in Elixir.