Of course. This is an excellent architectural question. Integrating elixact
into DSPEx
is not just a good idea—it’s a transformative one that would align DSPEx
more closely with the robust, declarative design principles of the original Python DSPy
library.
By analyzing how DSPy
leverages pydantic
, we can create a clear and actionable integration plan for DSPEx
using elixact
. This plan will focus on the modules you have already built out, such as Signature
, Predict
, and the SIMBA
teleprompter.
Executive Summary: Why Elixact is a Game-Changer for DSPEx
The core of DSPy’s power comes from using pydantic
to create structured, self-describing, and automatically validated data models for its signatures. Your current implementation cleverly parses a string to achieve this, but elixact
offers a more idiomatic, powerful, and maintainable Elixir-native solution.
Key Benefits of Integration:
- Declarative Power: Move from string parsing (
"question -> answer"
) to a declarativeschema do ... end
block, making signatures more readable, explicit, and powerful. - Type Safety & Validation: Gain automatic, compile-time and runtime validation of inputs, outputs, and examples against the defined signature schema.
- Automatic JSON Schema: Eliminate manual JSON schema generation in adapters.
elixact
can auto-generate JSON schemas from your signatures, which is crucial for structured outputs with models like Gemini (viaPredictStructured
). - Robustness: Reduce runtime errors by catching data shape mismatches early, making teleprompters like
SIMBA
more reliable as they manipulate and generate examples.
High-Level Feature Mapping: Pydantic in DSPy vs. Elixact in DSPEx
Pydantic Feature (in DSPy) | Elixact Equivalent | Integration Point in DSPEx |
---|---|---|
dspy.Signature as pydantic.BaseModel | Module with use Elixact and schema block | dspex/signature.ex : The core of the refactoring. |
InputField , OutputField as pydantic.Field | field :name, :type, ... macro | dspex/signature.ex : Field metadata like description is built-in. |
JSON Schema Generation (.model_json_schema() ) | Elixact.JsonSchema.from_schema/1 | dspex/predict_structured.ex & dspex/adapters/instructor_lite_gemini.ex . |
Data Validation (.model_validate() ) | MySchema.validate/1 and validate!/1 | dspex/predict.ex , dspex/example.ex , and teleprompters. |
Structured Errors (ValidationError ) | Elixact.ValidationError struct | Error handling across all modules that perform validation. |
Detailed Integration Plan (File-by-File)
Here is a concrete plan for integrating elixact
into your existing DSPEx
codebase.
1. File: dspex/signature.ex
This is the most critical and impactful change. We will refactor the use DSPEx.Signature
macro to be based on elixact
.
Current State: A signature is defined by parsing a string at compile time.
defmodule QASignature do
use DSPEx.Signature, "question -> answer"
end
DSPy Parallel (dspy/signatures/signature.py
):
Signatures inherit from pydantic.BaseModel
. Fields are defined using type hints and InputField
/OutputField
which are wrappers around pydantic.Field
.
class QASignature(dspy.Signature):
"""Answer questions with short factoid answers."""
question = dspy.InputField()
answer = dspy.OutputField()
Proposed Integration with elixact
:
Refactor DSPEx.Signature
to be a behavior and encourage users to use Elixact
directly. DSPEx will then work with any module that implements this pattern. The concept of InputField
and OutputField
can be handled by elixact
’s metadata or a custom macro. For simplicity, we can start by convention (e.g., last field is output). A more robust solution is a custom macro.
Let’s define a new DSPEx.Schema
that wraps Elixact
.
New dspex/schema.ex
(Wrapper around Elixact):
defmodule DSPEx.Schema do
defmacro __using__(_) do
quote do
use Elixact
import DSPEx.Schema
end
end
defmacro input_field(name, type, opts \\ []) do
quote do
# Add a custom metadata key to the field
opts = Keyword.put(opts, :__dspy_field_type, :input)
Elixact.Schema.field(unquote(name), unquote(type), unquote(opts))
end
end
defmacro output_field(name, type, opts \\ []) do
quote do
opts = Keyword.put(opts, :__dspy_field_type, :output)
Elixact.Schema.field(unquote(name), unquote(type), unquote(opts))
end
end
end
Updated dspex/signature.ex
:
The DSPEx.Signature
module can now be refactored to work with these new elixact
-based schemas, providing helper functions to extract input/output fields based on the __dspy_field_type
metadata.
New Usage (defmodule MySignature
):
# before
defmodule QASignature do
use DSPEx.Signature, "question -> answer"
end
# after
defmodule QASignature do
use DSPEx.Schema # Our new wrapper
@moduledoc "Answer questions with short factoid answers."
schema @moduledoc do
input_field :question, :string, description: "The question to be answered."
output_field :answer, :string, description: "A short, factoid answer."
end
end
This makes the signature definition far more robust, descriptive, and powerful.
2. Files: dspex/predict_structured.ex
& dspex/adapters/instructor_lite_gemini.ex
This is where elixact
’s JSON Schema generation shines.
Current State:
The instructor_lite_gemini
adapter manually constructs a JSON schema from the signature’s field list. This is brittle and limited.
# in dspex/adapters/instructor_lite_gemini.ex
defp build_json_schema(signature) do
output_fields = signature.output_fields()
properties =
output_fields
|> Enum.map(fn field -> {to_string(field), build_field_schema(field)} end)
|> Enum.into(%{})
# ... manual schema construction ...
end
DSPy Parallel (dspy/adapters/json_adapter.py
):
The adapter dynamically creates a pydantic
model from the signature’s output fields and then calls .model_json_schema()
to get a fully-featured JSON schema.
Proposed Integration with elixact
:
Modify the adapter to accept an elixact
-based signature module and use Elixact.JsonSchema.from_schema/1
to generate the schema automatically.
Refactored dspex/adapters/instructor_lite_gemini.ex
:
defmodule DSPEx.Adapters.InstructorLiteGemini do
# ...
def format_messages(signature, demos, inputs) do
# ...
# The signature module IS the response model.
response_model = signature
# No more manual building!
{:ok, json_schema} = Elixact.JsonSchema.from_schema(signature)
# ...
instructor_opts = [
response_model: response_model, # Pass the module itself
json_schema: json_schema,
# ...
]
{:ok, {params, instructor_opts}}
end
def parse_response(signature, instructor_result) do
case instructor_result do
# InstructorLite can now return validated data directly
{:ok, validated_data} when is_map(validated_data) ->
{:ok, validated_data}
{:error, %Elixact.ValidationError{errors: errors}} ->
{:error, {:validation_failed, errors}}
# ... other error handling
end
end
# ... build_json_schema and build_response_model are no longer needed
end
This change makes the adapter simpler, more powerful (it supports nested schemas, complex types, and constraints for free), and less prone to errors.
3. File: dspex/predict.ex
The Predict
module will now interact with the new elixact
-based signatures.
Current State:
It calls signature.input_fields()
and signature.output_fields()
and performs manual validation.
Proposed Integration with elixact
:
Predict
will now use MySignature.validate/1
for input validation. It will also extract input/output fields by inspecting the schema’s metadata.
Refactored dspex/predict.ex
:
defmodule DSPEx.Predict do
# ... (struct definition remains similar)
@impl DSPEx.Program
def forward(program, inputs, opts) do
# ...
# Step 1: Input Validation using Elixact
input_fields = # logic to get input fields from signature schema
input_data = Map.take(inputs, input_fields)
case program.signature.validate(input_data) do
{:ok, _validated_inputs} ->
# Proceed with formatting messages and making the request
# ...
{:error, errors} ->
{:error, {:invalid_inputs, errors}}
end
end
end
The helper functions in DSPEx.Signature
would be updated to read the field metadata from the elixact
schema instead of the old module attributes.
4. File: dspex/example.ex
The Example
struct can be enhanced to be schema-aware.
Current State:
An Example
struct holds a data
map and a input_keys
set to distinguish inputs from outputs.
Proposed Integration with elixact
:
An Example
can be validated against a specific signature schema. The input_keys
set becomes redundant because the signature itself defines which keys are inputs and outputs.
Refactored dspex/example.ex
:
defmodule DSPEx.Example do
# ... struct definition might remain the same for flexibility
@doc "Validates an example against a given signature schema."
@spec validate(t(), module()) :: :ok | {:error, [Elixact.Error.t()]}
def validate(%__MODULE__{data: data}, signature_schema) do
signature_schema.validate(data)
end
@doc "Returns inputs based on the signature schema, not a stored key set."
@spec inputs(t(), module()) :: map()
def inputs(%__MODULE__{data: data}, signature_schema) do
input_field_names =
signature_schema.__schema__(:fields)
|> Enum.filter(fn {_name, meta} -> meta.__dspy_field_type == :input end)
|> Enum.map(fn {name, _meta} -> name end)
Map.take(data, input_field_names)
end
# outputs/1 can be implemented similarly.
end
This change makes the data handling in teleprompters much more robust, as every generated example can be instantly validated against the program’s signature.
Implementation Roadmap
This integration can be done in phases to minimize disruption.
Phase 1: Core Signature Refactoring (Highest Priority)
- Create
DSPEx.Schema
: Implement the wrapper aroundelixact
withinput_field
/output_field
macros. - Refactor
DSPEx.Signature
: Change it to be a simple behaviour and provide helper functions that inspectelixact
schemas. - Update a few test signatures (e.g.,
QASignature
) to use the newuse DSPEx.Schema
pattern. - Update
DSPEx.Predict
to work with the new signature modules.
Phase 2: Leverage Structured Output
- Refactor
DSPEx.Adapters.InstructorLiteGemini
to useElixact.JsonSchema.from_schema/1
. - Update
DSPEx.PredictStructured
to pass the signature module directly to the adapter. - Add comprehensive tests for structured prediction with complex, nested schemas to verify the power of the new approach.
Phase 3: Enhance Data Handling
- Refactor
DSPEx.Example
to be schema-aware, removing the need for theinput_keys
set. - Update Teleprompters (
SIMBA
,BootstrapFewShot
): Modify the parts of the code that create or manipulateDSPEx.Example
instances to use the new schema-awareinputs/2
andoutputs/2
functions. Add validation steps where new examples are created.
Phase 4: Configuration and Polish
- Use
elixact
schemas to define and validate configuration for teleprompters and other modules. - Update all documentation and examples to reflect the new, more powerful signature definition pattern.
By following this plan, DSPEx
will not only be a faithful port of DSPy
’s functionality but will also be an exemplar of robust, modern Elixir design.