← Back to 20250617

150 elixact integration overview

Documentation for 150_elixact_integration_overview from the Ds ex repository.

Of course. This is an excellent architectural question. Integrating elixact into DSPEx is not just a good idea—it’s a transformative one that would align DSPEx more closely with the robust, declarative design principles of the original Python DSPy library.

By analyzing how DSPy leverages pydantic, we can create a clear and actionable integration plan for DSPEx using elixact. This plan will focus on the modules you have already built out, such as Signature, Predict, and the SIMBA teleprompter.

Executive Summary: Why Elixact is a Game-Changer for DSPEx

The core of DSPy’s power comes from using pydantic to create structured, self-describing, and automatically validated data models for its signatures. Your current implementation cleverly parses a string to achieve this, but elixact offers a more idiomatic, powerful, and maintainable Elixir-native solution.

Key Benefits of Integration:

  1. Declarative Power: Move from string parsing ("question -> answer") to a declarative schema do ... end block, making signatures more readable, explicit, and powerful.
  2. Type Safety & Validation: Gain automatic, compile-time and runtime validation of inputs, outputs, and examples against the defined signature schema.
  3. Automatic JSON Schema: Eliminate manual JSON schema generation in adapters. elixact can auto-generate JSON schemas from your signatures, which is crucial for structured outputs with models like Gemini (via PredictStructured).
  4. Robustness: Reduce runtime errors by catching data shape mismatches early, making teleprompters like SIMBA more reliable as they manipulate and generate examples.

High-Level Feature Mapping: Pydantic in DSPy vs. Elixact in DSPEx

Pydantic Feature (in DSPy)Elixact EquivalentIntegration Point in DSPEx
dspy.Signature as pydantic.BaseModelModule with use Elixact and schema blockdspex/signature.ex: The core of the refactoring.
InputField, OutputField as pydantic.Fieldfield :name, :type, ... macrodspex/signature.ex: Field metadata like description is built-in.
JSON Schema Generation (.model_json_schema())Elixact.JsonSchema.from_schema/1dspex/predict_structured.ex & dspex/adapters/instructor_lite_gemini.ex.
Data Validation (.model_validate())MySchema.validate/1 and validate!/1dspex/predict.ex, dspex/example.ex, and teleprompters.
Structured Errors (ValidationError)Elixact.ValidationError structError handling across all modules that perform validation.

Detailed Integration Plan (File-by-File)

Here is a concrete plan for integrating elixact into your existing DSPEx codebase.

1. File: dspex/signature.ex

This is the most critical and impactful change. We will refactor the use DSPEx.Signature macro to be based on elixact.

Current State: A signature is defined by parsing a string at compile time.

defmodule QASignature do
  use DSPEx.Signature, "question -> answer"
end

DSPy Parallel (dspy/signatures/signature.py): Signatures inherit from pydantic.BaseModel. Fields are defined using type hints and InputField/OutputField which are wrappers around pydantic.Field.

class QASignature(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField()

Proposed Integration with elixact: Refactor DSPEx.Signature to be a behavior and encourage users to use Elixact directly. DSPEx will then work with any module that implements this pattern. The concept of InputField and OutputField can be handled by elixact’s metadata or a custom macro. For simplicity, we can start by convention (e.g., last field is output). A more robust solution is a custom macro.

Let’s define a new DSPEx.Schema that wraps Elixact.

New dspex/schema.ex (Wrapper around Elixact):

defmodule DSPEx.Schema do
  defmacro __using__(_) do
    quote do
      use Elixact
      import DSPEx.Schema
    end
  end

  defmacro input_field(name, type, opts \\ []) do
    quote do
      # Add a custom metadata key to the field
      opts = Keyword.put(opts, :__dspy_field_type, :input)
      Elixact.Schema.field(unquote(name), unquote(type), unquote(opts))
    end
  end

  defmacro output_field(name, type, opts \\ []) do
    quote do
      opts = Keyword.put(opts, :__dspy_field_type, :output)
      Elixact.Schema.field(unquote(name), unquote(type), unquote(opts))
    end
  end
end

Updated dspex/signature.ex: The DSPEx.Signature module can now be refactored to work with these new elixact-based schemas, providing helper functions to extract input/output fields based on the __dspy_field_type metadata.

New Usage (defmodule MySignature):

# before
defmodule QASignature do
  use DSPEx.Signature, "question -> answer"
end

# after
defmodule QASignature do
  use DSPEx.Schema # Our new wrapper

  @moduledoc "Answer questions with short factoid answers."
  schema @moduledoc do
    input_field :question, :string, description: "The question to be answered."
    output_field :answer, :string, description: "A short, factoid answer."
  end
end

This makes the signature definition far more robust, descriptive, and powerful.

2. Files: dspex/predict_structured.ex & dspex/adapters/instructor_lite_gemini.ex

This is where elixact’s JSON Schema generation shines.

Current State: The instructor_lite_gemini adapter manually constructs a JSON schema from the signature’s field list. This is brittle and limited.

# in dspex/adapters/instructor_lite_gemini.ex
defp build_json_schema(signature) do
  output_fields = signature.output_fields()
  properties =
    output_fields
    |> Enum.map(fn field -> {to_string(field), build_field_schema(field)} end)
    |> Enum.into(%{})
  # ... manual schema construction ...
end

DSPy Parallel (dspy/adapters/json_adapter.py): The adapter dynamically creates a pydantic model from the signature’s output fields and then calls .model_json_schema() to get a fully-featured JSON schema.

Proposed Integration with elixact: Modify the adapter to accept an elixact-based signature module and use Elixact.JsonSchema.from_schema/1 to generate the schema automatically.

Refactored dspex/adapters/instructor_lite_gemini.ex:

defmodule DSPEx.Adapters.InstructorLiteGemini do
  # ...

  def format_messages(signature, demos, inputs) do
    # ...
    # The signature module IS the response model.
    response_model = signature

    # No more manual building!
    {:ok, json_schema} = Elixact.JsonSchema.from_schema(signature)
    # ...

    instructor_opts = [
      response_model: response_model, # Pass the module itself
      json_schema: json_schema,
      # ...
    ]

    {:ok, {params, instructor_opts}}
  end

  def parse_response(signature, instructor_result) do
    case instructor_result do
      # InstructorLite can now return validated data directly
      {:ok, validated_data} when is_map(validated_data) ->
        {:ok, validated_data}

      {:error, %Elixact.ValidationError{errors: errors}} ->
        {:error, {:validation_failed, errors}}

      # ... other error handling
    end
  end

  # ... build_json_schema and build_response_model are no longer needed
end

This change makes the adapter simpler, more powerful (it supports nested schemas, complex types, and constraints for free), and less prone to errors.

3. File: dspex/predict.ex

The Predict module will now interact with the new elixact-based signatures.

Current State: It calls signature.input_fields() and signature.output_fields() and performs manual validation.

Proposed Integration with elixact: Predict will now use MySignature.validate/1 for input validation. It will also extract input/output fields by inspecting the schema’s metadata.

Refactored dspex/predict.ex:

defmodule DSPEx.Predict do
  # ... (struct definition remains similar)

  @impl DSPEx.Program
  def forward(program, inputs, opts) do
    # ...
    # Step 1: Input Validation using Elixact
    input_fields = # logic to get input fields from signature schema
    input_data = Map.take(inputs, input_fields)

    case program.signature.validate(input_data) do
      {:ok, _validated_inputs} ->
        # Proceed with formatting messages and making the request
        # ...

      {:error, errors} ->
        {:error, {:invalid_inputs, errors}}
    end
  end
end

The helper functions in DSPEx.Signature would be updated to read the field metadata from the elixact schema instead of the old module attributes.

4. File: dspex/example.ex

The Example struct can be enhanced to be schema-aware.

Current State: An Example struct holds a data map and a input_keys set to distinguish inputs from outputs.

Proposed Integration with elixact: An Example can be validated against a specific signature schema. The input_keys set becomes redundant because the signature itself defines which keys are inputs and outputs.

Refactored dspex/example.ex:

defmodule DSPEx.Example do
  # ... struct definition might remain the same for flexibility

  @doc "Validates an example against a given signature schema."
  @spec validate(t(), module()) :: :ok | {:error, [Elixact.Error.t()]}
  def validate(%__MODULE__{data: data}, signature_schema) do
    signature_schema.validate(data)
  end

  @doc "Returns inputs based on the signature schema, not a stored key set."
  @spec inputs(t(), module()) :: map()
  def inputs(%__MODULE__{data: data}, signature_schema) do
    input_field_names =
      signature_schema.__schema__(:fields)
      |> Enum.filter(fn {_name, meta} -> meta.__dspy_field_type == :input end)
      |> Enum.map(fn {name, _meta} -> name end)

    Map.take(data, input_field_names)
  end

  # outputs/1 can be implemented similarly.
end

This change makes the data handling in teleprompters much more robust, as every generated example can be instantly validated against the program’s signature.


Implementation Roadmap

This integration can be done in phases to minimize disruption.

Phase 1: Core Signature Refactoring (Highest Priority)

  1. Create DSPEx.Schema: Implement the wrapper around elixact with input_field/output_field macros.
  2. Refactor DSPEx.Signature: Change it to be a simple behaviour and provide helper functions that inspect elixact schemas.
  3. Update a few test signatures (e.g., QASignature) to use the new use DSPEx.Schema pattern.
  4. Update DSPEx.Predict to work with the new signature modules.

Phase 2: Leverage Structured Output

  1. Refactor DSPEx.Adapters.InstructorLiteGemini to use Elixact.JsonSchema.from_schema/1.
  2. Update DSPEx.PredictStructured to pass the signature module directly to the adapter.
  3. Add comprehensive tests for structured prediction with complex, nested schemas to verify the power of the new approach.

Phase 3: Enhance Data Handling

  1. Refactor DSPEx.Example to be schema-aware, removing the need for the input_keys set.
  2. Update Teleprompters (SIMBA, BootstrapFewShot): Modify the parts of the code that create or manipulate DSPEx.Example instances to use the new schema-aware inputs/2 and outputs/2 functions. Add validation steps where new examples are created.

Phase 4: Configuration and Polish

  1. Use elixact schemas to define and validate configuration for teleprompters and other modules.
  2. Update all documentation and examples to reflect the new, more powerful signature definition pattern.

By following this plan, DSPEx will not only be a faithful port of DSPy’s functionality but will also be an exemplar of robust, modern Elixir design.