Excellent. We will now generate complementary documentation that builds directly upon the architectural analysis and diagrams. This documentation is designed for a hypothetical developer looking to understand and use DSPEx
.
The documentation will be structured in three parts:
- Getting Started Guide: A hands-on tutorial for building and running a simple
DSPEx
program. - Core Concepts in Detail: An in-depth explanation of the primary abstractions (
Program
,Signature
,Client
). - Leveraging the BEAM: A guide to the advanced features
DSPEx
offers thanks to OTP, such as concurrency and fault tolerance.
DSPEx
Documentation
This documentation provides a guide to using DSPEx
, the Elixir port of DSPy, designed to build robust, concurrent, and self-optimizing LLM systems on the BEAM.
Part 1: Getting Started Guide
This guide will walk you through building your first DSPEx
program. We’ll create a simple question-answering program.
Step 1: Project Setup
First, add DSPEx
and an HTTP client like Tesla
to your mix.exs
dependencies:
# mix.exs
def deps do
[
{:dspex, "~> 0.1.0"},
{:tesla, "~> 1.4"},
{:jason, "~> 1.4"} # For JSON processing
]
end
Run mix deps.get
to install them.
Step 2: Configuration
Configure your foundation model provider. In your config/config.exs
, add your API key. DSPEx
uses the application environment to manage configuration securely.
# config/config.exs
import Config
config :dspex, :openai,
api_key: System.get_env("OPENAI_API_KEY"),
organization: System.get_env("OPENAI_ORGANIZATION_ID")
Step 3: Define the Signature
A Signature defines the inputs and outputs of your task. We’ll use the defsignature
macro, which turns a simple string into a structured, compile-time module.
# lib/my_app/signatures.ex
defmodule MyApp.Signatures.SimpleQA do
@moduledoc """
Answers a question given some context.
"""
use DSPEx.Signature, "context, question -> answer"
end
Behind the scenes, this macro creates a MyApp.Signatures.SimpleQA
module with a struct %MyApp.Signatures.SimpleQA{}
and sets the docstring as the high-level instructions for the language model. This corresponds to Diagram 4: Metaprogramming Flow.
Step 4: Start the LM Client
DSPEx
clients are long-running, stateful processes (GenServer
s) that manage API connections. You need to add them to your application’s supervision tree. This ensures they are started, managed, and restarted automatically by OTP.
# lib/my_app/application.ex
defmodule MyApp.Application do
use Application
@impl true
def start(_type, _args) do
children = [
# Start the LM client pool under the main supervisor.
# This manages connections and API keys for OpenAI models.
{DSPEx.Client.LM, name: MyApp.OpenAI_Client, provider: :openai}
]
opts = [strategy: :one_for_one, name: MyApp.Supervisor]
Supervisor.start_link(children, opts)
end
end
This step is a practical implementation of the architecture shown in Diagram 1, where the App Supervisor
manages the LM Client
.
Step 5: Create and Run the Program
Now, let’s use a built-in DSPEx.Predict
module to run our task. Predict
is a pre-built module that takes a signature and executes a single call to the LLM.
# lib/my_app.ex
defmodule MyApp do
alias MyApp.Signatures.SimpleQA
def run_qa do
# Create a Predictor program with our signature.
# We tell it which running client process to use.
qa_program = %DSPEx.Predict{
signature: SimpleQA,
client: MyApp.OpenAI_Client
}
# Execute the program's forward pass.
# This will spawn a process tree as shown in Diagram 2.
result = DSPEx.Program.forward(qa_program, %{
question: "What is the primary benefit of the BEAM virtual machine?",
context: "The BEAM, Erlang's virtual machine, is designed for building concurrent, fault-tolerant, and distributed systems. Its key features include lightweight processes, preemptive scheduling, and a 'let it crash' philosophy with supervisors."
})
# The result is a DSPEx.Prediction struct.
IO.puts("Answer: #{result.answer}")
end
end
To run this, start an IEx session with iex -S mix
and call MyApp.run_qa()
.
Part 2: Core Concepts in Detail
This section expands on the key abstractions that form the foundation of DSPEx
.
DSPEx.Program
(The Behaviour)
A Program
is any module that implements the DSPEx.Program
behaviour. This behaviour mandates a single callback: forward/2
, which takes the module and an input map.
This allows for powerful composition. Let’s build a simple RAG program:
defmodule MyApp.RAG do
@behaviour DSPEx.Program
# Assume these are defined elsewhere
alias MyApp.Retrieve
alias MyApp.GenerateAnswer
defstruct [:retriever, :generator]
@impl true
def forward(program, inputs) do
# 1. Retrieve context
retrieved_passages = DSPEx.Program.forward(program.retriever, %{query: inputs.question})
# 2. Generate answer using the retrieved context
final_inputs = Map.merge(inputs, %{context: retrieved_passages.passages})
DSPEx.Program.forward(program.generator, final_inputs)
end
end
Here, MyApp.RAG
composes two other programs (retriever
and generator
), orchestrating the flow of data between them.
DSPEx.Signature
(The Macro)
As seen before, defsignature
is the primary way to define the I/O contract for a Program
. You can add more detail to your fields to provide better instructions to the LLM.
defmodule MyApp.Signatures.DetailedQA do
@moduledoc "You are a helpful AI assistant..."
use DSPEx.Signature,
"""
context, question -> answer
""",
context: [
desc: "Factual passages that may contain the answer.",
type: {:list, :string} # Type hints for clarity and potential validation
],
question: [
desc: "A specific question about the context."
],
answer: [
desc: "A concise answer, typically 1-2 sentences."
]
end
This compile-time transformation ensures that your program’s data structures are well-defined and efficient at runtime, as illustrated in Diagram 4.
DSPEx.Client
(The GenServer)
Clients are the bridge to external services. They are designed as OTP GenServer
s for several reasons:
- State Management: They hold configuration like API keys and base URLs in their process state, isolated from the rest of the application.
- Concurrency Control: They can manage a pool of connections or a request queue to handle rate limiting gracefully.
- Supervision: As part of the application’s supervision tree, they are automatically restarted if they crash, ensuring service availability.
When your program calls DSPEx.Program.forward
, which in turn calls a client, the call is actually a message to this GenServer
process. The GenServer
then spawns a separate, short-lived Task
to handle the blocking I/O (the HTTP request), freeing the client to accept other requests and preventing the I/O from blocking a critical part of your system. This is the pattern shown in Diagram 2.
Part 3: Leveraging the BEAM
This is where DSPEx
truly shines compared to its Python counterpart.
Massively Parallel Evaluation
In DSPy, evaluating a program on a dataset uses a thread pool. In DSPEx
, we use OTP’s lightweight processes, which are orders of magnitude more efficient.
The DSPEx.Evaluate.run/3
function leverages Task.async_stream
to achieve this:
# dev_set is a list of DSPEx.Example structs.
# MyMetric.calculate/2 is your evaluation function.
# my_program is a module implementing the DSPEx.Program behaviour.
scores = DSPEx.Evaluate.run(my_program, dev_set, &MyMetric.calculate/2, max_concurrency: 100)
How it works (under the hood):
DSPEx.Evaluate.run
takes your program and the list of examples.- It uses
Task.async_stream
to map over thedev_set
. - For each example in the
dev_set
,Task.async_stream
spawns a new, completely isolated process. - The BEAM scheduler runs these processes concurrently across all available CPU cores. If an LLM call involves a 1-second wait, the BEAM can manage tens of thousands of these “waiting” processes with minimal overhead.
- If one evaluation process fails (e.g., due to a malformed example or a non-transient API error), it does not affect any other evaluation.
This architecture provides superior throughput and isolation for large-scale evaluations and optimizations, as visualized in the “Evaluation Worker Pool” in Diagram 3.
Resilience and “Let It Crash”
Consider a network hiccup during an LLM API call.
- In traditional approaches: You would wrap the call in a
try/catch
block with complex retry logic, cluttering your core program. - In
DSPEx
: TheTask
process making the HTTP request is allowed to crash. TheProgramProcess
that spawned it (see Diagram 2) is linked and can be configured to trap exits. When it receives the exit signal, its logic (not the LLM client’s logic) can decide to:- Log the error.
- Retry the task immediately.
- Retry the task after an exponential backoff.
- Give up and return an error for that specific request.
This keeps the core business logic clean and delegates failure handling to dedicated supervisors, a cornerstone of OTP design.
Stateful Optimizers (Teleprompter
)
Optimizers like MIPROv2
or GRPO
are long-running, stateful jobs. In DSPEx
, these are robust GenServer
s.
- State: The
GenServer
holds the state of the optimization (e.g., candidate prompts, scores, generation counts). This state is protected from concurrent access. - Work: For each optimization step, the
GenServer
spawns thousands of evaluationTask
s in parallel to score candidate programs (as seen in Diagram 3). - Resilience: If the node running the
DSPEx
application were to restart, the main application supervisor would restart theTeleprompter
GenServer
. With persistence logic added (e.g., periodically saving state to disk), the optimization could even resume from where it left off.