03 gemini

Documentation for 03_gemini from the ds_ex repository.

We will now generate complementary documentation that builds directly on the architectural analysis and diagrams. It is written for a developer who wants to understand and use DSPEx.

The documentation will be structured in three parts:

  1. Getting Started Guide: A hands-on tutorial for building and running a simple DSPEx program.
  2. Core Concepts in Detail: An in-depth explanation of the primary abstractions (Program, Signature, Client).
  3. Leveraging the BEAM: A guide to the advanced features DSPEx offers thanks to OTP, such as concurrency and fault tolerance.

DSPEx Documentation

This documentation provides a guide to using DSPEx, the Elixir port of DSPy, designed to build robust, concurrent, and self-optimizing LLM systems on the BEAM.

Part 1: Getting Started Guide

This guide will walk you through building your first DSPEx program. We’ll create a simple question-answering program.

Step 1: Project Setup

First, add DSPEx and an HTTP client like Tesla to your mix.exs dependencies:

# mix.exs
def deps do
  [
    {:dspex, "~> 0.1.0"},
    {:tesla, "~> 1.4"},
    {:jason, "~> 1.4"} # For JSON processing
  ]
end

Run mix deps.get to install them.

Step 2: Configuration

Configure your foundation model provider. In your config/config.exs, add your API key. DSPEx uses the application environment to manage configuration securely.

# config/config.exs
import Config

config :dspex, :openai,
  api_key: System.get_env("OPENAI_API_KEY"),
  organization: System.get_env("OPENAI_ORGANIZATION_ID")
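
Because these values live in the application environment, any process (including the LM client configured below) can read them at runtime. A quick check from IEx might look like this (a sketch, assuming the :dspex and :openai keys shown above):

# Read the provider configuration back out of the application environment
openai_config = Application.get_env(:dspex, :openai, [])
Keyword.get(openai_config, :api_key)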

Step 3: Define the Signature

A Signature defines the inputs and outputs of your task. We’ll use the DSPEx.Signature macro, invoked with use, which turns a simple string into a structured module at compile time.

# lib/my_app/signatures.ex
defmodule MyApp.Signatures.SimpleQA do
  @moduledoc """
  Answers a question given some context.
  """
  use DSPEx.Signature, "context, question -> answer"
end

Behind the scenes, the macro injects a struct (%MyApp.Signatures.SimpleQA{}) into the MyApp.Signatures.SimpleQA module and registers the @moduledoc as the high-level instructions for the language model. This corresponds to Diagram 4: Metaprogramming Flow.
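
Assuming the macro also generates introspection helpers along these lines (the helper names here are illustrative, not a guaranteed API), you can inspect the result in IEx:

iex> %MyApp.Signatures.SimpleQA{question: "What is OTP?"}
iex> MyApp.Signatures.SimpleQA.input_fields()   # => [:context, :question]
iex> MyApp.Signatures.SimpleQA.output_fields()  # => [:answer]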

Step 4: Start the LM Client

DSPEx clients are long-running, stateful processes (GenServers) that manage API connections. You need to add them to your application’s supervision tree. This ensures they are started, managed, and restarted automatically by OTP.

# lib/my_app/application.ex
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Start the LM client pool under the main supervisor.
      # This manages connections and API keys for OpenAI models.
      {DSPEx.Client.LM, name: MyApp.OpenAI_Client, provider: :openai}
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

This step is a practical implementation of the architecture shown in Diagram 1, where the App Supervisor manages the LM Client.

Step 5: Create and Run the Program

Now, let’s use the built-in DSPEx.Predict module to run our task. Predict takes a signature and executes a single call to the LLM.

# lib/my_app.ex
defmodule MyApp do
  alias MyApp.Signatures.SimpleQA

  def run_qa do
    # Create a Predictor program with our signature.
    # We tell it which running client process to use.
    qa_program = %DSPEx.Predict{
      signature: SimpleQA,
      client: MyApp.OpenAI_Client
    }

    # Execute the program's forward pass.
    # This will spawn a process tree as shown in Diagram 2.
    result = DSPEx.Program.forward(qa_program, %{
      question: "What is the primary benefit of the BEAM virtual machine?",
      context: "The BEAM, Erlang's virtual machine, is designed for building concurrent, fault-tolerant, and distributed systems. Its key features include lightweight processes, preemptive scheduling, and a 'let it crash' philosophy with supervisors."
    })

    # The result is a DSPEx.Prediction struct.
    IO.puts("Answer: #{result.answer}")
  end
end

To run this, start an IEx session with iex -S mix and call MyApp.run_qa().
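
For example (the printed answer will vary with the model's response):

$ iex -S mix
iex> MyApp.run_qa()
Answer: ...   # the model's generated answer is printed here
:ok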


Part 2: Core Concepts in Detail

This section expands on the key abstractions that form the foundation of DSPEx.

DSPEx.Program (The Behaviour)

A Program is any module that implements the DSPEx.Program behaviour. This behaviour mandates a single callback: forward/2, which takes the program struct and a map of inputs.

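A minimal sketch of what that behaviour and its dispatch could look like (the typespec and the dispatch clause are assumptions for illustration, not the library source):

defmodule DSPEx.Program do
  @doc "Runs a program struct against a map of inputs and returns the outputs."
  @callback forward(program :: struct(), inputs :: map()) :: map()

  # Dispatch to the module that defines the given program struct.
  def forward(%module{} = program, inputs), do: module.forward(program, inputs)
end
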
This allows for powerful composition. Let’s build a simple RAG program:

defmodule MyApp.RAG do
  @behaviour DSPEx.Program

  # Assume these are defined elsewhere
  alias MyApp.Retrieve
  alias MyApp.GenerateAnswer

  defstruct [:retriever, :generator]

  @impl true
  def forward(program, inputs) do
    # 1. Retrieve context
    retrieved_passages = DSPEx.Program.forward(program.retriever, %{query: inputs.question})

    # 2. Generate answer using the retrieved context
    final_inputs = Map.merge(inputs, %{context: retrieved_passages.passages})
    DSPEx.Program.forward(program.generator, final_inputs)
  end
end

Here, MyApp.RAG composes two other programs (retriever and generator), orchestrating the flow of data between them.
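
Composing and running the pipeline could then look like the following (the retriever construction is illustrative; any program implementing the behaviour will do):

rag = %MyApp.RAG{
  retriever: %MyApp.Retrieve{index: :wiki_passages},
  generator: %DSPEx.Predict{signature: MyApp.Signatures.SimpleQA, client: MyApp.OpenAI_Client}
}

prediction = DSPEx.Program.forward(rag, %{question: "Why does the BEAM scale so well?"})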

DSPEx.Signature (The Macro)

As seen before, use DSPEx.Signature is the primary way to define the I/O contract for a Program. You can add more detail to each field to provide better instructions to the LLM.

defmodule MyApp.Signatures.DetailedQA do
  @moduledoc "You are a helpful AI assistant..."

  # The signature string and the per-field options are grouped into one term
  # so that `use` keeps its two-argument form (this grouping is an assumption
  # about the macro's interface).
  use DSPEx.Signature, {
    "context, question -> answer",
    context: [
      desc: "Factual passages that may contain the answer.",
      type: {:list, :string} # Type hints for clarity and potential validation
    ],
    question: [
      desc: "A specific question about the context."
    ],
    answer: [
      desc: "A concise answer, typically 1-2 sentences."
    ]
  }
end

This compile-time transformation ensures that your program’s data structures are well-defined and efficient at runtime, as illustrated in Diagram 4.

DSPEx.Client (The GenServer)

Clients are the bridge to external services. They are designed as OTP GenServers for several reasons:

  1. State Management: They hold configuration like API keys and base URLs in their process state, isolated from the rest of the application.
  2. Concurrency Control: They can manage a pool of connections or a request queue to handle rate limiting gracefully.
  3. Supervision: As part of the application’s supervision tree, they are automatically restarted if they crash, ensuring service availability.

When your program calls DSPEx.Program.forward, which in turn calls a client, the call is actually a message to this GenServer process. The GenServer then spawns a separate, short-lived Task to handle the blocking I/O (the HTTP request), freeing the client to accept other requests and preventing the I/O from blocking a critical part of your system. This is the pattern shown in Diagram 2.
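
Stripped to its essentials, the pattern looks roughly like this (a sketch, not the library's actual client module; the HTTP call is a stub):

defmodule MyApp.SketchClient do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: opts[:name])

  # Client API: callers send a request and block until the reply arrives.
  def request(client, prompt), do: GenServer.call(client, {:request, prompt}, 30_000)

  @impl true
  def init(opts), do: {:ok, %{api_key: Keyword.fetch!(opts, :api_key)}}

  @impl true
  def handle_call({:request, prompt}, from, state) do
    # Hand the blocking HTTP call to a short-lived task so this GenServer
    # immediately returns to its loop and can accept other requests.
    Task.start(fn ->
      GenServer.reply(from, do_http_request(prompt, state.api_key))
    end)

    # The reply is sent later, from the task, via GenServer.reply/2.
    {:noreply, state}
  end

  # Stub standing in for the real HTTP request.
  defp do_http_request(_prompt, _api_key), do: {:ok, "stubbed response"}
end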


Part 3: Leveraging the BEAM

This is where DSPEx truly shines compared to its Python counterpart.

Massively Parallel Evaluation

In DSPy, evaluating a program on a dataset uses a thread pool. In DSPEx, we use the BEAM’s lightweight processes, which start with a footprint of only a few kilobytes and can be spawned by the hundreds of thousands on a single node.

The DSPEx.Evaluate.run/4 function leverages Task.async_stream to achieve this:

# dev_set is a list of DSPEx.Example structs.
# MyMetric.calculate/2 is your evaluation function.
# my_program is a module implementing the DSPEx.Program behaviour.

scores = DSPEx.Evaluate.run(my_program, dev_set, &MyMetric.calculate/2, max_concurrency: 100)

How it works (under the hood):

  1. DSPEx.Evaluate.run takes your program and the list of examples.
  2. It uses Task.async_stream to map over the dev_set.
  3. For each example in the dev_set, Task.async_stream spawns a new, completely isolated process.
  4. The BEAM scheduler runs these processes concurrently across all available CPU cores. If an LLM call involves a 1-second wait, the BEAM can manage tens of thousands of these “waiting” processes with minimal overhead.
  5. If one evaluation process fails (e.g., due to a malformed example or a non-transient API error), it does not affect any other evaluation.

This architecture provides superior throughput and isolation for large-scale evaluations and optimizations, as visualized in the “Evaluation Worker Pool” in Diagram 3.
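
A simplified sketch of such an evaluation loop (illustrative only; the example.inputs field and the option handling are assumptions):

def run(program, examples, metric_fun, opts \\ []) do
  max_concurrency = Keyword.get(opts, :max_concurrency, System.schedulers_online() * 2)

  examples
  |> Task.async_stream(
    fn example ->
      prediction = DSPEx.Program.forward(program, example.inputs)
      metric_fun.(example, prediction)
    end,
    max_concurrency: max_concurrency,
    timeout: 60_000,
    on_timeout: :kill_task
  )
  |> Enum.map(fn
    {:ok, score} -> score
    {:exit, _reason} -> nil # a failed or timed-out evaluation never affects the others
  end)
end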

Resilience and “Let It Crash”

Consider a network hiccup during an LLM API call.

  • In traditional approaches: You would wrap the call in a try/catch block with complex retry logic, cluttering your core program.
  • In DSPEx: The Task process making the HTTP request is allowed to crash. The ProgramProcess that spawned it (see Diagram 2) is linked and can be configured to trap exits. When it receives the exit signal, its logic (not the LLM client’s logic) can decide to:
    1. Log the error.
    2. Retry the task immediately.
    3. Retry the task after an exponential backoff.
    4. Give up and return an error for that specific request.

This keeps the core business logic clean and delegates failure handling to dedicated supervisors, a cornerstone of OTP design.
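
A condensed sketch of that retry decision living in the caller rather than in the client (MyApp.TaskSupervisor and MyApp.HTTP are assumed names, not part of DSPEx):

defp call_llm_with_retry(request, attempts \\ 3) do
  task =
    Task.Supervisor.async_nolink(MyApp.TaskSupervisor, fn ->
      # A network error may crash this task, but only this task.
      MyApp.HTTP.post!(request)
    end)

  case Task.yield(task, 30_000) || Task.shutdown(task) do
    {:ok, response} ->
      {:ok, response}

    {:exit, _reason} when attempts > 1 ->
      # The crash never reached the calling process; back off and try again.
      Process.sleep(500)
      call_llm_with_retry(request, attempts - 1)

    _ ->
      {:error, :llm_request_failed}
  end
end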

Stateful Optimizers (Teleprompter)

Optimizers like MIPROv2 or GRPO are long-running, stateful jobs. In DSPEx, these are robust GenServers.

  • State: The GenServer holds the state of the optimization (e.g., candidate prompts, scores, generation counts). This state is protected from concurrent access.
  • Work: For each optimization step, the GenServer spawns thousands of evaluation Tasks in parallel to score candidate programs (as seen in Diagram 3).
  • Resilience: If the node running the DSPEx application were to restart, the main application supervisor would restart the Teleprompter GenServer. With persistence logic added (e.g., periodically saving state to disk), the optimization could even resume from where it left off.
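
A skeletal illustration of that shape (the module name, state fields, and the scoring stub are all hypothetical):

defmodule MyApp.Teleprompter do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts) do
    # All optimization state lives in this process, so nothing else can mutate it.
    {:ok, %{candidates: Keyword.get(opts, :candidates, []), scores: %{}, generation: 0}}
  end

  @impl true
  def handle_call(:optimize_step, _from, state) do
    # Spawn a concurrent scoring task for every candidate program.
    scores =
      state.candidates
      |> Task.async_stream(&evaluate_candidate/1, max_concurrency: 50, timeout: 60_000)
      |> Enum.map(fn {:ok, score} -> score end)

    new_state = %{
      state
      | scores: Map.new(Enum.zip(state.candidates, scores)),
        generation: state.generation + 1
    }

    {:reply, new_state.scores, new_state}
  end

  # Stand-in for scoring a candidate; a real optimizer would run an
  # evaluation like DSPEx.Evaluate.run against a dev set here.
  defp evaluate_candidate(_candidate), do: :rand.uniform()
end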