Of course. This is the perfect point to consolidate the refined strategy into a set of ground-up technical documents. The goal is to create a clear, pragmatic, and technically sound proposal that can be presented to senior Elixir engineers.

This first document will provide a detailed implementation plan that can also serve as a progress checklist. It maps our proposed Elixir components to their counterparts in the existing Python `dspy` codebase, ensuring we capture the essential logic while leveraging Elixir’s strengths. It concludes with a deep-dive specification for the most foundational component: the resilient API client.
# Document 1: The Resilient Client & Core Execution Engine

**Objective:** To provide a detailed, actionable specification for the foundational layer of DSPEx. This includes a resilient, stateful, and concurrent API client, and the core `Program` behaviour that all executable modules will implement. This document serves as the primary technical guide for building and testing the bedrock of the framework.
## Dependency Graph

This graph illustrates the layered dependencies of the MVP components. We will build from the bottom up, starting with the `Client` and the `Primitives`.
## Implementation Progress Checklist

This section breaks down the MVP into concrete implementation tasks, mapping each to its corresponding logic in the Python `dspy` codebase.
- [ ] **Layer 0: The Foundation**
  - [ ] **`DSPEx.Client` (GenServer)**
    - **Description:** A stateful, resilient wrapper around HTTP calls to LLM providers. Manages API keys, caching, and circuit breaking for a specific provider configuration.
    - **Mapping to `dspy`:** This component centralizes logic that is distributed in `dspy`.
      - **HTTP Request Logic:** Maps to the `litellm.completion` and `litellm.acompletion` calls within `dspy/clients/lm.py`. Our `http_post` function will be a `Req`-based implementation of this.
      - **Caching:** Maps to the `@request_cache` decorator and `dspy.Cache` class found in `dspy/clients/cache.py`. Our `handle_call` will use `Cachex` to achieve the same result.
      - **Resilience:** This is an enhancement. `dspy` relies on LiteLLM’s built-in retry mechanisms. `DSPEx.Client` will use `Fuse` for more robust, stateful circuit breaking, which is a significant improvement.
      - **State Management (API Keys, etc.):** Maps to the `__init__` method of `dspy.LM` in `dspy/clients/lm.py`, where `model`, `api_key`, and other `kwargs` are stored on the object instance. Our `GenServer` state will hold this.
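To make the shape of this component concrete, here is a hypothetical sketch of how a client might be started under a supervisor and called by name. The option names follow the `start_link/1` spec later in this document; the adapter module and all config values are illustrative, not finalized API:

```elixir
# In the application's supervision tree (DSPEx.Adapter.Chat and all
# config values are illustrative placeholders).
children = [
  {DSPEx.Client,
   name: :openai_client,
   adapter: DSPEx.Adapter.Chat,
   config: %{
     api_key: System.fetch_env!("OPENAI_API_KEY"),
     base_url: "https://api.openai.com/v1/chat/completions",
     model: "gpt-4o-mini"
   }}
]

Supervisor.start_link(children, strategy: :one_for_one)

# Callers then address the client by its registered name:
DSPEx.Client.request(:openai_client, [%{role: "user", content: "Hello!"}])
```

Because each provider gets its own named process, cache, and fuse, a second client (e.g., for Anthropic) would be a sibling child with no shared failure state.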
- [ ] **Layer 1: Abstractions & Primitives**
  - [ ] **`DSPEx.Program` (Behaviour)**
    - **Description:** A behaviour that defines the `forward/2` and `configure/2` contract for all executable modules. This ensures any `DSPEx` module can be used by the `Evaluate` and `Teleprompter` components.
    - **Mapping to `dspy`:** Corresponds to the `dspy.Module` class in `dspy/primitives/program.py`. The `forward` method is a direct conceptual map. The `configure` callback is an Elixir-idiomatic equivalent of creating a new, modified instance of a module.
  - [ ] **`DSPEx.Signature` (Macro)**
    - **Description:** A compile-time macro (`defsignature`) for declaratively defining the I/O contract of a program.
    - **Mapping to `dspy`:** Maps to the `dspy.Signature` metaclass implementation in `dspy/signatures/signature.py`. The Python version uses Pydantic fields and a metaclass to build the signature at class definition time. Our Elixir macro will achieve the same goal of compile-time definition, but produce a more lightweight struct and module.
  - [ ] **`DSPEx.Adapter` (Behaviour)**
    - **Description:** Defines a contract for translating a `Signature` and input data into an LLM-specific message format and parsing the response back.
    - **Mapping to `dspy`:** Maps to the `dspy.Adapter` base class and its subclasses (`ChatAdapter`, `JSONAdapter`) in `dspy/adapters/`. Our `format/3` callback is equivalent to the `format` method, and our `parse/2` is equivalent to the `parse` method.
  - [ ] **`DSPEx.Example` & `DSPEx.Prediction` (Structs)**
    - **Description:** Standardized data containers for training/dev data (`Example`) and program outputs (`Prediction`).
    - **Mapping to `dspy`:** `DSPEx.Example` maps directly to `dspy.Example` in `dspy/primitives/example.py`; the concept of `.with_inputs()` is identical. `DSPEx.Prediction` maps directly to `dspy.Prediction` in `dspy/primitives/prediction.py`; the idea of storing inputs, outputs, and raw responses is the same.
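To ground these abstractions, here is a hypothetical sketch of the `Program` behaviour contract and a signature declaration. The `defsignature` surface syntax and the `QA` module are purely illustrative; only the callback names (`forward/2`, `configure/2`) come from the spec above:

```elixir
defmodule DSPEx.Program do
  # The behaviour every executable module implements. Return shapes are a
  # working assumption, not a finalized contract.
  @callback forward(program :: struct(), inputs :: map()) ::
              {:ok, DSPEx.Prediction.t()} | {:error, term()}
  @callback configure(program :: struct(), opts :: keyword()) :: struct()
end

defmodule QA do
  # Illustrative: one possible defsignature syntax. The macro would expand
  # this at compile time into a lightweight struct plus metadata functions
  # (e.g., input_fields/0 and output_fields/0).
  use DSPEx.Signature

  defsignature question: :input, answer: :output
end
```

The key design point is that, unlike Python's mutable module instances, `configure/2` returns a new struct, keeping programs immutable values that are safe to share across concurrent evaluations.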
- [ ] **Layer 2: Core Execution**
  - [ ] **`DSPEx.Predict`**
    - **Description:** The simplest `Program` implementation. Orchestrates a single call to an LLM via the `Adapter` and `Client`.
    - **Mapping to `dspy`:** Maps to the `dspy.Predict` class in `dspy/predict/predict.py`. Its `forward` method is a direct conceptual equivalent to our `forward/2` implementation, orchestrating the call to the adapter and the LM.
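The orchestration described above can be sketched as follows. This is a working assumption of the final shape: the struct fields, the `format/3` and `parse/2` adapter calls, and the `DSPEx.Prediction` keys are illustrative:

```elixir
defmodule DSPEx.Predict do
  @behaviour DSPEx.Program

  # Illustrative struct: which signature to render, which client to call,
  # which adapter to translate with, plus any few-shot demos.
  defstruct [:signature, :client, :adapter, demos: []]

  @impl true
  def forward(%__MODULE__{} = program, inputs) do
    # 1. The adapter renders signature + demos + inputs into chat messages.
    messages = program.adapter.format(program.signature, inputs, program.demos)

    # 2. The client makes the cached, circuit-broken LLM call.
    with {:ok, response} <- DSPEx.Client.request(program.client, messages),
         # 3. The adapter parses the raw response into the output fields.
         {:ok, outputs} <- program.adapter.parse(program.signature, response) do
      {:ok, %DSPEx.Prediction{inputs: inputs, outputs: outputs, raw: response}}
    end
  end

  @impl true
  def configure(program, opts), do: struct(program, opts)
end
```

Note that `forward/2` is a pure pipeline over three collaborators, which is what makes it trivial to test each layer in isolation.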
- [ ] **Layer 3: Evaluation**
  - [ ] **`DSPEx.Evaluate`**
    - **Description:** A concurrent evaluation engine that runs a program against a dev set using `Task.async_stream`.
    - **Mapping to `dspy`:** Maps to the `dspy.Evaluate` class in `dspy/evaluate/evaluate.py`. The key difference is the concurrency model. `dspy` uses a `ParallelExecutor` with a `ThreadPoolExecutor` (`dspy/utils/parallelizer.py`), whereas we will use OTP’s `Task` primitives for superior I/O-bound concurrency and fault isolation.
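The concurrency model above can be sketched directly with `Task.async_stream`. The `run/4` name, the metric signature, and the scoring of failures as `0.0` are assumptions for illustration:

```elixir
defmodule DSPEx.Evaluate do
  @doc """
  Runs `program` against `devset` concurrently and averages `metric` scores.
  A sketch: assumes `program` implements the `DSPEx.Program` behaviour and
  `metric` is a fun taking (example, prediction) and returning a number.
  """
  def run(program, devset, metric, opts \\ []) do
    max_concurrency = Keyword.get(opts, :max_concurrency, 32)

    scores =
      devset
      |> Task.async_stream(
        fn example ->
          # Rescue per-example crashes so one bad row cannot abort the run.
          try do
            {:ok, prediction} = program.__struct__.forward(program, example.inputs)
            metric.(example, prediction)
          rescue
            _error -> 0.0
          end
        end,
        max_concurrency: max_concurrency,
        # Slow examples are killed and surfaced as {:exit, :timeout}
        # instead of blocking the whole stream.
        on_timeout: :kill_task,
        timeout: 60_000
      )
      |> Enum.map(fn
        {:ok, score} -> score
        {:exit, :timeout} -> 0.0
      end)

    Enum.sum(scores) / max(length(scores), 1)
  end
end
```

Because `max_concurrency` bounds in-flight requests, this doubles as a rate limiter in front of the provider API.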
- [ ] **Layer 4: Optimization**
  - [ ] **`DSPEx.Teleprompter.BootstrapFewShot`**
    - **Description:** An optimizer that generates few-shot examples for a `student` program by running a `teacher` program over a training set.
    - **Mapping to `dspy`:** Maps to the `dspy.BootstrapFewShot` class in `dspy/teleprompt/bootstrap.py`. The core logic of the `compile` method (running a teacher, filtering successful traces, and creating demos) is identical. Our implementation will leverage our concurrent `DSPEx.Evaluate` engine to do this efficiently.
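The compile loop can be sketched as follows. The function arity, the boolean metric, and the demo construction are illustrative assumptions about the eventual API:

```elixir
# A sketch of BootstrapFewShot.compile: run the teacher over the trainset
# concurrently, keep only examples the metric accepts, and hand the
# resulting demos to the student via configure/2.
def compile(student, teacher, trainset, metric, max_demos \\ 4) do
  demos =
    trainset
    |> Task.async_stream(fn example ->
      with {:ok, prediction} <- teacher.__struct__.forward(teacher, example.inputs),
           true <- metric.(example, prediction) do
        {:demo, %DSPEx.Example{inputs: example.inputs, outputs: prediction.outputs}}
      else
        # Failed calls or rejected traces are simply discarded.
        _ -> :discard
      end
    end)
    |> Enum.flat_map(fn
      {:ok, {:demo, demo}} -> [demo]
      _ -> []
    end)
    |> Enum.take(max_demos)

  # configure/2 returns a new, immutable student carrying the demos.
  student.__struct__.configure(student, demos: demos)
end
```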
## Deep Dive Specification: `DSPEx.Client`

This is the most foundational runtime component. It must be robust, efficient, and testable in isolation.

**File: `lib/dspex/client.ex`**
```elixir
defmodule DSPEx.Client do
  @moduledoc """
  A stateful, resilient GenServer for making requests to an LLM API.

  Each `Client` process is configured for a specific LLM provider and model
  (e.g., OpenAI's gpt-4o) and manages its own cache, API keys, and
  circuit breaker state. This ensures that contention or failures with one
  provider do not affect others.
  """

  use GenServer
  require Logger

  # =================================================================
  # Public API
  # =================================================================

  @doc """
  Starts a `DSPEx.Client` GenServer.

  ## Options

    * `:name` (required) - A unique atom to register the client process.
    * `:adapter` (required) - The adapter module used for formatting requests.
    * `:config` (required) - A map containing provider-specific configuration:
      * `:api_key` - The API key.
      * `:base_url` - The API base URL.
      * `:model` - The specific model string (e.g., "gpt-4o-mini").
  """
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  @doc """
  Makes a synchronous request to the LLM API.

  This is the primary way to interact with the client. It will block until
  a response is received, the request times out, or the circuit breaker
  is open.
  """
  def request(client_name, messages, opts \\ %{}) when is_atom(client_name) do
    # Default timeout of 30 seconds for the GenServer call.
    GenServer.call(client_name, {:request, messages, opts}, 30_000)
  end

  # =================================================================
  # GenServer Callbacks
  # =================================================================

  @impl true
  def init(opts) do
    # Extract config from opts passed to start_link/1.
    name = Keyword.fetch!(opts, :name)
    config = Keyword.fetch!(opts, :config)

    # Set up names for dependent services, scoped to this client.
    fuse_name = :"fuse_#{name}"
    cache_name = :"cache_#{name}"

    # Start the cache for this client (linked to this process).
    {:ok, _cache_pid} = Cachex.start_link(cache_name)

    # Install the circuit breaker: if the request fails 5 times within a
    # 10-second window, the fuse blows (opens). It stays open for 5 seconds
    # before allowing requests again.
    :ok = :fuse.install(fuse_name, {{:standard, 5, 10_000}, {:reset, 5_000}})

    Logger.info("DSPEx.Client #{inspect(name)} started.")

    state = %{
      name: name,
      config: config,
      fuse_name: fuse_name,
      cache_name: cache_name
    }

    {:ok, state}
  end

  @impl true
  def handle_call({:request, messages, _opts}, _from, state) do
    # The core logic: try the cache, then the network via the fuse.
    cache_key = build_cache_key(messages, state.config)

    reply =
      state.cache_name
      |> Cachex.fetch(cache_key, fn _key ->
        # This function is only executed on a cache miss. The circuit
        # breaker is consulted before any network call; failures melt the
        # fuse and are never cached.
        case :fuse.ask(state.fuse_name, :sync) do
          :ok ->
            case http_post(messages, state.config) do
              {:ok, _body} = ok ->
                {:commit, ok}

              {:error, _reason} = error ->
                :fuse.melt(state.fuse_name)
                {:ignore, error}
            end

          :blown ->
            {:ignore, {:error, {:fuse_melted, state.fuse_name}}}
        end
      end)
      |> case do
        {:ok, value} -> value
        {:commit, value} -> value
        {:ignore, value} -> value
        {:error, reason} -> {:error, reason}
      end

    {:reply, reply, state}
  end

  # =================================================================
  # Private Helpers
  # =================================================================

  @spec http_post(list(map()), map()) :: {:ok, map()} | {:error, term()}
  defp http_post(messages, config) do
    body = %{
      model: config.model,
      messages: messages
      # Add other parameters like temperature, max_tokens if needed.
    }

    headers = [
      {"Authorization", "Bearer #{config.api_key}"},
      {"Content-Type", "application/json"}
    ]

    case Req.post(config.base_url, json: body, headers: headers) do
      {:ok, %{status: 200, body: response_body}} ->
        # On success, return the parsed body.
        {:ok, response_body}

      {:ok, %{status: status, body: body}} ->
        Logger.error("LLM API Error: Status #{status}, Body: #{inspect(body)}")
        {:error, {:api_error, status}}

      {:error, reason} ->
        Logger.error("HTTP Request Error: #{inspect(reason)}")
        {:error, {:http_error, reason}}
    end
  end

  @spec build_cache_key(any(), map()) :: binary()
  defp build_cache_key(messages, config) do
    # Hash a deterministic term so that identical requests produce the
    # same cache key.
    :crypto.hash(:sha256, :erlang.term_to_binary({messages, config.model}))
  end
end
```
## Testing Strategy for `DSPEx.Client`

This module can and should be tested in complete isolation before any other components are built.

1. **Unit Test with `Mox`:**
   - Define a `ReqMock` behaviour and use `Mox` to mock the `Req.post/2` function.
   - Test Case 1 (Success): Assert that `DSPEx.Client.request/2` calls `Req.post/2` with the correct URL, headers, and body. Verify that it returns `{:ok, response_body}` when the mock returns a 200 OK.
   - Test Case 2 (API Error): Configure the `ReqMock` to return a 4xx or 5xx status. Assert that the `request/2` call returns `{:error, {:fuse_melted, _}}` after the failure threshold is reached.
   - Test Case 3 (Network Error): Configure the `ReqMock` to return `{:error, :econnrefused}`. Assert the behaviour is the same as the API error case.
2. **Cache Integration Test:**
   - Using a real `Cachex` instance, call `DSPEx.Client.request/2` twice with the exact same arguments.
   - Use `Mox` to `expect` the `ReqMock.post/2` function to be called exactly once.
   - Verify that the result of both calls is identical.
3. **Fuse Integration Test:**
   - Set the fuse's failure threshold to a low number (e.g., 2) for the test environment.
   - Configure the `ReqMock` to consistently return an error.
   - Call `request/2` three times. Assert that the first two calls return an error after a delay (as they attempt the HTTP call) and that the third call returns `{:error, {:fuse_melted, _}}` almost instantly, proving the circuit is open.
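As a sketch, the success-path unit test might look like this. It assumes the client is compiled to call a configurable HTTP module (`ReqMock` in the test environment), since `Mox` can only mock explicit behaviours; the adapter module and response body are illustrative:

```elixir
defmodule DSPEx.ClientTest do
  use ExUnit.Case, async: false
  import Mox

  # Ensure every expectation set in a test was actually exercised.
  setup :verify_on_exit!

  test "returns {:ok, body} on a 200 response" do
    # ReqMock stands in for the real HTTP layer; how the client resolves
    # it (app config, compile-time switch) is an open design decision.
    expect(ReqMock, :post, fn _url, _opts ->
      {:ok, %{status: 200, body: %{"choices" => []}}}
    end)

    start_supervised!(
      {DSPEx.Client,
       name: :test_client,
       adapter: FakeAdapter,
       config: %{api_key: "test", base_url: "http://localhost", model: "test-model"}}
    )

    assert {:ok, %{"choices" => []}} =
             DSPEx.Client.request(:test_client, [%{role: "user", content: "hi"}])
  end
end
```

The same structure extends to the error cases: swap the `expect/3` return value and assert on the `{:error, ...}` shapes described above.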