← Back to 006 foundation

02 gemini

Documentation for 02_gemini from the Ds ex repository.

Excellent. Let’s continue the deep dive into the foundational module, DSPEx.Client.

The previous response established its architecture. Now, we will focus on the implementation specifics and the rationale behind the Elixir-native patterns, contrasting them directly with the Python source to provide a clear implementation guide.

DSPEx.Client: A Deeper Dive into the Foundation

The core philosophy behind DSPEx.Client is to treat each configured LLM provider as a resilient, stateful, and supervised microservice within the application. This contrasts with the Python approach where an LM object is a more passive container for configuration.

The Rationale for a GenServer-based Design

Using a GenServer is not just an Elixir-for-the-sake-of-Elixir choice; it provides tangible benefits that are perfectly suited for this problem:

  1. State Isolation: Each client process (e.g., :openai_client, :anthropic_client) has its own state, including its API key, cache, and circuit breaker status. A series of failures causing the :anthropic_client’s fuse to melt will have zero impact on the :openai_client, which continues to operate normally. This is a level of runtime isolation that is difficult to achieve with simple class instances in Python.
  2. Controlled Concurrency: While many different application processes can call DSPEx.Client.request/2 concurrently, the GenServer’s single-threaded message loop ensures that access to its state (especially the cache and fuse) is serialized and safe without needing explicit locks.
  3. Supervision and Fault Tolerance: In a full OTP application, these Client processes would be started under a supervisor. If a Client crashes due to a catastrophic, unexpected error (e.g., a bug in the JSON parsing library for a malformed response), the supervisor would automatically restart it to a clean initial state, making the system self-healing.

Detailed Implementation Breakdown & Code Mapping

Let’s walk through the DSPEx.Client module, function by function, and map its logic to the dspy codebase.

File: lib/dspex/client.ex

1. Public API: start_link/1 and request/2

defmodule DSPEx.Client do
  use GenServer
  # ... aliases and require Logger ...

  @doc "Starts and links a Client GenServer."
  def start_link(opts) do
    # Name is mandatory for other processes to find this client.
    name = Keyword.fetch!(opts, :name)
    GenServer.start_link(__MODULE__, opts, name: name)
  end

  @doc "Makes a synchronous API request."
  def request(client_name, messages, opts \\ %{}) when is_atom(client_name) do
    # A 30-second timeout is a sensible default for an LLM call.
    GenServer.call(client_name, {:request, messages, opts}, 30_000)
  end

  # ... GenServer callbacks below ...
end
  • start_link/1: This is the standard entry point for starting an OTP process. The :name option is crucial, as it registers the process with a well-known name (e.g., :gpt4o_client), allowing other parts of the application to interact with it without knowing its process ID (PID).

    • Python Mapping: Conceptually maps to the instantiation of an LM object: my_lm = dspy.LM(...) in dspy/clients/lm.py. In Elixir, we are not just creating a data structure, but spawning a living process.
  • request/2: This is the user-facing function. It’s a synchronous GenServer.call, meaning the calling process will block and wait for a reply. This is the desired behavior for a Predict module that needs the LLM’s answer to proceed.

    • Python Mapping: This directly maps to an LM instance being called like a function: my_lm(prompt=...), which triggers the __call__ method in dspy/clients/base_lm.py (Lines 61-64).

2. GenServer Callbacks: init/1 and handle_call/3

# lib/dspex/client.ex (continued)

@impl true
def init(opts) do
  name = Keyword.fetch!(opts, :name)
  config = Keyword.fetch!(opts, :config)

  # Scope dependencies to this specific client process
  fuse_name = :"fuse_#{name}"
  cache_name = :"cache_#{name}"

  # Start a dedicated cache for this client
  # This isolates the cache for different models/providers
  :ok = Cachex.start_link(name: cache_name)

  # Install a dedicated circuit breaker
  fuse_opts = [
    strategy: {Fuse.Strategy.ExponentialBackoff, {5_000, 2.0}}, # 5s base, x2 factor
    reset_timeout: 10_000, # Tries to reset after 10s of being open
    failure_threshold: 5   # Melts after 5 failures in the `reset_timeout` window
  ]
  :ok = Fuse.install(fuse_name, fuse_opts)

  Logger.info("DSPEx.Client #{inspect(name)} started.")

  state = %{
    name: name,
    config: config,
    fuse_name: fuse_name,
    cache_name: cache_name
  }

  {:ok, state}
end

@impl true
def handle_call({:request, messages, _opts}, _from, state) do
  cache_key = build_cache_key(messages, state.config)

  response =
    Cachex.get_or_set(state.cache_name, cache_key, fn ->
      # This function is ONLY executed on a cache miss.
      Fuse.call({:default, state.fuse_name}, fn ->
        # The network call is wrapped by the fuse.
        http_post(messages, state.config)
      end)
    end)

  {:reply, response, state}
end
  • init/1: This is the constructor for the process. It sets up all dependencies.

    • Python Mapping: This aggregates setup logic found in multiple places. The API key/model config comes from dspy.LM.__init__ (clients/lm.py, Lines 17-37). The cache setup is analogous to the global DSPY_CACHE initialization in dspy/clients/__init__.py (Lines 36-41), but here it’s per-process, providing better isolation. The Fuse setup is a direct replacement and enhancement for the num_retries logic passed to litellm (clients/lm.py, Line 168).
  • handle_call/3: This is the core logic, beautifully expressing the “Cache -> Fuse -> Network” pipeline.

    • Python Mapping: This is the heart of the operation. It maps to the @request_cache decorator in dspy/clients/cache.py (Lines 116-172) and the litellm.completion call in dspy/clients/lm.py (Lines 166-174). The Elixir version makes the flow explicit and linear rather than spread across decorators and function calls.

3. Private Helpers: http_post/2 and build_cache_key/2

# lib/dspex/client.ex (continued)

defp http_post(messages, config) do
  body = %{ model: config.model, messages: messages }
  headers = [
    {"Authorization", "Bearer #{config.api_key}"},
    {"Content-Type", "application/json"}
  ]

  case Req.post(config.base_url, json: body, headers: headers) do
    # The happy path
    {:ok, %{status: 200, body: response_body}} ->
      {:ok, response_body}

    # Any other successful HTTP call with a non-200 status is an API error
    {:ok, %{status: status, body: body}} ->
      Logger.error("LLM API Error: Status #{status}, Body: #{inspect(body)}")
      # We MUST raise here for Fuse to register it as a failure.
      raise "API request failed with status #{status}"

    # A network-level error
    {:error, reason} ->
      Logger.error("HTTP Request Error: #{inspect(reason)}")
      raise "HTTP request failed: #{inspect(reason)}"
  end
end

defp build_cache_key(messages, config) do
  # The term must be deterministic. Maps with atom keys are fine here.
  term_to_hash = {messages, config.model}
  # Create a secure hash of the binary representation of the term.
  :crypto.hash(:sha256, :erlang.term_to_binary(term_to_hash))
end
  • http_post/2: The direct network interaction layer.

    • Python Mapping: This is the Elixir replacement for the litellm.completion call (clients/lm.py, Lines 166-174). By using Req directly, we gain full control and avoid an extra dependency. The crucial part is raise. In dspy, litellm handles retries internally and then may raise its own exception. Here, we explicitly raise on any non-200 response to signal a failure to Fuse.
  • build_cache_key/2: The strategy for creating a unique, repeatable cache key.

    • Python Mapping: This maps to Cache.cache_key in dspy/clients/cache.py (Lines 61-88). The Python version manually traverses the dictionary to serialize it to JSON for hashing. Elixir’s :erlang.term_to_binary/1 is a more powerful, built-in primitive that can deterministically serialize almost any Elixir term, making the implementation simpler and more robust.

Conclusion: A More Resilient Foundation

By replacing the simple dspy.LM class with a supervised, stateful DSPEx.Client GenServer, the Elixir port establishes a foundation that is inherently more resilient, concurrent, and observable than its Python counterpart. This design fully embraces the strengths of the BEAM to handle the challenges of distributed systems, such as interacting with potentially unreliable third-party LLM APIs. This foundational module is now fully specified and ready for implementation and testing.