← Back to 004 foundation

05 gemini focusedWithDeps

Documentation for 05_gemini_focusedWithDeps from the Ds ex repository.

Excellent. This is the perfect pivot. The community feedback, while direct, provides a valuable opportunity to ground the project in pragmatic, ecosystem-aware principles. We will now create a detailed design for the Core Execution and Optimization Loop of DSPEx, explicitly leveraging the fantastic tools the Elixir community has already built.

This plan is focused, achievable, and demonstrates a deep respect for the existing ecosystem, addressing the core technical concerns that were raised.


Technical Design: DSPEx Core Execution & Optimization Loop

1. Core Philosophy & Guiding Principles

This design adheres to a “leverage, don’t reinvent” philosophy. The value of DSPEx is not in building generic infrastructure, but in providing a BEAM-native implementation of the DSPy programming model for orchestrating and optimizing AI programs.

  1. Minimal Core: We will build the smallest possible set of components needed to replicate DSPy’s Predict -> Evaluate -> BootstrapFewShot workflow.
  2. Ecosystem First: We will use best-in-class, community-vetted libraries for all non-domain-specific tasks (HTTP, caching, resilience, etc.).
  3. Pragmatic Concurrency: We will apply OTP’s powerful concurrency features where they provide the most leverage: in parallel evaluation and optimization, which is the true bottleneck in the DSPy workflow.

2. Core Components & Chosen Libraries

This is the stack for our MVP. It is lean, robust, and uses the tools an experienced Elixir developer would reach for.

DSPEx ComponentRoleChosen Elixir LibraryRationale
DSPEx.ClientHTTP communication with LM providersReqModern, composable, and built on top of Finch for high-performance connection pooling. Perfect for our needs.
DSPEx.Client (Resilience)Circuit Breaker for external API callsFuseThe standard, battle-tested library for implementing circuit breakers in Elixir. Simple and effective.
DSPEx.CacheIn-memory caching for LM responsesCachexA powerful, fast, and mature caching library with TTL support, which is all we need for the initial implementation.
DSPEx.Evaluate (Concurrency)Parallel execution of program over a datasetTask.async_streamThe native OTP solution for “embarrassingly parallel” I/O-bound tasks. More efficient and scalable than thread pools.
DSPEx.Evaluate (UX)Console progress barprogress_barThe standard library for providing user feedback during long-running command-line tasks.
DSPEx.Shared (Parsing)JSON parsingJasonThe de-facto standard for high-performance JSON processing in Elixir.

3. Architectural Diagrams

Diagram 1: Static Component Architecture

This diagram shows the relationship between our DSPEx modules and the external libraries they will leverage.

graph TD subgraph "User's Application" A["User Code
(e.g., \`MyProgram.run()\`)"] end subgraph "DSPEx Core Logic" B["DSPEx.Predict Module"] C["DSPEx.Evaluate Module"] D["DSPEx.Client GenServer"] E["DSPEx.Cache GenServer"] end subgraph "Elixir Ecosystem Libraries" F["Req (HTTP Client)"] G["Fuse (Circuit Breaker)"] H["Cachex (Cache Store)"] I["Task (OTP Concurrency)"] J["Jason (JSON Parser)"] end A --> B A --> C C -- "Uses" --> B C -- "Uses" --> I B -- "Uses" --> D B -- "Uses" --> E D -- "Wraps calls in" --> G G -- "Executes" --> F F -- "Uses" --> J E -- "Uses" --> H style D fill:#f9f,stroke:#333,stroke-width:2px style E fill:#f9f,stroke:#333,stroke-width:2px
Diagram 2: Dynamic Flow of DSPEx.Evaluate.run/3

This sequence diagram illustrates the power of the BEAM. The Evaluate module orchestrates a massively parallel workload, where each task is independently resilient.

sequenceDiagram participant User participant Evaluate participant Task_1 participant Task_N participant Predict participant Client participant Fuse User->>+Evaluate: run(program, dev_set, metric) Note over Evaluate: Starts Task.async_stream over dev_set par Evaluate->>+Task_1: spawn(fn -> Predict.forward(program, example_1) end) Task_1->>+Predict: Predict.forward(program, example_1) Predict->>+Client: request(messages) Client->>+Fuse: call(..., fn -> Req.post(...) end) Fuse-->>-Client: {:ok, response} Client-->>-Predict: {:ok, response} Predict-->>-Task_1: {:ok, prediction} Task_1-->>-Evaluate: {:ok, result_1} and Evaluate->>+Task_N: spawn(fn -> Predict.forward(program, example_n) end) Task_N->>+Predict: Predict.forward(program, example_n) Predict->>+Client: request(messages) Client->>+Fuse: call(..., fn -> Req.post(...) end) Note over Fuse: API fails, Fuse trips! Fuse-->>-Client: {:error, :circuit_open} Client-->>-Predict: {:error, :circuit_open} Predict-->>-Task_N: {:error, reason} Task_N-->>-Evaluate: {:ok, failed_result} end Evaluate->>Evaluate: Aggregates all results (successes and failures) Evaluate-->>-User: Returns final evaluation score

4. Code Implementation Skeletons

This is how the modules will be structured.

DSPEx.Client - The Resilient API Caller
defmodule DSPEx.Client do
  use GenServer

  # --- Public API ---
  def start_link(opts) do
    # Name from opts, e.g., MyApp.OpenAIClient
    GenServer.start_link(__MODULE__, opts, name: opts[:name])
  end

  def request(client, messages, opts \\ %{}) do
    GenServer.call(client, {:request, messages, opts})
  end

  # --- GenServer Implementation ---
  @impl true
  def init(opts) do
    # config contains api_key, base_url, etc.
    config = Keyword.fetch!(opts, :config)
    # fuse_name allows per-provider circuit breakers
    fuse_name = :"fuse_#{opts[:name]}"
    {:ok, %{config: config, fuse_name: fuse_name}}
  end

  @impl true
  def handle_call({:request, messages, _opts}, _from, state) do
    # The core logic: wrap a Req call inside a Fuse call.
    # Fuse will automatically handle the circuit breaking logic.
    response =
      Fuse.call({:default, state.fuse_name}, fn ->
        # Use Req to build and execute the request.
        Req.post!(
          state.config.base_url,
          body: Jason.encode!(%{messages: messages, model: state.config.model}),
          headers: %{"Authorization" => "Bearer #{state.config.api_key}"}
        )
      end)

    # Process the response from Fuse
    reply =
      case response do
        {:ok, %Req.Response{status: 200, body: body}} ->
          {:ok, body}
        {:ok, %Req.Response{status: status, body: body}} ->
          {:error, {:http_error, status, body}}
        {:error, :circuit_open} ->
          {:error, :circuit_open}
        {:error, reason} ->
          {:error, {:fuse_error, reason}}
      end

    {:reply, reply, state}
  end
end
DSPEx.Evaluate - The Concurrent Engine
defmodule DSPEx.Evaluate do
  alias DSPEx.Program

  def run(program, devset, metric_fun, opts \\ []) do
    num_threads = Keyword.get(opts, :num_threads, System.schedulers_online() * 2)

    # Setup progress bar
    pbar = ProgressBar.new(total: Enum.count(devset))

    results =
      devset
      |> Task.async_stream(
        fn example ->
          # This is the work done by each concurrent process
          prediction = Program.forward(program, example.inputs)
          score = metric_fun.(example, prediction)
          ProgressBar.tick(pbar)
          {example, prediction, score}
        end,
        max_concurrency: num_threads,
        ordered: false,
        on_timeout: :kill_task
      )
      |> Enum.to_list() # Collect all results

    ProgressBar.done(pbar)

    # Aggregate scores from the results
    # ... logic to calculate average score, etc.
  end
end

5. Phased Rollout Plan

  1. Phase 1: DSPEx.Predict End-to-End

    • Implement DSPEx.Signature macro (minimal version).
    • Implement DSPEx.Client using Req and Fuse.
    • Implement DSPEx.Cache using Cachex.
    • Implement DSPEx.Predict to tie it all together.
    • Deliverable: A user can define a QA signature, configure a client, and get a single, resilient prediction.
  2. Phase 2: DSPEx.Evaluate and BootstrapFewShot

    • Implement DSPEx.Evaluate using Task.async_stream and progress_bar.
    • Implement DSPEx.ChainOfThought.
    • Implement the DSPEx.Teleprompter.BootstrapFewShot optimizer, which will use DSPEx.Evaluate as its core engine.
    • Deliverable: A user can now evaluate their programs for performance and run the first optimizer to automatically generate few-shot examples, closing the core DSPy loop.

This refocused plan is technically sound, demonstrates deep understanding of the Elixir ecosystem, and provides a clear, incremental path to delivering a powerful and valuable library. It directly addresses the community’s feedback by building on the ecosystem, not in spite of it.