← Back to 002 foundation enhancements

008 7of10

Documentation for 008_7of10 from the Ds ex repository.

Of course. Here is a detailed technical document for the seventh proposed feature enhancement: “Test Isolation & Mocking Infrastructure.”


Technical Specification: Test Isolation & Mocking Infrastructure

Document Version: 1.0 Author: AI Assistant Status: PROPOSED

1. Overview

This document specifies a set of enhancements to the foundation library to provide first-class support for test isolation and service mocking. The current ServiceRegistry supports namespacing with {:test, ref}, which is the correct primitive for isolation. This proposal builds upon that primitive to create a high-level, ergonomic API for developers to write clean, concurrent, and reliable tests for applications built on foundation.

The primary driver for this enhancement is the DSPEx project. Testing a DSPEx program requires replacing its “real” LM.Client (which makes paid API calls) with a “mock” client that returns predictable, canned responses. This must be done in a way that is safe for concurrent test execution, ensuring that one test’s mock does not interfere with another’s.

This enhancement will introduce a new Foundation.TestSupport module containing a with_mocked_service/3 macro, which will become the standard, idiomatic way to test service-dependent code.


2. Problem Statement & Use Case

A developer is writing tests for a DSPEx.Predict module. They have two tests that need to run concurrently:

  1. Test A: Tests the “happy path.” It needs the mocked LM.Client to return {:ok, %{"content" => "Paris"}}.
  2. Test B: Tests a failure path. It needs the mocked LM.Client to return {:error, :rate_limited}.

Without a proper isolation and mocking framework, this is difficult and dangerous to achieve:

  • Global Mocks: Using a library like Meck to globally swap the real LM.Client module with a mock creates a race condition. If Test A and Test B run at the same time, one test’s mock will overwrite the other’s, leading to flaky, unpredictable test failures.
  • Manual Setup: A developer could manually start a mock GenServer for each test and pass its pid directly to the DSPEx.Predict module. This works but is verbose and requires the developer to handle the setup and teardown boilerplate in every single test.

The goal is to provide a simple, safe, and declarative way to say: “for the duration of this test block only, whenever any code looks up the :openai_client service, give it my mock process instead.”


3. Proposed API and Architecture

3.1. Foundation.TestSupport.with_mocked_service/3 (New Macro)

This will be the primary public API for this feature. It’s a macro that provides a sandboxed execution context.

File: lib/foundation/test_support.ex

New Macro:

defmodule Foundation.TestSupport do
  @moduledoc "Utilities for testing Foundation-based applications."

  @doc """
  Executes a block of code with a service temporarily replaced by a mock.

  This macro ensures that any lookup for the specified `service_name` within
  the `do` block will resolve to the provided `mock_pid`. The mocking is
  isolated to the current test process and is safe for concurrent execution.

  ## Parameters
    - `service_tuple`: A `{namespace, service_name}` tuple identifying the service to mock.
    - `mock_pid`: The PID of the GenServer or process to use as the mock.
    - `block`: The block of code to execute within the mocked context.
  """
  defmacro with_mocked_service({namespace, service_name}, mock_pid, do: block) do
    quote do
      # 1. Get the unique test reference from the ExUnit context.
      test_ref = Foundation.TestSupport.get_test_ref()
      test_namespace = {:test, test_ref}
      
      # 2. Register the mock PID under the test's unique namespace.
      #    We register it with the *production* service name.
      Foundation.ServiceRegistry.register(test_namespace, unquote(service_name), unquote(mock_pid))
      
      # 3. Temporarily override the process dictionary to point to our test namespace.
      original_namespace = Foundation.ServiceRegistry.get_current_namespace()
      Foundation.ServiceRegistry.set_current_namespace(test_namespace)
      
      try
        # 4. Execute the user's test code.
        unquote(block)
      after
        # 5. Restore the original namespace and clean up.
        Foundation.ServiceRegistry.set_current_namespace(original_namespace)
        # The on_exit hook from setup_isolated_dspy_environment will handle the unregistering.
      end
    end
  end

  # Helper to get the test ref, assumes it's set in a setup block.
  def get_test_ref do
    Process.get(:current_test_ref) || raise "No test reference found. Did you use setup_isolated_dspy_environment()?"
  end
end
3.2. Supporting Changes in ServiceRegistry

To support the macro, ServiceRegistry needs two new functions to manage the “current” namespace for a given process.

File: foundation/service_registry.ex

defmodule Foundation.ServiceRegistry do
  # ... existing functions ...
  
  @doc "Sets the active namespace for the calling process."
  def set_current_namespace(namespace) do
    Process.put(:foundation_current_namespace, namespace)
  end

  @doc "Gets the active namespace for the calling process, defaulting to :production."
  def get_current_namespace do
    Process.get(:foundation_current_namespace, :production)
  end
  
  # The `lookup` function must be modified to use this new process-local value.
  @spec lookup(service_name :: atom()) :: {:ok, pid()} | {:error, Error.t()}
  def lookup(service_name) do
    # This function no longer takes a namespace; it discovers it.
    namespace = get_current_namespace()
    
    # First, try to look up in the current (potentially test) namespace.
    case ProcessRegistry.lookup(namespace, service_name) do
      {:ok, pid} ->
        {:ok, pid}
      
      :error when namespace != :production ->
        # If in a test namespace and not found, fall back to production namespace.
        # This allows mocking only a subset of services while others remain real.
        ProcessRegistry.lookup(:production, service_name)

      :error ->
        # Not found in production namespace either.
        {:error, create_service_not_found_error(namespace, service_name)}
    end
  end
end

Architectural Change: The lookup function is now “context-aware.” It first checks the process-local namespace (which with_mocked_service sets to a test-specific one) and then falls back to the :production namespace. This allows a test to mock just one service while still being able to access other real, shared services.

3.3. Standard Test Setup

A standard test setup block will be provided to make this seamless.

# in foundation/test_support.ex (or dspex/test/support.ex)

defmacro __using__(_opts) do
  quote do
    use ExUnit.Case, async: true
    import Foundation.TestSupport

    setup do
      test_ref = make_ref()
      Process.put(:current_test_ref, test_ref)
      
      # Optional: Start common services in the test namespace if needed.
      # start_supervised!({MyService, namespace: {:test, test_ref}})
      
      on_exit(fn ->
        # This is crucial for cleanup after each test.
        Foundation.ServiceRegistry.cleanup_test_namespace(test_ref)
      end)
      
      :ok
    end
  end
end

# Example usage in a test file
defmodule MyApp.MyTest do
  use MyApp.DataCase # This would 'use Foundation.TestSupport'
  
  test "..." do
    # ...
  end
end

4. Example DSPEx Test Case

Here is how a developer would write a test for a DSPEx program using this new infrastructure.

# in test/dspex/predict_test.exs

defmodule DSPEx.PredictTest do
  use DSPEx.TestCase # Assumes this uses the Foundation.TestSupport setup

  # Define a simple mock LM client
  defmodule MockLM do
    use GenServer
    def start_link(opts), do: GenServer.start_link(__MODULE__, opts, [])
    def init(opts), do: {:ok, opts}
    
    def handle_call({:request, _messages, _opts}, _from, state) do
      # Return a canned response based on the test's needs
      response = Keyword.get(state, :response, {:ok, %{"choices" => [%{"message" => %{"content" => "mocked answer"}}]}})
      {:reply, response, state}
    end
  end

  test "forward returns ok tuple on successful LM call" do
    # 1. Start a mock LM GenServer configured for this specific test case.
    {:ok, mock_lm_pid} = MockLM.start_link(response: {:ok, %{"content" => "Test Passed"}})
    
    # 2. Use the macro to sandbox the execution.
    #    We mock the :openai_client service which is registered in the :production namespace.
    with_mocked_service({:production, :openai_client}, mock_lm_pid) do
      # 3. Inside this block, any code that calls `ServiceRegistry.lookup(:openai_client)`
      #    will receive `mock_lm_pid`.
      
      # The Predict module is configured to look up the :openai_client by name.
      predict_program = %DSPEx.Predict{signature: MyTestSignature, client: :openai_client}
      
      # Run the program.
      result = DSPEx.Program.forward(predict_program, %{question: "test"})
      
      assert {:ok, %DSPEx.Prediction{answer: "Test Passed"}} = result
    end
  end
end

5. Conclusion

This enhancement provides a clean, safe, and ergonomic API for a fundamental requirement of application development: testing with mocks. By building this on top of the existing ServiceRegistry and its namespacing capabilities, we can offer powerful test isolation that works seamlessly with concurrent test runners.

For DSPEx, this is critical. It allows developers to:

  • Write fast, reliable unit tests for their programs without making real API calls.
  • Test complex failure and retry logic by having mocks return specific error tuples.
  • Run their entire test suite in parallel (mix test --async) with confidence that tests will not interfere with each other.

This feature elevates foundation from a set of production utilities to a complete application development framework that considers the entire development lifecycle, including testing.