Task: LLM.3 - HTTP Adapter Implementation
Context
You are implementing the HTTP adapter for DSPex, which provides direct HTTP communication with LLM APIs. The adapter is optimized for simple completions, where per-request overhead must stay minimal (NFR.1 budgets <50ms).
Required Reading
1. Existing HTTP Adapter
- File: /home/home/p/g/n/dspex/lib/dspex/llm/adapters/http.ex
- Review the current implementation approach
- Note the HTTP client in use (Finch, HTTPoison, etc.)
2. LLM Adapter Protocol
- File: /home/home/p/g/n/dspex/docs/specs/immediate_implementation/prompts/LLM.1_adapter_protocol.md
- Review protocol requirements
- Focus on simple string generation
3. Adaptive LLM Architecture
- File: /home/home/p/g/n/dspex/docs/specs/dspex_cognitive_orchestration/02_CORE_COMPONENTS_DETAILED.md
- Section: "Component 4: Adaptive LLM Architecture"
- Note when the HTTP adapter is preferred
4. Success Criteria
- File: /home/home/p/g/n/dspex/docs/specs/dspex_cognitive_orchestration/06_SUCCESS_CRITERIA.md
- Section: "Stage 6: Adaptive LLM Architecture"
- HTTP adapter selection scenarios
5. Requirements
- File: /home/home/p/g/n/dspex/docs/specs/immediate_implementation/REQUIREMENTS.md
- NFR.1: Performance requirements (<50ms overhead)
- IR.2: LLM provider integration patterns
Implementation Requirements
Adapter Structure
defmodule DSPex.LLM.Adapters.HTTP do
  @behaviour DSPex.LLM.Adapter

  defstruct [
    :base_url,     # Provider API root, e.g. "https://api.openai.com/v1"
    :headers,      # Auth and content-type headers
    :timeout,      # Per-request timeout in milliseconds
    :pool_config,  # Finch pool sizing options
    :retry_config, # Backoff/retry policy
    :format        # One of @formats below
  ]

  # Supported request/response formats
  @formats [:openai, :anthropic, :google, :custom]
end
Provider Configurations
# OpenAI format
%{
  base_url: "https://api.openai.com/v1",
  endpoint: "/chat/completions",
  headers: %{
    "Authorization" => "Bearer #{api_key}",
    "Content-Type" => "application/json"
  },
  format: :openai
}

# Anthropic format
%{
  base_url: "https://api.anthropic.com/v1",
  endpoint: "/messages",
  headers: %{
    "x-api-key" => api_key,
    "anthropic-version" => "2023-06-01"
  },
  format: :anthropic
}

# Custom format
%{
  base_url: "http://localhost:8000",
  endpoint: "/generate",
  format: :custom,
  request_builder: &build_custom_request/2,
  response_parser: &parse_custom_response/1
}
Connection Pooling
# Using Finch for connection pooling
def start_link(config) do
  pool_config = [
    size: config[:pool_size] || 10,   # connections per pool
    count: config[:pool_count] || 2,  # number of pools
    protocol: :http2,
    # conn_opts are passed through to Mint; the connect timeout
    # lives under :transport_opts
    conn_opts: [
      transport_opts: [timeout: config[:connect_timeout] || 5_000]
    ]
  ]

  Finch.start_link(
    name: __MODULE__,
    pools: %{
      default: pool_config
    }
  )
end
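To make the pool available at startup, the adapter would typically sit in the application's supervision tree. A minimal sketch, assuming a child_spec/1 that delegates to the start_link/1 above (the supervisor name is hypothetical):

# In application.ex — illustrative wiring only
children = [
  {DSPex.LLM.Adapters.HTTP, pool_size: 20, connect_timeout: 3_000}
]

Supervisor.start_link(children, strategy: :one_for_one, name: DSPex.Supervisor)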
Request Building
defp build_request(:openai, prompt, opts) do
  %{
    model: opts[:model] || "gpt-3.5-turbo",
    messages: format_messages(prompt, opts),
    temperature: opts[:temperature] || 0.7,
    max_tokens: opts[:max_tokens],
    stream: opts[:stream] || false
  }
end

defp build_request(:anthropic, prompt, opts) do
  %{
    # Anthropic model IDs carry a date suffix
    model: opts[:model] || "claude-3-sonnet-20240229",
    messages: format_messages(prompt, opts),
    # max_tokens is required by the Anthropic Messages API
    max_tokens: opts[:max_tokens] || 1024
  }
end
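Both clauses rely on a format_messages/2 helper that is not shown. A minimal sketch, assuming the prompt arrives as either a plain string or a pre-built message list (note that Anthropic takes the system prompt as a top-level parameter rather than a message role, so a real implementation would branch on format):

# Hypothetical helper: wraps a bare string as a single user message,
# optionally prepending a system prompt; passes message lists through
defp format_messages(prompt, opts) when is_binary(prompt) do
  case opts[:system] do
    nil -> [%{role: "user", content: prompt}]
    system -> [%{role: "system", content: system}, %{role: "user", content: prompt}]
  end
end

defp format_messages(messages, _opts) when is_list(messages), do: messages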
Streaming Support
def stream(adapter, prompt, opts) do
  request = build_streaming_request(adapter, prompt, opts)

  Stream.resource(
    # Finch has no pull-based stream_next/1; async_request/3 delivers
    # chunks as messages to the calling process instead
    fn -> Finch.async_request(request, __MODULE__) end,
    fn ref -> receive_chunk(ref) end,
    fn ref -> Finch.cancel_async_request(ref) end
  )
end

defp receive_chunk(ref) do
  receive do
    {^ref, {:data, chunk}} -> {[parse_sse_chunk(chunk)], ref}
    {^ref, {:status, _status}} -> {[], ref}
    {^ref, {:headers, _headers}} -> {[], ref}
    {^ref, :done} -> {:halt, ref}
    {^ref, {:error, reason}} -> raise "Streaming error: #{inspect(reason)}"
  end
end
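parse_sse_chunk/1 is left undefined above. A minimal sketch for OpenAI-style SSE payloads; a real implementation must also buffer partial lines that straddle chunk boundaries:

# Hypothetical SSE parser for OpenAI-style "data: {json}" lines;
# skips the terminal "data: [DONE]" sentinel
defp parse_sse_chunk(chunk) do
  chunk
  |> String.split("\n", trim: true)
  |> Enum.flat_map(fn
    "data: [DONE]" ->
      []

    "data: " <> json ->
      json
      |> Jason.decode!()
      |> get_in(["choices", Access.at(0), "delta", "content"])
      |> List.wrap()

    _other ->
      []
  end)
  |> Enum.join()
end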
Acceptance Criteria
- Implements all adapter protocol functions
- Supports OpenAI, Anthropic, and Google formats
- Allows custom format configuration
- Connection pooling with Finch or similar
- Streaming support for compatible endpoints
- Retry logic with exponential backoff
- Request/response logging (configurable)
- Performance: <50ms overhead for simple requests
- Timeout handling with clear errors
- Rate-limit awareness (surface 429 Retry-After to callers)
Error Handling
case Finch.request(request, __MODULE__) do
  {:ok, %{status: 200, body: body}} ->
    parse_response(adapter.format, body)

  {:ok, %{status: 429, headers: headers}} ->
    # Surface the provider's Retry-After hint so callers can back off
    retry_after = get_retry_after(headers)
    {:error, {:rate_limited, retry_after}}

  {:ok, %{status: status, body: body}} ->
    {:error, {:api_error, status, parse_error(body)}}

  {:error, %Mint.TransportError{reason: :timeout}} ->
    {:error, :timeout}

  {:error, reason} ->
    {:error, {:connection_error, reason}}
end
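The acceptance criteria call for retry with exponential backoff, which the snippet above does not show. A minimal sketch, assuming a do_request/1 that returns the tuples above (function names are illustrative):

# Hypothetical retry wrapper: retries transient failures with
# exponential backoff plus jitter; gives up after max_retries
defp request_with_retry(request, attempt \\ 0, max_retries \\ 3) do
  case do_request(request) do
    {:error, :timeout} when attempt < max_retries ->
      backoff(attempt)
      request_with_retry(request, attempt + 1, max_retries)

    {:error, {:connection_error, _}} when attempt < max_retries ->
      backoff(attempt)
      request_with_retry(request, attempt + 1, max_retries)

    {:error, {:rate_limited, retry_after}} when attempt < max_retries ->
      # Honor the server's hint when present, otherwise back off
      Process.sleep(retry_after || Integer.pow(2, attempt) * 1_000)
      request_with_retry(request, attempt + 1, max_retries)

    other ->
      other
  end
end

# Exponential backoff with jitter: ~100ms, ~200ms, ~400ms, ...
defp backoff(attempt) do
  Process.sleep(Integer.pow(2, attempt) * 100 + :rand.uniform(100))
end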
Testing Requirements
Create tests in:
test/dspex/llm/adapters/http_test.exs
Test scenarios:
- Successful completion requests
- Streaming responses
- Various error conditions (timeout, rate limit, API errors)
- Connection pool behavior
- Retry logic
- Different provider formats
Use Bypass or similar for HTTP mocking.
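A minimal Bypass-based sketch for the happy path; the response body and the assumption that generate/3 returns a plain string follow the protocol's "simple string generation" focus and are illustrative:

test "successful completion against a mocked OpenAI endpoint" do
  bypass = Bypass.open()

  Bypass.expect_once(bypass, "POST", "/chat/completions", fn conn ->
    body = Jason.encode!(%{choices: [%{message: %{content: "Paris"}}]})
    Plug.Conn.resp(conn, 200, body)
  end)

  adapter = %DSPex.LLM.Adapters.HTTP{
    base_url: "http://localhost:#{bypass.port}",
    headers: %{"Content-Type" => "application/json"},
    format: :openai
  }

  assert {:ok, response} =
           DSPex.LLM.Adapter.generate(adapter, "What is the capital of France?", [])

  assert response =~ "Paris"
end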
Example Usage
# Simple completion
adapter = %DSPex.LLM.Adapters.HTTP{
  base_url: "https://api.openai.com/v1",
  headers: %{"Authorization" => "Bearer sk-..."},
  format: :openai
}

{:ok, response} = DSPex.LLM.Adapter.generate(
  adapter,
  "What is the capital of France?",
  model: "gpt-3.5-turbo",
  max_tokens: 50
)

# Streaming
{:ok, stream} = DSPex.LLM.Adapter.stream(
  adapter,
  "Tell me a story",
  model: "gpt-4",
  max_tokens: 1000
)

Enum.each(stream, fn chunk ->
  IO.write(chunk)
end)
Dependencies
- Requires LLM.1 (Adapter Protocol) complete
- HTTP client library (Finch recommended)
- Jason for JSON parsing
- Bypass for testing (optional)
Time Estimate
6 hours total:
- 2 hours: Core HTTP implementation with pooling
- 1 hour: Provider format support
- 1 hour: Streaming implementation
- 1 hour: Error handling and retries
- 1 hour: Comprehensive testing
Notes
- Optimize for low latency
- Consider caching DNS lookups
- Implement proper SSL/TLS configuration
- Add telemetry for request tracking (see the sketch after this list)
- Consider circuit breaker for reliability
- Log requests/responses for debugging (with PII filtering)
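For the telemetry note above, a minimal sketch using :telemetry.span/3; the event name and metadata shape are illustrative:

# Hypothetical instrumentation: emits [..., :start]/[..., :stop]/[..., :exception]
# events with request metadata, useful for verifying the <50ms overhead budget
defp instrumented_request(request, adapter) do
  :telemetry.span(
    [:dspex, :llm, :http, :request],
    %{format: adapter.format, base_url: adapter.base_url},
    fn ->
      result = Finch.request(request, __MODULE__)
      {result, %{format: adapter.format}}
    end
  )
end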