The next logical step is a detailed technical extraction: scrutinizing the provided codebase and documentation to identify the specific implementation details, implicit logic, and external dependencies that are crucial for a successful port, and surfacing the key unanswered questions that must be resolved before implementation can begin.
The output is a technical planning document intended for the engineering team tasked with building DSPEx.
Technical Information Extraction & Unanswered Questions for the DSPEx Port
Objective
This document details the specific technical information required to implement DSPEx. It is based on an analysis of the provided DSPy source code (repomix-output.xml) and documentation. The goal is to move from high-level architecture to a concrete implementation checklist, highlighting areas that need further investigation.
Phase 1: Foundational Core
Component / Feature | Required Information / Decision | Source of Truth (in provided files) |
---|---|---|
`DSPEx.Signature` (`defsignature` Macro) | 1. Exact Type Parsing Logic: How does Python's `_parse_type_node` handle complex nested types (e.g., `list[dict[str, str]]`), `Literal`, and `Optional`? We must replicate the AST parsing logic to ensure compatibility. 2. Custom Type Registration: `dspy.Image` and `dspy.History` are special Pydantic models. How does the system discover and use them? We need a mechanism for users of DSPEx to register their own custom structs (e.g., `%MyApp.Image{}`) that the macro can resolve from type strings. 3. Field Attribute Mapping: `InputField` and `OutputField` accept `desc` and `prefix`. How are these precisely used by the Adapter layer to construct the final prompt? We need to map this to struct metadata (a `defsignature` sketch follows this table). | `signatures/signature.py` (`_parse_type_node`), `signatures/field.py`, `adapters/types/*` |
`DSPEx.Client.LM` | 1. `litellm` Payload Schema: What is the exact JSON structure `dspy.LM` passes to `litellm` for both chat and text completion models? This includes keys like `messages`, `model`, `temperature`, `max_tokens`, `n`, `stream`, etc. This schema must be replicated for our `Tesla`-based HTTP client (a payload sketch follows this table). 2. Text vs. Chat Model Formatting: The Python code has distinct logic for 'text' models (`litellm_text_completion`). It appears to concatenate user messages and append "BEGIN RESPONSE:". We must confirm and replicate this specific formatting rule. 3. Streaming Chunk Format: What is the data structure of a single chunk yielded by `litellm` when `stream=True`? Is it a simple string, or a JSON object with a delta? This is critical for implementing DSPEx streaming. | `clients/lm.py` (e.g., `litellm_completion`), `pyproject.toml` (shows `litellm` dependency) |
`DSPEx.Adapter` Layer | 1. Prompt Assembly Logic: The `ChatAdapter` assembles prompts with `[[ ## field_name ## ]]` markers. What is the exact order of assembly? The documentation suggests System Message -> Demos -> History -> Current Input. We need to trace the `format` method in `adapters/base.py` and `adapters/chat_adapter.py` to confirm the precise construction of the `messages` list (a field-marker formatting sketch follows this table). 2. JSON Repair Logic: The `JSONAdapter` uses the `json-repair` library. What is the scope of repair it performs? Does it handle missing quotes, trailing commas, etc.? We will need to find or write an equivalent Elixir library. 3. Structured Output Generation: `JSONAdapter` dynamically creates a Pydantic model (`_get_structured_outputs_response_format`). How can we replicate this in Elixir? Decision Needed: Should we use a library like `ExJsonSchema` to validate, or dynamically define and compile a temporary module with a struct definition for `Jason` to decode into? | `adapters/base.py`, `adapters/chat_adapter.py`, `adapters/json_adapter.py` |
`DSPEx.Evaluate` | 1. Failure Handling: When a single evaluation in the thread pool fails, how is it handled? Does it contribute a score of 0, or is it ignored? The `Evaluate` class mentions a `failure_score` parameter, confirming we need to handle this. The `ParallelExecutor` also has a `max_errors` setting. We need to replicate this behavior (a failure-handling sketch follows this table). | `evaluate/evaluate.py`, `utils/parallelizer.py` |
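
To make the field-metadata question concrete, here is a minimal sketch of how a DSPEx signature might be declared, assuming a hypothetical `defsignature` DSL in which `desc` and `prefix` are stored as module metadata for the Adapter layer to read. Every name and option shown is an assumption, not a finalized API.

```elixir
defmodule MyApp.Signatures.AnswerQuestion do
  use DSPEx.Signature

  # Hypothetical DSL: each field declaration compiles into metadata that the
  # Adapter layer can read when assembling prompts.
  defsignature do
    input :question, :string, desc: "The user's question", prefix: "Question:"
    input :context, {:list, :string}, desc: "Relevant passages"
    output :answer, :string, desc: "A short factual answer", prefix: "Answer:"
  end
end

# The macro could then expose the parsed metadata, e.g.:
#   MyApp.Signatures.AnswerQuestion.__fields__()
#   #=> [%{name: :question, role: :input, type: :string,
#          desc: "The user's question", prefix: "Question:"}, ...]
```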
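For the `litellm` payload question, a hedged sketch of what the `Tesla`-based chat call could look like. The payload keys (`messages`, `model`, `temperature`, `max_tokens`, `n`, `stream`) are the ones listed above; the module name, endpoint path, and defaults are placeholders until the real schema is extracted from `clients/lm.py`.

```elixir
defmodule DSPEx.Client.LM.HTTP do
  @moduledoc "Sketch of an OpenAI-compatible chat completion call via Tesla; names and defaults are placeholders."

  def client(base_url, api_key) do
    Tesla.client([
      {Tesla.Middleware.BaseUrl, base_url},
      {Tesla.Middleware.Headers, [{"authorization", "Bearer " <> api_key}]},
      Tesla.Middleware.JSON
    ])
  end

  def chat_completion(client, model, messages, opts \\ []) do
    # Keys mirror what dspy.LM is described as passing through litellm; the
    # full schema still has to be confirmed against clients/lm.py.
    payload = %{
      model: model,
      messages: messages, # e.g. [%{role: "user", content: "..."}]
      temperature: Keyword.get(opts, :temperature, 0.0),
      max_tokens: Keyword.get(opts, :max_tokens, 1_000),
      n: Keyword.get(opts, :n, 1),
      stream: Keyword.get(opts, :stream, false)
    }

    Tesla.post(client, "/chat/completions", payload)
  end
end
```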
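The `[[ ## field_name ## ]]` delimiter is taken from the table above; the surrounding assembly order (system message, demos, history, current input) still has to be traced. A small sketch of rendering fields into that marker format (module and function names are assumptions):

```elixir
defmodule DSPEx.Adapter.Chat do
  # Renders field values into the "[[ ## field_name ## ]]" sections used by
  # DSPy's ChatAdapter. The ordering of system message, demos, history, and
  # current input must still be confirmed from adapters/chat_adapter.py
  # before a full `format/3` equivalent is built.
  def format_fields(fields) when is_list(fields) do
    Enum.map_join(fields, "\n\n", fn {name, value} ->
      "[[ ## #{name} ## ]]\n#{value}"
    end)
  end
end

# DSPEx.Adapter.Chat.format_fields(question: "What is OTP?", context: "...")
# #=> "[[ ## question ## ]]\nWhat is OTP?\n\n[[ ## context ## ]]\n..."
```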
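For failure handling in evaluation, a sketch of the likely Elixir shape, using `Task.async_stream` in place of DSPy's thread pool and assuming (pending confirmation) that a failed example contributes `failure_score`:

```elixir
defmodule DSPEx.Evaluate do
  # Sketch: run `program` over `examples` in parallel and average the metric.
  # Whether a crashed example should count as `failure_score` or be dropped
  # must be confirmed against evaluate/evaluate.py and utils/parallelizer.py.
  def run(program, examples, metric, opts \\ []) do
    failure_score = Keyword.get(opts, :failure_score, 0.0)
    concurrency = Keyword.get(opts, :max_concurrency, System.schedulers_online())

    scores =
      examples
      |> Task.async_stream(
        fn example ->
          try do
            metric.(example, program.(example))
          rescue
            _ -> failure_score
          end
        end,
        max_concurrency: concurrency,
        on_timeout: :kill_task
      )
      |> Enum.map(fn
        {:ok, score} -> score
        {:exit, _reason} -> failure_score
      end)

    Enum.sum(scores) / max(length(scores), 1)
  end
end
```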
Phase 2: Core Optimization Logic (Teleprompters)
Component / Feature | Required Information / Decision | Source of Truth (in provided files) |
---|---|---|
`DSPEx.Teleprompter` (General) | 1. Program State Management: Optimizers `deepcopy` student programs. What is the exact state that is copied versus reset? The `reset_copy` method is mentioned. We need to define the state boundaries for our `GenServer`-based optimizers. | `primitives/module.py` (`deepcopy`, `reset_copy`), `teleprompt/bootstrap.py` |
`BootstrapFewShot` | 1. Trace Data Structure: The metric function receives a `trace`. What is its exact structure? It appears to be a list of `(predictor, inputs, outputs)` tuples. We need the precise schema of each element to build our own tracing mechanism, likely via a dedicated process or ETS table (a trace-collection sketch follows this table). 2. Teacher/Student Logic: The docs mention that the teacher can be a different program. What are the rules for "structural equivalence" (`assert_structural_equivalency`) that allow a teacher to be used with a student? | `teleprompt/bootstrap_finetune.py` (`build_call_data_from_trace`, `assert_structural_equivalency`), `teleprompt/bootstrap.py` |
`MIPROv2` (Bayesian Optimization) | 1. Critical Dependency - `optuna`: This optimizer relies heavily on the `optuna` library for Bayesian Optimization, specifically `TPESampler`. This is a major blocker. Decision Needed: (a) find a native Elixir library for Bayesian Optimization (unlikely to exist with the same features); (b) implement a simplified version natively (high effort, high risk); (c) use Python interop via a `Port` to call an `optuna` script. Option (c) is the most pragmatic approach but introduces inter-process communication complexity (a `Port` interop sketch follows this table). 2. Hyperparameter Defaults: The `auto="light"` setting configures many hyperparameters. We need to extract the exact values for `num_trials`, `minibatch_size`, etc., from the `_set_hyperparameters` method. | `teleprompt/mipro_optimizer_v2.py`, `pyproject.toml` (shows `optuna` dependency) |
`BootstrapFinetune` | 1. Finetuning Data Format: What is the exact JSONL format generated by `BootstrapFinetune`? The `format_finetune_data` method in the `ChatAdapter` and the `_prepare_finetune_data` function are the sources of truth. We must replicate this format precisely for our Python interop script. 2. Provider-Specific Logic: The finetuning process for different providers (e.g., `databricks`, `openai`) involves different API calls and data upload mechanisms. The DSPEx port will need a `Provider` behaviour with callbacks like `upload_data/2` and `start_training/3` that provider-specific modules will implement, likely by calling out to a Python script via a `Port` (a behaviour sketch follows this table). | `teleprompt/bootstrap_finetune.py`, `clients/databricks.py`, `clients/openai.py`, `adapters/chat_adapter.py` |
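
Until the exact trace schema is extracted, a sketch of how DSPEx could collect `{predictor, inputs, outputs}` tuples in an ETS table during a forward pass. Table name, key scheme, and API are placeholders.

```elixir
defmodule DSPEx.Trace do
  @table :dspex_trace

  # Sketch: collect {predictor, inputs, outputs} tuples per run, mirroring the
  # list-of-tuples structure the DSPy metric appears to receive. The precise
  # element schema must be confirmed from build_call_data_from_trace.

  def start do
    :ets.new(@table, [:named_table, :public, :duplicate_bag])
  end

  def record(run_id, predictor, inputs, outputs) do
    :ets.insert(@table, {run_id, predictor, inputs, outputs})
    :ok
  end

  def for_run(run_id) do
    for {_run_id, predictor, inputs, outputs} <- :ets.lookup(@table, run_id) do
      {predictor, inputs, outputs}
    end
  end
end
```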
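If option (c) is chosen for `MIPROv2`, the Elixir side would drive a small Python script wrapping `optuna` over a `Port`. A minimal sketch, assuming 4-byte length-prefixed JSON frames and a hypothetical script path; none of these details are settled.

```elixir
defmodule DSPEx.Optimizer.OptunaBridge do
  # Sketch of Port-based interop with a Python script wrapping optuna's
  # TPESampler. The 4-byte length-prefixed JSON framing and the script path
  # are assumptions, not the final protocol.

  def open(script_path \\ "priv/python/optuna_bridge.py") do
    Port.open({:spawn_executable, System.find_executable("python3")}, [
      :binary,
      packet: 4,
      args: [script_path]
    ])
  end

  def suggest(port, trial_request) when is_map(trial_request) do
    Port.command(port, Jason.encode!(trial_request))

    receive do
      {^port, {:data, payload}} -> {:ok, Jason.decode!(payload)}
    after
      30_000 -> {:error, :timeout}
    end
  end
end
```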
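The `Provider` behaviour proposed for `BootstrapFinetune` could start as small as the following; the callback names and arities come from the table above, while the types and return shapes are assumptions.

```elixir
defmodule DSPEx.Finetune.Provider do
  @moduledoc """
  Sketch of the provider behaviour described above. Concrete modules (e.g. an
  OpenAI or Databricks provider) would implement these callbacks, most likely
  by delegating to a Python script over a Port. Types are assumptions.
  """

  @typedoc "Opaque provider-specific training-job handle"
  @type job :: term()

  @callback upload_data(jsonl_path :: Path.t(), opts :: keyword()) ::
              {:ok, file_id :: String.t()} | {:error, term()}

  @callback start_training(file_id :: String.t(), base_model :: String.t(), opts :: keyword()) ::
              {:ok, job()} | {:error, term()}
end
```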
Phase 3: Advanced Modules & Features
Component / Feature | Required Information / Decision | Source of Truth (in provided files) |
---|---|---|
`CodeAct` / `ProgramOfThought` (Code Sandbox) | 1. Sandbox IPC Protocol: The `PythonInterpreter` communicates with a Deno process (`runner.js`) via stdin/stdout. We need the exact JSON schema for this communication. What does the request (`{"code": "..."}`) contain? What are all possible response formats (`{"output": ...}`, `{"error": ..., "errorType": ...}`)? (An IPC sketch follows this table.) 2. Tool Serialization: How are tools (Python functions) passed into the sandbox? The `_inject_variables` function seems to be the key. Does it serialize the function's source code using `inspect.getsource`? The documentation notes that `CodeAct` only accepts pure functions, which supports this hypothesis. This must be confirmed. | `primitives/python_interpreter.py`, `primitives/runner.js`, `predict/code_act.py` |
Assertions | 1. Dynamic Signature Modification: The `backtrack_handler` adds `feedback` and `past_outputs` fields to the signature. What are the exact contents and types of these fields? For `past_outputs`, is it a single value or a map of all output fields from the failed attempt? Replicating this requires knowing the precise data passed in the retry mechanism. 2. Freezing Modules: The documentation mentions freezing modules by setting `_compiled = True`. How does this interact with assertions? Does a frozen module still backtrack? | `primitives/assertions.py` (commented out but contains logic), `learn/programming/7-assertions.md`, `faqs.md` |
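
Until `runner.js` is fully reverse-engineered, a sketch of the Elixir side of the sandbox IPC, assuming newline-delimited JSON over stdin/stdout of a long-running Deno process. The `{"code": ...}` request and `{"output": ...}` / `{"error": ..., "errorType": ...}` response shapes mirror the ones named in the table above but remain unconfirmed.

```elixir
defmodule DSPEx.Sandbox.Deno do
  # Sketch: talk to a long-lived `deno run runner.js` process over stdin/stdout.
  # Request/response shapes mirror those named in the table but are unconfirmed;
  # the script path is a placeholder.

  def open(runner_js \\ "priv/sandbox/runner.js") do
    Port.open({:spawn_executable, System.find_executable("deno")}, [
      :binary,
      line: 1_048_576,
      args: ["run", runner_js]
    ])
  end

  def exec(port, code, timeout \\ 10_000) do
    Port.command(port, Jason.encode!(%{"code" => code}) <> "\n")

    receive do
      {^port, {:data, {:eol, line}}} ->
        case Jason.decode!(line) do
          %{"error" => msg, "errorType" => type} -> {:error, {type, msg}}
          %{"output" => output} -> {:ok, output}
        end
    after
      timeout -> {:error, :timeout}
    end
  end
end
```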
Phase 4: Productionization & Developer Experience
Component / Feature | Required Information / Decision | Source of Truth (in provided files) |
---|---|---|
Saving & Loading | 1. `cloudpickle` Usage: What specific objects are being pickled? Is it the entire module instance, or just the state dictionary returned by `dump_state`? The `save` method on `BaseModule` has a `save_program` flag which controls this. We need to know which path uses `cloudpickle` to replicate it with `:erlang.term_to_binary` (a save/load sketch follows this table). 2. Dependency Versioning: The `saving.py` module stores dependency versions in the metadata. What is the full list of tracked dependencies? We should replicate this to ensure compatibility. | `primitives/module.py` (`save`), `utils/saving.py` |
Caching | 1. Cache Key Generation: The `cache.py` module's `cache_key` function defines the key generation logic. It ignores certain arguments (`api_key`, `base_url`). We must replicate this hashing algorithm and the list of ignored keys exactly if cross-language cache compatibility is desired (even if unlikely). A cache-key sketch follows this table. | `clients/cache.py` |
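
A sketch of the state-only save path with `:erlang.term_to_binary/1`, recording version metadata in the spirit of `utils/saving.py`. Which objects are actually persisted, and the exact tracked-dependency list, still have to be confirmed; every field below is a placeholder.

```elixir
defmodule DSPEx.Persistence do
  # Sketch of a state-only save/load path using term_to_binary. Whether DSPy
  # pickles whole modules or just dump_state output must be confirmed, and the
  # metadata fields below stand in for the real tracked-dependency list.

  def save(state, path) do
    metadata = %{
      elixir_version: System.version(),
      otp_release: System.otp_release(),
      saved_at: DateTime.utc_now() |> DateTime.to_iso8601()
    }

    File.write(path, :erlang.term_to_binary(%{metadata: metadata, state: state}))
  end

  def load(path) do
    with {:ok, binary} <- File.read(path) do
      # :safe refuses terms that would create new atoms or external functions.
      {:ok, :erlang.binary_to_term(binary, [:safe])}
    end
  end
end
```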
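For the cache-key question, a sketch of the Elixir-side shape: drop the ignored keys, then hash a deterministic encoding of what remains. The ignored-key list and the hashing scheme must ultimately be copied from `clients/cache.py` if byte-level compatibility is ever wanted; the SHA-256-over-`term_to_binary` choice here is an assumption.

```elixir
defmodule DSPEx.Cache.Key do
  # Sketch: drop keys that must not affect the cache key, then hash a
  # deterministic encoding of the rest. The ignored-key list and hashing
  # scheme must be copied from clients/cache.py for true compatibility.
  @ignored_keys ["api_key", "base_url"]

  def cache_key(request) when is_map(request) do
    request
    |> Map.drop(@ignored_keys)
    |> Enum.sort()
    |> :erlang.term_to_binary()
    |> then(&:crypto.hash(:sha256, &1))
    |> Base.encode16(case: :lower)
  end
end
```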
Summary of Critical Blockers & Dependencies
- Bayesian Optimization (`optuna`): `MIPROv2` is a flagship optimizer and is currently blocked by the lack of a native Elixir equivalent for `optuna`. Decision: Proceed with Python interop via a `Port` as the initial strategy.
- Finetuning (`transformers`, `sglang`): `BootstrapFinetune` and local model serving rely on the vast Python ML ecosystem. Decision: This will be handled entirely via `Port`-based Python interop. The DSPEx side will be responsible for data preparation and job orchestration.
- Local Vector Search (`faiss`, `ragatouille`): Similar to finetuning, local retrieval models are Python-native. Decision: Isolate these into RM client modules that use `Port` interop.
- Code Execution Sandbox (`deno`): The `CodeAct` module's sandbox is a complex JS/Deno/Python system. Decision: Replicate the Inter-Process Communication (IPC) protocol via a `Port` to a long-running Deno process.
The porting effort can proceed in parallel with the investigation of these blockers, but the designs for Phases 2 and 3 are fundamentally dependent on the interop strategy chosen.