
145 PYDANTIC USE IN DSPY

Documentation for 145_PYDANTIC_USE_IN_DSPY from the Ds ex repository.

This document lists the ways Pydantic is used throughout the DSPy framework, based on an analysis of the DSPy codebase.

Core Framework Architecture

1. Signature System (Primary Usage)

  • Base Signature Class: Uses Pydantic’s BaseModel as the foundation for DSPy signatures
  • Field Definition: Uses FieldInfo from Pydantic for input/output field definitions
  • Metaclass Integration: SignatureMeta extends Pydantic’s metaclass to handle signature creation
  • Field Validation: Leverages Pydantic’s validation system for signature fields
  • JSON Schema Generation: Uses model_json_schema() for generating field schemas
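
The field-introspection idea behind these bullets can be sketched with plain Pydantic. This is a minimal sketch, not DSPy's implementation: `QASketch`, the `role` key, and `split_fields` are hypothetical names chosen for illustration. Only `BaseModel`, `Field`, `model_fields`, `json_schema_extra`, and `model_json_schema()` are Pydantic v2 APIs.

```python
from pydantic import BaseModel, Field


class QASketch(BaseModel):
    """Hypothetical stand-in for a signature: one input field, one output field."""

    # The "role" key is an assumption made for this sketch.
    question: str = Field(description="User question", json_schema_extra={"role": "input"})
    answer: str = Field(description="Model answer", json_schema_extra={"role": "output"})


def split_fields(model: type[BaseModel]) -> tuple[list[str], list[str]]:
    """Partition field names by the role stored in each FieldInfo's json_schema_extra."""
    inputs, outputs = [], []
    for name, info in model.model_fields.items():
        extra = info.json_schema_extra or {}
        (inputs if extra.get("role") == "input" else outputs).append(name)
    return inputs, outputs


inputs, outputs = split_fields(QASketch)
schema = QASketch.model_json_schema()
```

Here `inputs` holds the names tagged as inputs and `schema` is an ordinary JSON Schema dictionary, which is the kind of object the bullets describe DSPy deriving from its signatures.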

2. Custom Types System

  • BaseType Class: Parent class for custom types (Image, Audio, etc.) inherits from pydantic.BaseModel
  • Model Serialization: Uses @pydantic.model_serializer() decorator for custom serialization
  • Model Validation: Uses @pydantic.model_validator(mode="before") for input validation
  • Field Annotation: Uses Pydantic’s annotation system for type checking

Specific Custom Type Implementations

3. Image Type

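import pydantic

# BaseType is DSPy's pydantic.BaseModel subclass for custom types (see section 2).
class Image(BaseType):
    url: str
    model_config = {"arbitrary_types_allowed": True}

    @pydantic.model_validator(mode="before")
    @classmethod
    def validate_input(cls, values):
        # Custom validation logic (elided in this excerpt)
        return values  # a "before" validator must return the data to validate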

4. Audio Type

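from typing import Any

import pydantic

class Audio(BaseType):
    data: str
    audio_format: str
    model_config = {"arbitrary_types_allowed": True}

    @pydantic.model_validator(mode="before")
    @classmethod
    def validate_input(cls, values: Any) -> Any:
        # Audio-specific validation (elided in this excerpt)
        return values  # a "before" validator must return the data to validate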

5. Tool System

  • Tool Class: Inherits from BaseType (which inherits from BaseModel)
  • Dynamic Model Creation: Uses pydantic.create_model() for runtime model generation
  • Schema Resolution: Uses Pydantic’s JSON schema system for tool argument validation
  • Type Coercion: Leverages Pydantic’s type conversion capabilities
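
One way to picture the dynamic model creation and type coercion described here is to derive an argument model from a function signature. The helper `args_model_for` and the `add` function are hypothetical. The sketch only assumes Pydantic v2's `create_model` and the standard `inspect` module, and it is not a copy of DSPy's `Tool` code.

```python
import inspect
from typing import Any, Callable

from pydantic import BaseModel, create_model


def args_model_for(func: Callable[..., Any]) -> type[BaseModel]:
    """Build a Pydantic model whose fields mirror the function's parameters."""
    fields: dict[str, Any] = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = Any if param.annotation is inspect.Parameter.empty else param.annotation
        # Ellipsis marks a field as required in create_model.
        default = ... if param.default is inspect.Parameter.empty else param.default
        fields[name] = (annotation, default)
    return create_model(f"{func.__name__}_args", **fields)


def add(a: int, b: float = 1.5) -> float:
    return a + b


AddArgs = args_model_for(add)
parsed = AddArgs.model_validate({"a": "2"})  # the string "2" is coerced to int in lax mode
result = add(**parsed.model_dump())
```

The generated model also yields a JSON schema via `AddArgs.model_json_schema()`, which is the kind of description a tool-calling API typically expects.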

6. History Type

from typing import Any

import pydantic

class History(pydantic.BaseModel):
    messages: list[dict[str, Any]]
    model_config = {"arbitrary_types_allowed": True}

Configuration and Validation

7. Field Definition System

  • InputField/OutputField: Built on top of Pydantic’s Field() function
  • Field Constraints: Uses Pydantic’s constraint system (min_length, max_length, etc.)
  • Field Metadata: Leverages json_schema_extra for DSPy-specific metadata
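
A thin wrapper over `Field()` is enough to show how constraints and framework-specific metadata can travel together. The helper `input_field_sketch` and the `desc`/`kind` keys are assumptions for illustration and do not reproduce DSPy's `InputField`.

```python
from typing import Any

from pydantic import BaseModel, Field, ValidationError


def input_field_sketch(desc: str, **constraints: Any) -> Any:
    """Hypothetical wrapper: forwards constraints to Field and stores metadata in json_schema_extra."""
    return Field(json_schema_extra={"desc": desc, "kind": "input"}, **constraints)


class QuerySketch(BaseModel):
    text: str = input_field_sketch("Search query", min_length=3, max_length=20)


ok = QuerySketch(text="pydantic")

try:
    QuerySketch(text="hi")  # shorter than min_length
    too_short_rejected = False
except ValidationError:
    too_short_rejected = True

metadata = QuerySketch.model_fields["text"].json_schema_extra
```

Keeping the extra metadata in `json_schema_extra` means Pydantic still enforces the constraints, while the framework can read its own keys back from the field definition.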

8. Structured Output Generation

  • JSON Schema Creation: Uses model_json_schema() for OpenAI structured outputs
  • Schema Enforcement: Ensures all objects have required fields for structured outputs
  • Dynamic Model Creation: Uses pydantic.create_model() for runtime signature creation
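
The effect of forbidding extra fields on the generated schema can be seen in a small sketch. `StructuredOutputSketch` and its two fields are hypothetical. The point is only that a model created with `extra="forbid"` emits a closed object schema in which every field without a default is required, which is the shape strict structured-output APIs generally ask for.

```python
from pydantic import ConfigDict, create_model

# Hypothetical output model built at runtime.
StructuredOutputSketch = create_model(
    "StructuredOutputSketch",
    __config__=ConfigDict(extra="forbid"),
    answer=(str, ...),
    confidence=(float, ...),
)

schema = StructuredOutputSketch.model_json_schema()
closed = schema["additionalProperties"] is False
required = sorted(schema["required"])
```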

9. Teleprompter Configuration

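from typing import Optional

from pydantic import BaseModel, model_validator

class SynthesizerArguments(BaseModel):
    feedback_mode: Optional[str] = None
    num_example_for_feedback: Optional[int] = None
    # ... other configuration fields

    @model_validator(mode="after")
    def validate_feedback_mode(self):
        # Validation logic (elided in this excerpt)
        return self  # an "after" validator must return the model instance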

Data Processing and Serialization

10. Model State Management

  • State Serialization: Uses model_dump() for converting models to dictionaries
  • State Loading: Uses model_validate() for reconstructing models from data
  • JSON Conversion: Leverages Pydantic’s JSON serialization capabilities
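
The round trip described in this list looks roughly like the following. `DemoState` is a hypothetical container used only for this sketch. The four methods are standard Pydantic v2 calls.

```python
from pydantic import BaseModel


class DemoState(BaseModel):
    """Hypothetical state container used only for this sketch."""

    instructions: str
    demos: list[dict[str, str]] = []


state = DemoState(instructions="Answer briefly.", demos=[{"q": "2+2?", "a": "4"}])

as_dict = state.model_dump()                        # model -> plain dict
restored = DemoState.model_validate(as_dict)        # dict -> model (re-validated)
as_json = state.model_dump_json()                   # model -> JSON string
from_json = DemoState.model_validate_json(as_json)  # JSON string -> model
```

Because `model_validate()` re-runs validation, malformed saved state fails when it is loaded rather than later during use.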

11. Example and Prediction Classes

  • Prediction Class: Inherits from Example, a plain Python class that follows Pydantic-like patterns rather than subclassing BaseModel
  • Data Validation: Uses Pydantic patterns for validating example data
  • Serialization: Uses Pydantic methods for converting to/from JSON

Advanced Usage Patterns

12. Runtime Type Creation

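import pydantic
from dspy.signatures.signature import SignatureMeta  # import path assumed; adjust to your DSPy version

# Build a Pydantic model from a signature's output fields,
# for use as a structured-output response format
def _get_structured_outputs_response_format(signature: SignatureMeta) -> type[pydantic.BaseModel]:
    fields = {}
    for field_name, field in signature.output_fields.items():
        default = field.default if hasattr(field, "default") else ...
        fields[field_name] = (field.annotation, default)

    # Forbid extra keys so the generated schema is a closed object
    pydantic_model = pydantic.create_model(
        "StructuredOutput",
        **fields,
        __config__={"extra": "forbid"},
    )
    return pydantic_model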

13. Schema Manipulation

  • Schema Processing: Uses model_json_schema() and manipulates the resulting schema
  • Reference Resolution: Handles $ref resolution in JSON schemas
  • Schema Validation: Ensures schemas comply with OpenAI’s structured output requirements
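
Reference resolution can be sketched as a recursive walk over the schema dictionary. The helper `inline_refs` and the `Address`/`Person` models are hypothetical. This simplified version assumes non-recursive models and ignores any keys that sit beside a `$ref`, so it is an illustration rather than DSPy's actual resolver.

```python
from typing import Any

from pydantic import BaseModel


class Address(BaseModel):
    city: str


class Person(BaseModel):
    name: str
    address: Address


def inline_refs(node: Any, defs: dict[str, Any]) -> Any:
    """Recursively replace {"$ref": "#/$defs/X"} nodes with the definition of X.

    Assumes no self-referencing models; a recursive schema would loop forever.
    """
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            return inline_refs(defs[ref.split("/")[-1]], defs)
        # Drop the top-level "$defs" table once its entries are inlined.
        return {k: inline_refs(v, defs) for k, v in node.items() if k != "$defs"}
    if isinstance(node, list):
        return [inline_refs(item, defs) for item in node]
    return node


raw = Person.model_json_schema()
flat = inline_refs(raw, raw.get("$defs", {}))
```

After the walk, `flat` contains the nested `Address` schema inline under `properties.address`, which is convenient for APIs that do not follow `$ref` pointers.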

14. Type Adaptation

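from pydantic import create_model

# Wrap a single argument in a throwaway one-field model so Pydantic can validate and coerce it
# (this uses create_model, not TypeAdapter)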
pydantic_wrapper = create_model("Wrapper", value=(self.arg_types[k], ...))
parsed = pydantic_wrapper.model_validate({"value": v})
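
For comparison, the wrapper-model approach in the snippet above and Pydantic's `TypeAdapter` produce the same result for a bare type. The values and the `target_type` name below are made up for this sketch.

```python
from pydantic import TypeAdapter, create_model

target_type = list[int]
raw_value = ["1", "2", 3]

# Approach from the snippet above: wrap the value in a one-field model.
Wrapper = create_model("Wrapper", value=(target_type, ...))
via_wrapper = Wrapper.model_validate({"value": raw_value}).value

# Alternative: TypeAdapter validates the bare type directly, without a wrapper model.
via_adapter = TypeAdapter(target_type).validate_python(raw_value)
```

Both paths coerce the numeric strings to integers. The wrapper keeps everything inside the `BaseModel` API, while `TypeAdapter` avoids creating a class per argument.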

Integration Points

15. Adapter System

  • Field Formatting: Uses Pydantic models for formatting fields in adapters
  • Validation: Leverages Pydantic’s validation for adapter inputs/outputs
  • Serialization: Uses Pydantic serialization for fine-tuning data preparation

16. Error Handling

  • Custom Exceptions: Some error classes inherit from or work with Pydantic validation errors
  • Validation Context: Uses Pydantic’s validation context for error messages
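
A minimal sketch of turning a Pydantic `ValidationError` into readable messages is shown below. `ToolArgsSketch` and `describe_errors` are hypothetical names. `errors()` and its `loc`/`type` keys are part of Pydantic v2.

```python
from pydantic import BaseModel, ValidationError


class ToolArgsSketch(BaseModel):
    city: str
    days: int


def describe_errors(payload: dict) -> list[str]:
    """Return one 'field: error type' string per validation failure."""
    try:
        ToolArgsSketch.model_validate(payload)
    except ValidationError as exc:
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['type']}"
            for err in exc.errors()
        ]
    return []


problems = describe_errors({"city": "Paris", "days": "many"})
```

A framework can wrap such messages in its own exception type or feed them back to a language model as a correction hint.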

17. Configuration Classes

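from pydantic import BaseModel, Field

class JSONFilter(BaseModel):
    answer: str = Field(description="The filter_query in valid JSON format")

    @classmethod
    def model_validate_json(cls, json_data: str, *, strict: bool | None = None):
        # Custom JSON validation logic (elided in this excerpt) would run here,
        # before delegating to Pydantic's standard JSON validation.
        return super().model_validate_json(json_data, strict=strict)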

Utility Functions

18. Type Processing

  • Annotation Parsing: Uses Pydantic’s type system for parsing complex annotations
  • Type Validation: Leverages Pydantic’s type checking capabilities
  • Schema Generation: Uses Pydantic’s schema generation for various components

19. Data Transformation

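from typing import Any

def serialize_for_json(value: Any) -> Any:
    # Pydantic models expose model_dump(), which converts them to plain dicts
    if hasattr(value, "model_dump"):
        return value.model_dump()
    # Handling of other value types is elided in this excerpt
    return value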

Configuration Management

20. Settings and Context

  • Structured Configuration: Some configuration classes use Pydantic’s BaseModel
  • Parameter Validation: Configuration parameters are checked with Pydantic validators
  • Type Safety: Typed fields provide type-safe configuration management

In summary, Pydantic is deeply integrated into DSPy’s architecture, serving as the foundation for:

  • The signature system (core abstraction)
  • Custom type definitions
  • Data validation and serialization
  • Runtime type creation
  • Configuration management
  • Integration with external APIs (especially OpenAI’s structured outputs)

The usage ranges from basic model definition to advanced features like custom validators, dynamic model creation, and complex schema manipulation.