← Back to 20250617

146 PYDANTIC USE IN DSPY more

Documentation for 146_PYDANTIC_USE_IN_DSPY_more from the Ds ex repository.

Additional Complementary Integration Points

21. Prompt Engineering and Template System

  • Template Validation: Uses Pydantic models to validate prompt template parameters
  • Variable Substitution: Leverages Pydantic’s field system for template variable validation
  • Prompt Constraints: Uses Pydantic validators to enforce prompt formatting rules
class PromptTemplate(BaseModel):
    template: str = Field(..., min_length=1)
    variables: dict[str, Any] = Field(default_factory=dict)
    
    @model_validator(mode="after")
    def validate_template_variables(self):
        # Ensure all template variables are provided

22. Language Model Interface

  • LM Configuration: Uses Pydantic models for language model parameter validation
  • Response Parsing: Leverages Pydantic for structured response validation
  • Token Usage Tracking: Uses Pydantic models for usage statistics
class LMUsage(BaseModel):
    prompt_tokens: int = Field(ge=0)
    completion_tokens: int = Field(ge=0)
    total_tokens: int = Field(ge=0)
    cost: Optional[float] = Field(ge=0)

23. Retrieval System Integration

  • Document Schema: Uses Pydantic for document metadata validation
  • Query Validation: Leverages Pydantic for search query parameter validation
  • Result Formatting: Uses Pydantic models for standardizing retrieval results
class RetrievalResult(BaseModel):
    text: str
    score: float = Field(ge=0, le=1)
    metadata: dict[str, Any] = Field(default_factory=dict)
    source: Optional[str] = None

24. Evaluation Framework

  • Metric Definition: Uses Pydantic for metric configuration and validation
  • Result Aggregation: Leverages Pydantic models for evaluation result storage
  • Benchmark Configuration: Uses Pydantic for dataset and evaluation setup
class EvaluationResult(BaseModel):
    score: float
    predictions: list[dict[str, Any]]
    metadata: dict[str, Any] = Field(default_factory=dict)
    timestamp: datetime = Field(default_factory=datetime.now)

25. Fine-tuning and Training

  • Training Configuration: Uses Pydantic for hyperparameter validation
  • Data Format Validation: Leverages Pydantic for training data structure validation
  • Job Status Tracking: Uses Pydantic models for training job state management
class TrainingConfig(BaseModel):
    learning_rate: float = Field(gt=0, le=1)
    batch_size: int = Field(ge=1)
    epochs: int = Field(ge=1)
    validation_split: float = Field(ge=0, le=1, default=0.2)

26. Streaming and Real-time Processing

  • Stream Message Validation: Uses Pydantic for validating streaming message formats
  • Response Chunking: Leverages Pydantic for structured chunk validation
  • Status Updates: Uses Pydantic models for status message formatting
class StreamChunk(BaseModel):
    content: str
    chunk_type: Literal["text", "tool_call", "status"]
    metadata: dict[str, Any] = Field(default_factory=dict)
    is_final: bool = False

27. Multi-modal Support

  • Media Validation: Uses Pydantic validators for image/audio format validation
  • Encoding Standards: Leverages Pydantic for enforcing encoding requirements
  • Size Constraints: Uses Pydantic field constraints for media file size limits
class MediaFile(BaseModel):
    content: bytes
    mime_type: str = Field(regex=r'^(image|audio|video)/.+')
    size_bytes: int = Field(ge=1, le=10_000_000)  # 10MB limit
    
    @model_validator(mode="after")
    def validate_content_type(self):
        # Validate content matches mime_type

28. Plugin and Extension System

  • Plugin Configuration: Uses Pydantic for plugin parameter validation
  • Extension Registry: Leverages Pydantic for extension metadata validation
  • Hook Definition: Uses Pydantic models for defining plugin hooks
class PluginConfig(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    version: str = Field(regex=r'^\d+\.\d+\.\d+$')
    dependencies: list[str] = Field(default_factory=list)
    config: dict[str, Any] = Field(default_factory=dict)

29. Caching and Persistence

  • Cache Entry Validation: Uses Pydantic for cache data structure validation
  • Serialization Keys: Leverages Pydantic for generating consistent cache keys
  • Data Integrity: Uses Pydantic validators for ensuring cache data consistency
class CacheEntry(BaseModel):
    key: str = Field(min_length=1)
    value: Any
    timestamp: datetime = Field(default_factory=datetime.now)
    ttl: Optional[int] = Field(ge=0)
    
    @model_validator(mode="after")
    def validate_expiry(self):
        # Validate TTL logic

30. Distributed Processing

  • Task Definition: Uses Pydantic for distributed task parameter validation
  • Worker Configuration: Leverages Pydantic for worker node configuration
  • Result Aggregation: Uses Pydantic models for collecting distributed results
class DistributedTask(BaseModel):
    task_id: str = Field(min_length=1)
    worker_id: str
    parameters: dict[str, Any]
    priority: int = Field(ge=0, le=10, default=5)
    timeout: int = Field(ge=1, default=300)

31. API Integration and Web Services

  • Request Validation: Uses Pydantic for API request body validation
  • Response Formatting: Leverages Pydantic for consistent API response structure
  • Authentication: Uses Pydantic models for credential validation
class APIRequest(BaseModel):
    signature: str
    inputs: dict[str, Any]
    config: Optional[dict[str, Any]] = None
    
    class Config:
        extra = "forbid"
        
class APIResponse(BaseModel):
    outputs: dict[str, Any]
    metadata: dict[str, Any] = Field(default_factory=dict)
    success: bool = True
    error: Optional[str] = None

32. Workflow and Pipeline Management

  • Pipeline Definition: Uses Pydantic for workflow step validation
  • Dependency Management: Leverages Pydantic for step dependency validation
  • Execution Context: Uses Pydantic models for pipeline execution state
class PipelineStep(BaseModel):
    name: str = Field(min_length=1)
    module: str
    parameters: dict[str, Any] = Field(default_factory=dict)
    dependencies: list[str] = Field(default_factory=list)
    
    @model_validator(mode="after")
    def validate_dependencies(self):
        # Ensure no circular dependencies

33. Monitoring and Observability

  • Metrics Collection: Uses Pydantic for metrics data structure validation
  • Log Entry Validation: Leverages Pydantic for structured logging
  • Performance Tracking: Uses Pydantic models for performance metric storage
class PerformanceMetric(BaseModel):
    metric_name: str
    value: float
    unit: str
    timestamp: datetime = Field(default_factory=datetime.now)
    tags: dict[str, str] = Field(default_factory=dict)

34. Security and Access Control

  • Permission Validation: Uses Pydantic for access control rule validation
  • Token Management: Leverages Pydantic for API token structure validation
  • Audit Logging: Uses Pydantic models for security event logging
class SecurityPolicy(BaseModel):
    resource: str
    actions: list[str]
    principals: list[str]
    conditions: dict[str, Any] = Field(default_factory=dict)
    
    @model_validator(mode="after")
    def validate_policy(self):
        # Validate security policy rules

35. Version Control and Migrations

  • Schema Versioning: Uses Pydantic for model version validation
  • Migration Scripts: Leverages Pydantic for migration parameter validation
  • Backward Compatibility: Uses Pydantic’s aliasing for field migration
class ModelVersion(BaseModel):
    version: str = Field(regex=r'^\d+\.\d+$')
    schema_hash: str = Field(min_length=32, max_length=64)
    migration_path: Optional[str] = None
    deprecated: bool = False

36. Testing and Quality Assurance

  • Test Case Definition: Uses Pydantic for test parameter validation
  • Mock Data Generation: Leverages Pydantic factories for test data creation
  • Assertion Validation: Uses Pydantic models for test result validation
class TestCase(BaseModel):
    test_name: str = Field(min_length=1)
    inputs: dict[str, Any]
    expected_outputs: dict[str, Any]
    tolerance: float = Field(ge=0, default=0.01)
    
    @model_validator(mode="after")
    def validate_test_consistency(self):
        # Ensure inputs/outputs are consistent

37. Resource Management

  • Resource Allocation: Uses Pydantic for resource requirement validation
  • Quota Management: Leverages Pydantic for usage limit enforcement
  • Cost Tracking: Uses Pydantic models for cost calculation and tracking
class ResourceQuota(BaseModel):
    cpu_cores: int = Field(ge=1, le=64)
    memory_gb: int = Field(ge=1, le=512)
    gpu_count: int = Field(ge=0, le=8)
    storage_gb: int = Field(ge=10, le=10000)
    monthly_budget: float = Field(ge=0)

38. Custom Validators and Constraints

  • Domain-Specific Validation: Custom Pydantic validators for DSPy-specific constraints
  • Cross-Field Validation: Uses @model_validator for complex validation logic
  • Dynamic Constraints: Runtime constraint generation using Pydantic
@model_validator(mode="after")
def validate_signature_consistency(self):
    # Ensure input/output field consistency
    if len(self.input_fields) == 0:
        raise ValueError("Signature must have at least one input field")
    if len(self.output_fields) == 0:
        raise ValueError("Signature must have at least one output field")
    return self

These additional integration points demonstrate how Pydantic serves as the backbone for DSPy’s data validation, serialization, and type safety across virtually every component of the framework, from low-level data structures to high-level workflow management.