# DSPEx + Elixact Integration: Testing Strategy & Validation Framework
## Executive Summary

This document complements the comprehensive integration plan (`153_copilot adjusted.md`) with a detailed testing strategy for the elixact integration, ensuring robust validation of every migration phase and feature implementation.
## Testing Architecture Overview

### Testing Pyramid for Elixact Integration
```
       /\
      /  \      Property-Based Tests (Schema validation invariants)
     /____\
    /      \    Integration Tests (Full pipeline validation)
   /________\
  /          \  Unit Tests (Individual component validation)
 /____________\
```
**Testing Levels:**

- **Unit Tests**: each elixact-enhanced module in isolation
- **Integration Tests**: cross-module interactions with elixact schemas
- **Migration Tests**: backward compatibility and transition validation
- **Property Tests**: schema invariant validation using PropCheck
- **Performance Tests**: ensuring no regression from elixact overhead
- **End-to-End Tests**: complete workflows with structured validation
## Phase-Specific Testing Strategy

### Phase 1: Foundation Testing (Weeks 1-2)

#### Schema Wrapper Validation

**File:** `test/unit/dspex_schema_test.exs`
```elixir
defmodule DSPEx.SchemaTest do
  use ExUnit.Case, async: true
  use PropCheck

  alias DSPEx.Schema

  describe "DSPEx.Schema wrapper functionality" do
    test "input_field/3 generates correct field metadata" do
      # Test that input fields are properly tagged
    end

    test "output_field/3 generates correct field metadata" do
      # Test that output fields are properly tagged
    end

    test "validate_inputs/1 validates only input fields" do
      # Test partial validation of inputs only
    end

    test "validate_outputs/1 validates only output fields" do
      # Test partial validation of outputs only
    end

    test "json_schema/0 generates complete JSON schema" do
      # Test that JSON schema generation includes all metadata
    end
  end

  describe "backward compatibility" do
    test "existing signature behavior is preserved" do
      # Test that DSPEx.Schema works like existing signatures
    end

    test "field access patterns remain unchanged" do
      # Test that field access still works
    end
  end

  # Property-based tests for schema validation
  property "all valid inputs pass validation" do
    forall input_data <- valid_input_generator() do
      # Generated valid inputs should always pass validation
    end
  end

  property "invalid inputs are properly rejected" do
    forall invalid_data <- invalid_input_generator() do
      # Generated invalid inputs should fail with structured errors
    end
  end
end
```
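As a concrete illustration, the suite could exercise a small fixture signature such as the sketch below. The `use DSPEx.Schema` macro API (`input_field/3`, `output_field/3`) follows the integration plan; the `description` option and the return shapes in the trailing comments are assumptions.

```elixir
# Hypothetical fixture for the tests above; the macro API follows the
# integration plan, while the option names and return shapes are assumptions.
defmodule DSPEx.SchemaTest.QASignature do
  use DSPEx.Schema

  input_field :question, :string, description: "The question to answer"
  output_field :answer, :string, description: "The generated answer"
end

# Assumed happy-path and failure shapes:
#   {:ok, %{question: "What is OTP?"}} =
#     DSPEx.SchemaTest.QASignature.validate_inputs(%{question: "What is OTP?"})
#   {:error, _errors} =
#     DSPEx.SchemaTest.QASignature.validate_inputs(%{})
```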
#### Compatibility Layer Testing

**File:** `test/unit/signature_compat_test.exs`
```elixir
defmodule DSPEx.SignatureCompatTest do
  use ExUnit.Case, async: true

  describe "string signature compatibility" do
    test "simple signatures are converted correctly" do
      # Test "question -> answer" conversion
    end

    test "complex signatures with multiple fields" do
      # Test "context, question -> answer, confidence"
    end

    test "existing struct behavior is preserved" do
      # Test that structs still work as expected
    end

    test "field access patterns unchanged" do
      # Test that .field access still works
    end
  end

  describe "migration compatibility" do
    test "old signatures work with new predict/3" do
      # Test mixed usage scenarios
    end

    test "validation errors are consistent" do
      # Test error format compatibility
    end
  end
end
```
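For the simple-conversion case, a filled-in version of the first test might look like the following; `DSPEx.SignatureCompat.from_string/1` and the `input_fields/0`/`output_fields/0` introspection functions are assumed names, not confirmed API.

```elixir
# Sketch of the simple-conversion test; the from_string/1 constructor and the
# field-introspection functions are assumptions about the compatibility layer.
test "simple signatures are converted correctly" do
  signature = DSPEx.SignatureCompat.from_string("question -> answer")

  assert :question in signature.input_fields()
  assert :answer in signature.output_fields()
end
```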
### Phase 2: Core Integration Testing (Weeks 2-3)

#### Enhanced Predict Testing

**File:** `test/unit/predict_elixact_test.exs`
```elixir
defmodule DSPEx.PredictElixactTest do
  use ExUnit.Case, async: true

  describe "elixact-enhanced prediction" do
    test "structured input validation works" do
      # Test validation before the LLM call
    end

    test "structured output validation works" do
      # Test validation after the LLM response
    end

    test "validation errors are well-structured" do
      # Test error format and content
    end

    test "performance is not significantly impacted" do
      # Benchmark validation overhead
    end
  end

  describe "mixed signature support" do
    test "old signatures still work" do
      # Test backward compatibility
    end

    test "new elixact signatures work" do
      # Test new functionality
    end

    test "error handling is consistent" do
      # Test error propagation
    end
  end
end
```
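A filled-in version of the first test could assert that invalid input is rejected before any request is issued. The `predict/3` entry point is referenced in the compatibility tests above; the exact argument and error shapes used here are assumptions.

```elixir
# Sketch: invalid input should fail validation before the LLM is called.
# The DSPEx.Predict.predict/3 shape and the per-field error maps are assumptions.
test "structured input validation works" do
  assert {:error, errors} =
           DSPEx.Predict.predict(DSPEx.SchemaTest.QASignature, %{question: 123}, [])

  assert Enum.any?(errors, &(&1.field == :question))
end
```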
#### Custom Types Testing

**File:** `test/unit/dspex_types_test.exs`
```elixir
defmodule DSPEx.TypesTest do
  use ExUnit.Case, async: true
  use PropCheck

  alias DSPEx.Types.{ConfidenceScore, ReasoningChain, Image, Tool}

  describe "ConfidenceScore type" do
    test "validates range 0.0 to 1.0" do
      # Test boundary validation
    end

    test "rejects invalid values" do
      # Test validation failures
    end

    property "all valid confidence scores serialize correctly" do
      forall score <- confidence_score_generator() do
        # Test serialization/deserialization
      end
    end
  end

  describe "ReasoningChain type" do
    test "validates step structure" do
      # Test reasoning step validation
    end

    test "supports nested reasoning" do
      # Test complex reasoning chains
    end
  end

  describe "multi-modal types" do
    test "Image type validates format and encoding" do
      # Test image validation
    end

    test "Tool type validates schema and parameters" do
      # Test tool-calling structure
    end
  end
end
```
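The `confidence_score_generator/0` referenced above can be a bounded float generator; a minimal sketch, assuming `ConfidenceScore` accepts any float in the closed range [0.0, 1.0]:

```elixir
# Minimal generator for the ConfidenceScore property, using PropCheck's
# float/2 range generator; the inclusive [0.0, 1.0] domain is an assumption.
defp confidence_score_generator do
  float(0.0, 1.0)
end
```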
### Phase 3: Adapter Integration Testing (Weeks 3-4)

#### Structured Output Testing

**File:** `test/unit/adapter_structured_test.exs`
```elixir
defmodule DSPEx.AdapterStructuredTest do
  use ExUnit.Case, async: true

  describe "automatic JSON schema generation" do
    test "simple schemas generate correctly" do
      # Test basic field types
    end

    test "complex nested schemas work" do
      # Test nested objects and arrays
    end

    test "custom types are handled" do
      # Test ConfidenceScore, ReasoningChain, etc.
    end

    test "schema validation matches expectations" do
      # Test that generated schemas validate correctly
    end
  end

  describe "LLM provider compatibility" do
    test "OpenAI structured outputs work" do
      # Test the OpenAI schema format
    end

    test "Anthropic tool calling works" do
      # Test the Anthropic schema format
    end

    test "Gemini structured generation works" do
      # Test the Gemini schema format
    end
  end
end
```
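A filled-in version of the first test could pin down the expected shape of the generated schema; the string-keyed JSON Schema map returned by `json_schema/0` is an assumption.

```elixir
# Sketch of the simple-schema test, reusing the QA fixture from Phase 1;
# the string-keyed JSON Schema map shape is an assumption.
test "simple schemas generate correctly" do
  schema = DSPEx.SchemaTest.QASignature.json_schema()

  assert schema["type"] == "object"
  assert schema["properties"]["question"]["type"] == "string"
  assert "question" in schema["required"]
end
```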
### Phase 4: Advanced Features Testing (Weeks 4-5)

#### Teleprompter Integration Testing

**File:** `test/integration/teleprompter_elixact_test.exs`
```elixir
defmodule DSPEx.TeleprompterElixactTest do
  use ExUnit.Case, async: true

  describe "SIMBA with elixact signatures" do
    test "optimization works with structured validation" do
      # Test SIMBA optimization with validation
    end

    test "examples are properly validated" do
      # Test example validation during optimization
    end

    test "performance is maintained" do
      # Benchmark optimization performance
    end
  end

  describe "BootstrapFewShot with elixact" do
    test "example generation includes validation" do
      # Test validated example generation
    end

    test "bootstrap process handles validation errors" do
      # Test error handling during bootstrap
    end
  end
end
```
## Migration Validation Framework

### Migration Test Suite

**File:** `test/migration/migration_validation_test.exs`
```elixir
defmodule DSPEx.MigrationValidationTest do
  use ExUnit.Case

  @moduletag :migration

  describe "phase-by-phase migration validation" do
    test "Phase 1: Foundation components work" do
      # Validate Phase 1 deliverables
    end

    test "Phase 2: Core integration is complete" do
      # Validate Phase 2 deliverables
    end

    test "Phase 3: Adapters are enhanced" do
      # Validate Phase 3 deliverables
    end

    # ... continue for all phases
  end

  describe "backward compatibility validation" do
    test "all existing examples still work" do
      # Run existing example files
    end

    test "existing test suite passes" do
      # Validate no regressions
    end

    test "API surface is unchanged" do
      # Test public API compatibility
    end
  end
end
```
### Automated Migration Tools Testing

**File:** `test/unit/migration_tools_test.exs`
```elixir
defmodule DSPEx.MigrationToolsTest do
  use ExUnit.Case, async: true

  alias DSPEx.Migration.{AnalysisTool, ConversionTool}

  describe "signature analysis" do
    test "detects all string-based signatures" do
      # Test signature detection
    end

    test "identifies conversion candidates" do
      # Test conversion analysis
    end

    test "estimates migration complexity" do
      # Test complexity analysis
    end
  end

  describe "automatic conversion" do
    test "simple signatures convert correctly" do
      # Test basic conversion
    end

    test "complex signatures preserve semantics" do
      # Test advanced conversion
    end

    test "conversion is idempotent" do
      # Test repeated conversion safety
    end
  end
end
```
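For the conversion tests, one plausible contract is that `ConversionTool.convert/1` returns generated source code; a sketch under that assumption:

```elixir
# Sketch of the basic conversion test; ConversionTool.convert/1 returning
# {:ok, source_code} is an assumed contract, not confirmed API.
test "simple signatures convert correctly" do
  assert {:ok, code} = ConversionTool.convert("question -> answer")
  assert code =~ "input_field :question"
  assert code =~ "output_field :answer"
end
```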
## Property-Based Testing Strategy

### Schema Validation Properties

**File:** `test/property/schema_validation_properties.exs`
```elixir
defmodule DSPEx.SchemaValidationProperties do
  use ExUnit.Case
  use PropCheck

  # forall bodies must evaluate to a boolean, hence match?/2 rather than
  # a bare pattern match (which would return the matched value).
  property "valid data always passes validation" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      match?({:ok, _validated}, schema_module.validate(valid_data))
    end
  end

  property "invalid data always fails validation" do
    forall {schema_module, invalid_data} <- schema_with_invalid_data() do
      match?({:error, _errors}, schema_module.validate(invalid_data))
    end
  end

  property "validation is deterministic" do
    forall {schema_module, data} <- schema_with_data() do
      result1 = schema_module.validate(data)
      result2 = schema_module.validate(data)
      result1 == result2
    end
  end

  property "serialization roundtrip preserves data" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      {:ok, validated} = schema_module.validate(valid_data)
      dumped = schema_module.dump(validated)
      {:ok, re_validated} = schema_module.validate(dumped)
      validated == re_validated
    end
  end
end
```
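The paired generators can be built from a fixed pool of test schemas; a sketch assuming the `DSPEx.ElixactTestConfig.test_schemas/0` and `DSPEx.ElixactTestHelpers.generate_valid_data_for_schema/1` helpers defined later in this document:

```elixir
# Sketch of schema_with_valid_data/0: pick a schema module from the fixture
# pool, then pair it with freshly generated valid data. Note this draws one
# data sample per pick rather than shrinking over a true data generator.
defp schema_with_valid_data do
  let schema_module <- oneof(DSPEx.ElixactTestConfig.test_schemas()) do
    {schema_module, DSPEx.ElixactTestHelpers.generate_valid_data_for_schema(schema_module)}
  end
end
```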
### JSON Schema Generation Properties

**File:** `test/property/json_schema_properties.exs`
```elixir
defmodule DSPEx.JsonSchemaProperties do
  use ExUnit.Case
  use PropCheck

  property "generated schemas are valid JSON schema" do
    forall schema_module <- schema_module_generator() do
      json_schema = schema_module.json_schema()
      # Validate against the JSON Schema meta-schema
      is_valid_json_schema(json_schema)
    end
  end

  property "generated schemas validate expected data" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      json_schema = schema_module.json_schema()
      # Valid data must pass JSON schema validation
      json_schema_validates(json_schema, valid_data)
    end
  end
end
```
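The `is_valid_json_schema/1` helper could lean on an off-the-shelf validator such as the `ex_json_schema` package (an assumption; any JSON Schema implementation would do):

```elixir
# Possible is_valid_json_schema/1 backed by ex_json_schema, whose
# Schema.resolve/1 raises on schemas that are not valid JSON Schema.
defp is_valid_json_schema(schema) do
  match?(%ExJsonSchema.Schema.Root{}, ExJsonSchema.Schema.resolve(schema))
rescue
  _ -> false
end
```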
## Performance Benchmarking

### Validation Overhead Benchmarks

**File:** `test/benchmarks/validation_overhead_bench.exs`
```elixir
defmodule DSPEx.ValidationOverheadBench do
  # Benchee is invoked directly; it does not provide a `use` macro.
  def run_benchmarks do
    Benchee.run(
      %{
        "string_signature_validation" => fn {sig, data} ->
          # Benchmark old validation
        end,
        "elixact_signature_validation" => fn {sig, data} ->
          # Benchmark new validation
        end,
        "json_schema_generation" => fn sig ->
          # Benchmark schema generation
        end,
        "structured_output_parsing" => fn {sig, response} ->
          # Benchmark structured parsing
        end
      },
      inputs: benchmark_inputs(),
      memory_time: 2,
      reduction_time: 2
    )
  end

  defp benchmark_inputs do
    %{
      "simple_signature" => generate_simple_test_case(),
      "complex_signature" => generate_complex_test_case(),
      "nested_signature" => generate_nested_test_case()
    }
  end
end
```
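Note that Benchee suites are plain modules rather than ExUnit tests, so this file can either be wrapped in a `@moduletag :benchmark` test case (matching the CI command below) or run directly, e.g. with `mix run -e "DSPEx.ValidationOverheadBench.run_benchmarks()"`.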
## Integration Test Scenarios

### End-to-End Workflow Testing

**File:** `test/integration/e2e_elixact_workflow_test.exs`
```elixir
defmodule DSPEx.E2EElixactWorkflowTest do
  use ExUnit.Case

  @moduletag :integration

  describe "complete workflows with elixact" do
    test "question answering with validation" do
      # Full QA workflow with input/output validation
    end

    test "multi-step reasoning with structured output" do
      # Complex reasoning workflow
    end

    test "teleprompter optimization with validation" do
      # Full optimization cycle with validation
    end

    test "error recovery with structured errors" do
      # Error handling throughout the pipeline
    end
  end

  describe "mixed signature workflows" do
    test "old and new signatures work together" do
      # Test mixed usage in a single workflow
    end

    test "gradual migration scenarios" do
      # Test partial migration scenarios
    end
  end
end
```
## Test Configuration and Setup

### Test Configuration

**File:** `test/support/elixact_test_config.exs`
```elixir
defmodule DSPEx.ElixactTestConfig do
  @moduledoc """
  Configuration for elixact integration tests.
  """

  def test_schemas do
    [
      simple_qa_schema(),
      complex_reasoning_schema(),
      multi_modal_schema(),
      custom_types_schema()
    ]
  end

  def mock_llm_responses do
    %{
      valid_responses: valid_response_fixtures(),
      invalid_responses: invalid_response_fixtures(),
      edge_case_responses: edge_case_fixtures()
    }
  end

  defp simple_qa_schema do
    # Define test schema
  end

  # ... other test fixtures
end
```
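One possible shape for `simple_qa_schema/0` is to define the fixture module once in the support file and return its name; this reuses the macro API assumed in Phase 1:

```elixir
# Hypothetical fixture module defined alongside DSPEx.ElixactTestConfig;
# simple_qa_schema/0 then just returns the module name.
defmodule DSPEx.ElixactTestConfig.SimpleQA do
  use DSPEx.Schema

  input_field :question, :string
  output_field :answer, :string
end

# In DSPEx.ElixactTestConfig:
#   defp simple_qa_schema, do: DSPEx.ElixactTestConfig.SimpleQA
```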
### Test Helpers

**File:** `test/support/elixact_test_helpers.exs`
```elixir
defmodule DSPEx.ElixactTestHelpers do
  @moduledoc """
  Helpers for elixact integration testing.
  """

  def create_test_signature(fields, opts \\ []) do
    # Dynamic signature creation for testing
  end

  def generate_valid_data_for_schema(schema_module) do
    # Generate valid test data
  end

  def generate_invalid_data_for_schema(schema_module) do
    # Generate invalid test data
  end

  def assert_structured_error(error, expected_fields) do
    # Assert error structure
  end

  def assert_validation_passes(schema, data) do
    # Assert validation success
  end

  def assert_validation_fails(schema, data, expected_errors) do
    # Assert validation failure with specific errors
  end
end
```
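A possible body for `assert_validation_fails/3` (requires `import ExUnit.Assertions` in the helper module); the `{:error, errors}` tuple with a `:field` key per error is an assumption about elixact's error format:

```elixir
# Sketch of assert_validation_fails/3: validation must fail, and every
# expected field must appear somewhere in the structured errors.
def assert_validation_fails(schema, data, expected_errors) do
  assert {:error, errors} = schema.validate(data)

  for field <- expected_errors do
    assert Enum.any?(errors, &(&1.field == field))
  end
end
```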
## Test Execution Strategy

### Continuous Integration Pipeline

```bash
# Phase 1: unit tests
mix test test/unit/dspex_schema_test.exs
mix test test/unit/signature_compat_test.exs

# Phase 2: core integration tests
mix test test/unit/predict_elixact_test.exs
mix test test/unit/dspex_types_test.exs

# Phase 3: adapter tests
mix test test/unit/adapter_structured_test.exs

# Phase 4: advanced features
mix test test/integration/teleprompter_elixact_test.exs

# Migration validation
mix test test/migration/ --include migration

# Property tests
mix test test/property/ --include property

# Performance benchmarks
mix test test/benchmarks/ --include benchmark

# Full integration
mix test --include integration
```
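For the `--include` flags above to have any effect, the slower suites must be excluded by default; a minimal `test_helper.exs` sketch (the tag names mirror the module tags used throughout this plan):

```elixir
# test/test_helper.exs — exclude slow or phase-specific suites unless
# explicitly included via `mix test --include <tag>`.
ExUnit.start(exclude: [:migration, :property, :benchmark, :integration])
```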
### Test Coverage Targets

| Component | Unit Test Coverage | Integration Coverage |
|---|---|---|
| DSPEx.Schema | 95% | 90% |
| Signature Compatibility | 90% | 85% |
| Enhanced Predict | 90% | 85% |
| Custom Types | 95% | 80% |
| Adapter Integration | 85% | 90% |
| Migration Tools | 90% | 80% |
## Risk Mitigation Through Testing

### High-Risk Areas

- **Performance regression**: comprehensive benchmarking at each phase
- **Backward compatibility**: exhaustive compatibility test suite
- **Data loss**: property-based serialization roundtrip tests
- **Integration failures**: end-to-end workflow validation
### Test-Driven Migration

- **Write tests first**: create tests for the expected behavior before implementation
- **Red-green-refactor**: follow the TDD cycle for each migration step
- **Property validation**: use property-based tests to catch edge cases
- **Performance monitoring**: continuous performance regression testing
## Success Metrics

### Quantitative Metrics

- ✅ All existing tests continue to pass (0 regressions)
- ✅ New test coverage >90% for elixact components
- ✅ Performance overhead <10% for validation
- ✅ Migration tool accuracy >95% for automatic conversion

### Qualitative Metrics

- ✅ Developer experience is improved (IDE support, error messages)
- ✅ Schema definitions are more maintainable than string parsing
- ✅ Error handling is more structured and informative
- ✅ JSON schema generation works seamlessly with LLM APIs
## Conclusion
This comprehensive testing strategy ensures that the elixact integration maintains the robustness and reliability that DSPEx users expect, while providing confidence in the migration process and new capabilities. The multi-layered testing approach catches issues early and provides clear validation of each integration phase.
The combination of unit tests, integration tests, property-based tests, and migration validation creates a safety net that allows for confident refactoring while maintaining backward compatibility and improving the overall developer experience.