← Back to 20250617

154 testing integration validation

Documentation for 154_testing_integration_validation from the Ds ex repository.

DSPEx + Elixact Integration: Testing Strategy & Validation Framework

Executive Summary

This document complements the comprehensive integration plan (153_copilot adjusted.md) by providing a detailed testing strategy specifically for the elixact integration, ensuring robust validation of all migration phases and feature implementations.

Testing Architecture Overview

Testing Pyramid for Elixact Integration

    /\
   /  \  Property-Based Tests (Schema validation invariants)
  /____\
 /      \  Integration Tests (Full pipeline validation)
/________\
Unit Tests (Individual component validation)

Testing Levels:

  1. Unit Tests: Each elixact-enhanced module in isolation
  2. Integration Tests: Cross-module interactions with elixact schemas
  3. Migration Tests: Backward compatibility and transition validation
  4. Property Tests: Schema invariant validation using PropCheck
  5. Performance Tests: Ensuring no regression with elixact overhead
  6. End-to-End Tests: Complete workflows with structured validation

Phase-Specific Testing Strategy

Phase 1: Foundation Testing (Week 1-2)

Schema Wrapper Validation

File: test/unit/dspex_schema_test.exs

defmodule DSPEx.SchemaTest do
  use ExUnit.Case, async: true
  use PropCheck

  alias DSPEx.Schema

  describe "DSPEx.Schema wrapper functionality" do
    test "input_field/3 generates correct field metadata" do
      # Test that input fields are properly tagged
    end

    test "output_field/3 generates correct field metadata" do  
      # Test that output fields are properly tagged
    end

    test "validate_inputs/1 validates only input fields" do
      # Test partial validation of inputs only
    end

    test "validate_outputs/1 validates only output fields" do
      # Test partial validation of outputs only
    end

    test "json_schema/0 generates complete JSON schema" do
      # Test JSON schema generation includes all metadata
    end
  end

  describe "backward compatibility" do
    test "existing signature behavior is preserved" do
      # Test that DSPEx.Schema works like existing signatures
    end

    test "field access patterns remain unchanged" do
      # Test that field access still works
    end
  end

  # Property-based tests for schema validation
  property "all valid inputs pass validation" do
    forall input_data <- valid_input_generator() do
      # Generated valid inputs should always pass validation
    end
  end

  property "invalid inputs are properly rejected" do
    forall invalid_data <- invalid_input_generator() do
      # Generated invalid inputs should fail with structured errors
    end
  end
end

Compatibility Layer Testing

File: test/unit/signature_compat_test.exs

defmodule DSPEx.SignatureCompatTest do
  use ExUnit.Case, async: true

  describe "string signature compatibility" do
    test "simple signatures are converted correctly" do
      # Test "question -> answer" conversion
    end

    test "complex signatures with multiple fields" do
      # Test "context, question -> answer, confidence"
    end

    test "existing struct behavior is preserved" do
      # Test that structs still work as expected
    end

    test "field access patterns unchanged" do
      # Test that .field access still works
    end
  end

  describe "migration compatibility" do
    test "old signatures work with new predict/3" do
      # Test mixed usage scenarios
    end

    test "validation errors are consistent" do
      # Test error format compatibility
    end
  end
end

Phase 2: Core Integration Testing (Week 2-3)

Enhanced Predict Testing

File: test/unit/predict_elixact_test.exs

defmodule DSPEx.PredictElixactTest do
  use ExUnit.Case, async: true

  describe "elixact-enhanced prediction" do
    test "structured input validation works" do
      # Test validation before LLM call
    end

    test "structured output validation works" do
      # Test validation after LLM response
    end

    test "validation errors are well-structured" do
      # Test error format and content
    end

    test "performance is not significantly impacted" do
      # Benchmark validation overhead
    end
  end

  describe "mixed signature support" do
    test "old signatures still work" do
      # Test backward compatibility
    end

    test "new elixact signatures work" do
      # Test new functionality
    end

    test "error handling is consistent" do
      # Test error propagation
    end
  end
end

Custom Types Testing

File: test/unit/dspex_types_test.exs

defmodule DSPEx.TypesTest do
  use ExUnit.Case, async: true
  use PropCheck

  alias DSPEx.Types.{ConfidenceScore, ReasoningChain, Image, Tool}

  describe "ConfidenceScore type" do
    test "validates range 0.0 to 1.0" do
      # Test boundary validation
    end

    test "rejects invalid values" do
      # Test validation failures
    end

    property "all valid confidence scores serialize correctly" do
      forall score <- confidence_score_generator() do
        # Test serialization/deserialization
      end
    end
  end

  describe "ReasoningChain type" do
    test "validates step structure" do
      # Test reasoning step validation
    end

    test "supports nested reasoning" do
      # Test complex reasoning chains
    end
  end

  describe "multi-modal types" do
    test "Image type validates format and encoding" do
      # Test image validation
    end

    test "Tool type validates schema and parameters" do
      # Test tool calling structure
    end
  end
end

Phase 3: Adapter Integration Testing (Week 3-4)

Structured Output Testing

File: test/unit/adapter_structured_test.exs

defmodule DSPEx.AdapterStructuredTest do
  use ExUnit.Case, async: true

  describe "automatic JSON schema generation" do
    test "simple schemas generate correctly" do
      # Test basic field types
    end

    test "complex nested schemas work" do
      # Test nested objects and arrays
    end

    test "custom types are handled" do
      # Test ConfidenceScore, ReasoningChain, etc.
    end

    test "schema validation matches expectations" do
      # Test that generated schemas validate correctly
    end
  end

  describe "LLM provider compatibility" do
    test "OpenAI structured outputs work" do
      # Test OpenAI schema format
    end

    test "Anthropic tool calling works" do
      # Test Anthropic schema format  
    end

    test "Gemini structured generation works" do
      # Test Gemini schema format
    end
  end
end

Phase 4: Advanced Features Testing (Week 4-5)

Teleprompter Integration Testing

File: test/integration/teleprompter_elixact_test.exs

defmodule DSPEx.TeleprompterElixactTest do
  use ExUnit.Case, async: true

  describe "SIMBA with elixact signatures" do
    test "optimization works with structured validation" do
      # Test SIMBA optimization with validation
    end

    test "examples are properly validated" do
      # Test example validation during optimization
    end

    test "performance is maintained" do
      # Benchmark optimization performance
    end
  end

  describe "BootstrapFewShot with elixact" do
    test "example generation includes validation" do
      # Test validated example generation
    end

    test "bootstrap process handles validation errors" do
      # Test error handling during bootstrap
    end
  end
end

Migration Validation Framework

Migration Test Suite

File: test/migration/migration_validation_test.exs

defmodule DSPEx.MigrationValidationTest do
  use ExUnit.Case

  @moduletag :migration

  describe "phase-by-phase migration validation" do
    test "Phase 1: Foundation components work" do
      # Validate Phase 1 deliverables
    end

    test "Phase 2: Core integration is complete" do
      # Validate Phase 2 deliverables  
    end

    test "Phase 3: Adapters are enhanced" do
      # Validate Phase 3 deliverables
    end

    # ... continue for all phases
  end

  describe "backward compatibility validation" do
    test "all existing examples still work" do
      # Run existing example files
    end

    test "existing test suite passes" do
      # Validate no regressions
    end

    test "API surface is unchanged" do
      # Test public API compatibility
    end
  end
end

Automated Migration Tools Testing

File: test/unit/migration_tools_test.exs

defmodule DSPEx.MigrationToolsTest do
  use ExUnit.Case, async: true

  alias DSPEx.Migration.{AnalysisTool, ConversionTool}

  describe "signature analysis" do
    test "detects all string-based signatures" do
      # Test signature detection
    end

    test "identifies conversion candidates" do
      # Test conversion analysis
    end

    test "estimates migration complexity" do
      # Test complexity analysis
    end
  end

  describe "automatic conversion" do
    test "simple signatures convert correctly" do
      # Test basic conversion
    end

    test "complex signatures preserve semantics" do
      # Test advanced conversion
    end

    test "conversion is idempotent" do
      # Test repeated conversion safety
    end
  end
end

Property-Based Testing Strategy

Schema Validation Properties

File: test/property/schema_validation_properties.exs

defmodule DSPEx.SchemaValidationProperties do
  use ExUnit.Case
  use PropCheck

  property "valid data always passes validation" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      {:ok, _validated} = schema_module.validate(valid_data)
    end
  end

  property "invalid data always fails validation" do
    forall {schema_module, invalid_data} <- schema_with_invalid_data() do
      {:error, _errors} = schema_module.validate(invalid_data)
    end
  end

  property "validation is deterministic" do
    forall {schema_module, data} <- schema_with_data() do
      result1 = schema_module.validate(data)
      result2 = schema_module.validate(data)
      result1 == result2
    end
  end

  property "serialization roundtrip preserves data" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      {:ok, validated} = schema_module.validate(valid_data)
      dumped = schema_module.dump(validated)
      {:ok, re_validated} = schema_module.validate(dumped)
      validated == re_validated
    end
  end
end

JSON Schema Generation Properties

File: test/property/json_schema_properties.exs

defmodule DSPEx.JsonSchemaProperties do
  use ExUnit.Case
  use PropCheck

  property "generated schemas are valid JSON schema" do
    forall schema_module <- schema_module_generator() do
      json_schema = schema_module.json_schema()
      # Validate against JSON Schema meta-schema
      is_valid_json_schema(json_schema)
    end
  end

  property "generated schemas validate expected data" do
    forall {schema_module, valid_data} <- schema_with_valid_data() do
      json_schema = schema_module.json_schema()
      # Test that valid data passes JSON schema validation
      json_schema_validates(json_schema, valid_data)
    end
  end
end

Performance Benchmarking

Validation Overhead Benchmarks

File: test/benchmarks/validation_overhead_bench.exs

defmodule DSPEx.ValidationOverheadBench do
  use Benchee

  def run_benchmarks do
    Benchee.run(
      %{
        "string_signature_validation" => fn {sig, data} ->
          # Benchmark old validation
        end,
        "elixact_signature_validation" => fn {sig, data} ->
          # Benchmark new validation
        end,
        "json_schema_generation" => fn sig ->
          # Benchmark schema generation
        end,
        "structured_output_parsing" => fn {sig, response} ->
          # Benchmark structured parsing
        end
      },
      inputs: benchmark_inputs(),
      memory_time: 2,
      reduction_time: 2
    )
  end

  defp benchmark_inputs do
    %{
      "simple_signature" => generate_simple_test_case(),
      "complex_signature" => generate_complex_test_case(),
      "nested_signature" => generate_nested_test_case()
    }
  end
end

Integration Test Scenarios

End-to-End Workflow Testing

File: test/integration/e2e_elixact_workflow_test.exs

defmodule DSPEx.E2EElixactWorkflowTest do
  use ExUnit.Case

  @moduletag :integration

  describe "complete workflows with elixact" do
    test "question answering with validation" do
      # Full QA workflow with input/output validation
    end

    test "multi-step reasoning with structured output" do
      # Complex reasoning workflow
    end

    test "teleprompter optimization with validation" do
      # Full optimization cycle with validation
    end

    test "error recovery with structured errors" do
      # Error handling throughout pipeline
    end
  end

  describe "mixed signature workflows" do
    test "old and new signatures work together" do
      # Test mixed usage in single workflow
    end

    test "gradual migration scenarios" do
      # Test partial migration scenarios
    end
  end
end

Test Configuration and Setup

Test Configuration

File: test/support/elixact_test_config.exs

defmodule DSPEx.ElixactTestConfig do
  @doc """
  Configuration for elixact integration tests
  """
  
  def test_schemas do
    [
      simple_qa_schema(),
      complex_reasoning_schema(), 
      multi_modal_schema(),
      custom_types_schema()
    ]
  end

  def mock_llm_responses do
    %{
      valid_responses: valid_response_fixtures(),
      invalid_responses: invalid_response_fixtures(),
      edge_case_responses: edge_case_fixtures()
    }
  end

  defp simple_qa_schema do
    # Define test schema
  end

  # ... other test fixtures
end

Test Helpers

File: test/support/elixact_test_helpers.exs

defmodule DSPEx.ElixactTestHelpers do
  @doc """
  Helpers for elixact integration testing
  """

  def create_test_signature(fields, opts \\ []) do
    # Dynamic signature creation for testing
  end

  def generate_valid_data_for_schema(schema_module) do
    # Generate valid test data
  end

  def generate_invalid_data_for_schema(schema_module) do
    # Generate invalid test data
  end

  def assert_structured_error(error, expected_fields) do
    # Assert error structure
  end

  def assert_validation_passes(schema, data) do
    # Assert validation success
  end

  def assert_validation_fails(schema, data, expected_errors) do
    # Assert validation failure with specific errors
  end
end

Test Execution Strategy

Continuous Integration Pipeline

# Phase 1: Unit Tests
mix test test/unit/dspex_schema_test.exs
mix test test/unit/signature_compat_test.exs

# Phase 2: Integration Tests  
mix test test/unit/predict_elixact_test.exs
mix test test/unit/dspex_types_test.exs

# Phase 3: Adapter Tests
mix test test/unit/adapter_structured_test.exs

# Phase 4: Advanced Features
mix test test/integration/teleprompter_elixact_test.exs

# Migration Validation
mix test test/migration/ --include migration

# Property Tests
mix test test/property/ --include property

# Performance Benchmarks
mix test test/benchmarks/ --include benchmark

# Full Integration
mix test --include integration

Test Coverage Targets

ComponentUnit Test CoverageIntegration Coverage
DSPEx.Schema95%90%
Signature Compatibility90%85%
Enhanced Predict90%85%
Custom Types95%80%
Adapter Integration85%90%
Migration Tools90%80%

Risk Mitigation Through Testing

High-Risk Areas

  1. Performance Regression: Comprehensive benchmarking at each phase
  2. Backward Compatibility: Exhaustive compatibility test suite
  3. Data Loss: Property-based serialization roundtrip tests
  4. Integration Failures: End-to-end workflow validation

Test-Driven Migration

  1. Write Tests First: Create tests for expected behavior before implementation
  2. Red-Green-Refactor: Follow TDD cycle for each migration step
  3. Property Validation: Use property-based tests to catch edge cases
  4. Performance Monitoring: Continuous performance regression testing

Success Metrics

Quantitative Metrics

  • ✅ All existing tests continue to pass (0 regressions)
  • ✅ New test coverage >90% for elixact components
  • ✅ Performance overhead <10% for validation
  • ✅ Migration tool accuracy >95% for automatic conversion

Qualitative Metrics

  • ✅ Developer experience is improved (IDE support, error messages)
  • ✅ Schema definitions are more maintainable than string parsing
  • ✅ Error handling is more structured and informative
  • ✅ JSON schema generation works seamlessly with LLM APIs

Conclusion

This comprehensive testing strategy ensures that the elixact integration maintains the robustness and reliability that DSPEx users expect, while providing confidence in the migration process and new capabilities. The multi-layered testing approach catches issues early and provides clear validation of each integration phase.

The combination of unit tests, integration tests, property-based tests, and migration validation creates a safety net that allows for confident refactoring while maintaining backward compatibility and improving the overall developer experience.