DSPEx Testing Guide
This document provides comprehensive testing commands and strategies for DSPEx development following the 6-Stage Test-Driven Development methodology.
Test Structure
DSPEx follows a comprehensive testing strategy with five types of tests across six core modules, organized by our 6-stage implementation plan:
Test Types
- Unit Tests: Individual module functionality in isolation
- Integration Tests: Cross-module interactions
- Property Tests: Property-based testing with PropCheck
- Concurrent Tests: Race conditions and concurrent behavior
- End-to-End Tests: Complete workflows and benchmarks
Core Modules (6-Stage Implementation)
- Stage 1 - Foundation: DSPEx.Signature + DSPEx.Example
- Stage 2 - Client Layer: DSPEx.Client (GenServer with resilience)
- Stage 3 - Adapters: DSPEx.Adapter (LLM message formatting)
- Stage 4 - Prediction Engine: DSPEx.Program/Predict (execution)
- Stage 5 - Evaluation: DSPEx.Evaluate + DSPEx.Teleprompter (optimization)
- Stage 6 - Advanced Features: ChainOfThought, MultiChain, Parallel, RAG
Quick Start Commands
Install Dependencies
# Install all dependencies including test tools
mix deps.get
# Compile with warnings as errors for strict development
mix compile --warnings-as-errors
Basic Test Commands
# Run all tests
mix test
# Run tests with coverage report
mix test --cover
# Run specific test types
mix test test/unit/
mix test test/integration/
mix test test/property/
mix test test/concurrent/
mix test test/end_to_end/
# Run specific module tests
mix test test/unit/signature_test.exs
mix test test/unit/example_test.exs
6-Stage Progressive Development Workflow
Stage 1: Foundation ✅ COMPLETE - Signature System & Examples
# Core signature parsing and macro behavior (IMPLEMENTED)
mix test test/unit/signature_test.exs
# Example data structure operations (IMPLEMENTED)
mix test test/unit/example_test.exs
# Property-based tests for parser edge cases (IMPLEMENTED)
mix test test/property/signature_parser_test.exs
# Signature-example integration (IMPLEMENTED)
mix test test/integration/signature_example_test.exs
# Run all Stage 1 tests
mix test test/unit/signature_test.exs test/unit/example_test.exs test/property/signature_parser_test.exs test/integration/signature_example_test.exs
Stage 2: Client & HTTP Layer (NEXT) - Resilient API Communication
# Test client functionality with mocks
mix test test/unit/client_test.exs
# Test concurrent client behavior
mix test test/concurrent/client_concurrent_test.exs
# Test cache key properties
mix test test/property/cache_key_test.exs
# Run all Stage 2 tests (after implementation)
mix test test/unit/client_test.exs test/concurrent/client_concurrent_test.exs test/property/cache_key_test.exs
Stage 3: Translation Layer (Adapters) - Message Formatting
# Test adapter formatting and parsing
mix test test/unit/adapter_test.exs
# Test adapter roundtrip properties
mix test test/property/adapter_roundtrip_test.exs
# Test client-adapter integration
mix test test/integration/client_adapter_test.exs
# Test signature-adapter integration
mix test test/integration/signature_adapter_test.exs
# Run all Stage 3 tests
mix test test/unit/adapter_test.exs test/property/adapter_roundtrip_test.exs test/integration/client_adapter_test.exs test/integration/signature_adapter_test.exs
Stage 4: Prediction Engine - Core Execution
# Test program behavior and prediction
mix test test/unit/program_test.exs
mix test test/unit/predict_test.exs
# Test complete prediction pipeline
mix test test/integration/predict_pipeline_test.exs
# Test evaluate-predict integration
mix test test/integration/evaluate_predict_test.exs
# Run all Stage 4 tests
mix test test/unit/program_test.exs test/unit/predict_test.exs test/integration/predict_pipeline_test.exs test/integration/evaluate_predict_test.exs
Stage 5: Evaluation & Optimization - Concurrent Engine & Teleprompters
# Test evaluation logic
mix test test/unit/evaluate_test.exs
# Test optimization algorithms
mix test test/unit/teleprompter_test.exs
# Test concurrent evaluation
mix test test/concurrent/evaluate_concurrent_test.exs
# Test concurrent optimization
mix test test/concurrent/teleprompter_concurrent_test.exs
# Test full optimization workflow
mix test test/integration/teleprompter_full_test.exs
# Run all Stage 5 tests
mix test test/unit/evaluate_test.exs test/unit/teleprompter_test.exs test/concurrent/evaluate_concurrent_test.exs test/concurrent/teleprompter_concurrent_test.exs test/integration/teleprompter_full_test.exs
Stage 6: Advanced Features - ChainOfThought, MultiChain, Parallel, RAG
# Test advanced reasoning modules
mix test test/chain_of_thought_test.exs
mix test test/multi_chain_comparison_test.exs
mix test test/parallel_test.exs
mix test test/retriever_test.exs
# Test complete workflows
mix test test/end_to_end/complete_workflow_test.exs
# Run performance benchmarks
mix test test/end_to_end/benchmark_test.exs --include benchmark
# Run all Stage 6 tests
mix test test/chain_of_thought_test.exs test/multi_chain_comparison_test.exs test/parallel_test.exs test/retriever_test.exs test/end_to_end/
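The per-stage "run all" commands above repeat long path lists. A minimal sketch of a wrapper that keeps those lists in one place follows. The function names `stage_files` and `run_stage` and the `MIX` override are assumptions made for illustration, not part of DSPEx; the paths are copied from the stage sections above.

```shell
# Print the test paths for one stage, as listed in the stage sections above.
stage_files() {
  case "$1" in
    1) echo "test/unit/signature_test.exs test/unit/example_test.exs test/property/signature_parser_test.exs test/integration/signature_example_test.exs" ;;
    2) echo "test/unit/client_test.exs test/concurrent/client_concurrent_test.exs test/property/cache_key_test.exs" ;;
    3) echo "test/unit/adapter_test.exs test/property/adapter_roundtrip_test.exs test/integration/client_adapter_test.exs test/integration/signature_adapter_test.exs" ;;
    4) echo "test/unit/program_test.exs test/unit/predict_test.exs test/integration/predict_pipeline_test.exs test/integration/evaluate_predict_test.exs" ;;
    5) echo "test/unit/evaluate_test.exs test/unit/teleprompter_test.exs test/concurrent/evaluate_concurrent_test.exs test/concurrent/teleprompter_concurrent_test.exs test/integration/teleprompter_full_test.exs" ;;
    6) echo "test/chain_of_thought_test.exs test/multi_chain_comparison_test.exs test/parallel_test.exs test/retriever_test.exs test/end_to_end/" ;;
    *) echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Run every test file of one stage.
# MIX is a hypothetical override so the wrapper can be dry-run with MIX=echo.
run_stage() {
  files=$(stage_files "$1") || return 1
  # Word splitting is intentional here: each path becomes its own argument.
  ${MIX:-mix} test $files
}
```

With this in a sourced script, `run_stage 3` would run the Stage 3 files, and setting `MIX=echo` first prints the command arguments instead of running them.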
Advanced Testing Commands
Progressive Stage Testing
# Test current implementation status
mix test test/unit/signature_test.exs test/unit/example_test.exs # Stage 1 ✅
# Test stage dependencies (run after each stage completion)
mix test --only stage1
mix test --only stage2
mix test --only stage3
mix test --only stage4
mix test --only stage5
mix test --only stage6
Property-Based Testing
# Run property tests, including any tagged :property that are excluded by default
mix test test/property/ --include property
# Run property tests with specific number of examples
PROPCHECK_NUMTESTS=1000 mix test test/property/
# Stage 1 property tests (IMPLEMENTED)
mix test test/property/signature_parser_test.exs
Concurrent and Load Testing
# Run concurrent tests
mix test test/concurrent/ --include concurrent
# Stress test with up to 10 test modules running concurrently
mix test test/concurrent/ --max-cases 10
Integration Testing by Stage
# Stage 1 Integration (IMPLEMENTED)
mix test test/integration/signature_example_test.exs
# Cross-stage integration (after Stage 2+)
mix test test/integration/client_adapter_test.exs
mix test test/integration/predict_pipeline_test.exs
mix test test/integration/teleprompter_full_test.exs
External Integration Testing
# Run tests requiring external APIs (skip in CI)
mix test --include external_api
# Run integration tests with real services
mix test --include integration
Performance and Benchmarking
# Run benchmark tests
mix test --include benchmark
# Run performance comparison tests
mix test --include comparison
# Run benchmarks one test at a time with per-test output (--trace does not report memory by itself)
mix test --include benchmark --trace
Code Quality Commands
Static Analysis
# Run Dialyzer for type checking
mix dialyzer
# Run Credo for code style
mix credo
# Run both quality checks
mix credo && mix dialyzer
Documentation
# Generate documentation
mix docs
# Build HTML docs only; ExDoc prints a warning for each reference it cannot resolve
mix docs --formatter html
Formatting
# Format code
mix format
# Check if code is formatted
mix format --check-formatted
Test-Driven Development Workflow
Red-Green-Refactor Cycle
# 1. RED: Write failing test
mix test test/unit/new_feature_test.exs
# 2. GREEN: Implement minimal code to pass
mix test test/unit/new_feature_test.exs
# 3. REFACTOR: Clean up code
mix credo lib/dspex/new_feature.ex
mix test test/unit/new_feature_test.exs
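The RED step is only meaningful if the new test really fails before the implementation exists. A small sketch of a guard for that step follows; `expect_red` is a hypothetical helper name, not an existing Mix or DSPEx command.

```shell
# Succeed only when the wrapped command fails, which is what the RED step expects.
expect_red() {
  if "$@" >/dev/null 2>&1; then
    echo "RED step violated: '$*' passed before the feature was implemented" >&2
    return 1
  fi
  echo "RED confirmed: '$*' fails as expected"
}

# Usage (assumes the new test file already exists):
#   expect_red mix test test/unit/new_feature_test.exs
```

The guard cannot tell a genuine assertion failure from a compile error, so it is still worth reading the failure output once before moving on to GREEN.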
Staged Module Development Order
- Start with unit tests for individual functions (Phase 1 ✅)
- Add property tests for edge cases and invariants (Phase 1 ✅)
- Write integration tests for module interactions (Phase 1 ✅)
- Add concurrent tests for GenServer modules (Phase 2+)
- Create end-to-end tests for complete workflows (Phase 6)
Current Implementation Status
✅ STABLE: Phase 1 - Minimal Working Pipeline (108 tests passing)
- DSPEx.Signature - Complete with macro-based parsing, field validation, behavior callbacks
- DSPEx.Example - Core data structure with Protocol implementations (Enumerable, Collectable, Inspect)
- DSPEx.Client - Basic HTTP client with validation and error handling (15 tests)
- DSPEx.Adapter - Message formatting and response parsing (17 tests)
- DSPEx.Predict - Orchestration layer connecting all components (22 tests)
- Property-based testing - Signature parser edge case validation
- Integration testing - Cross-module functionality
Phase 1 Test Commands:
# Run stable Phase 1 complete pipeline tests (excludes Phase 2+ features)
mix test test/unit/signature_test.exs test/unit/example_test.exs test/property/signature_parser_test.exs test/integration/signature_example_test.exs test/unit/client_test.exs test/unit/adapter_test.exs test/unit/predict_test.exs --exclude phase2_features
# Verify zero Dialyzer warnings
mix dialyzer --quiet
Phase 1 Success Criteria: ✅ ACHIEVED
- All 108 tests passing (54 foundation + 54 pipeline)
- Zero Dialyzer warnings maintained
- Complete end-to-end prediction pipeline working
- Proper error handling and validation throughout
- Ready for Phase 2 resilience features (circuit breakers, caching, etc.)
🚧 Phase 2+ Features (Deferred from Phase 1)
Tagged as :phase2_features to exclude from Phase 1 testing:
- Dot notation field access (example.question) - requires custom Access behavior
- copy/2, without/2, to_map/1 - utility functions for data manipulation
- has_field?/2, equal?/2 - validation and comparison functions
- get_field_type/2 - type introspection capabilities
- to_json/1, from_json/1 - JSON serialization support
🎯 Next: Phase 2 - Client & HTTP Layer
Prerequisites: Phase 1 foundation must remain stable (54 tests passing)
- DSPEx.Client GenServer implementation with supervision
- HTTP client with Req integration and resilience patterns
- Circuit breaker with Fuse (5 failures → 10s cooldown)
- Caching with Cachex (deterministic key generation)
- Retry logic and timeout handling
Continuous Integration Commands
Full Test Suite
# Complete CI test suite
mix deps.get
mix compile --warnings-as-errors
mix format --check-formatted
mix credo --strict
mix test --cover
mix dialyzer
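In a CI job these steps should stop at the first failure so that later steps do not hide the original error. A minimal sketch, assuming a hypothetical `run_ci` function and a `MIX` override used only for dry runs:

```shell
# Run the CI steps in order; the && chain stops at the first failing step
# and makes the function return that step's exit status.
run_ci() {
  mix_cmd=${MIX:-mix}   # MIX is an assumed override, e.g. MIX=echo for a dry run
  $mix_cmd deps.get &&
    $mix_cmd compile --warnings-as-errors &&
    $mix_cmd format --check-formatted &&
    $mix_cmd credo --strict &&
    $mix_cmd test --cover &&
    $mix_cmd dialyzer
}
```

An explicit `&&` chain is used instead of `set -e` because `set -e` is ignored inside a function that is called from an `if` or `||` context, which is easy to trigger by accident in CI scripts.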
Stage-Specific CI
# Stage 1 CI (current)
mix test test/unit/signature_test.exs test/unit/example_test.exs test/property/signature_parser_test.exs test/integration/signature_example_test.exs --cover
# Progressive CI (add stages as completed)
mix test --only stage1 --only stage2 # After Stage 2
mix test --only stage1 --only stage2 --only stage3 # After Stage 3
# etc.
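Because `mix test` takes one tag per `--only` flag and accepts the flag more than once, the progressive command can be generated from the number of the last completed stage. A sketch, with `stage_flags` as a hypothetical helper name:

```shell
# Print "--only stage1 ... --only stageN" for the given N.
# The body runs in a subshell (parentheses) so its variables do not leak.
stage_flags() (
  last="$1"
  i=1
  flags=""
  while [ "$i" -le "$last" ]; do
    flags="$flags --only stage$i"
    i=$((i + 1))
  done
  # Strip the leading space left by the first append.
  echo "${flags# }"
)

# Usage, after Stage 3 is complete:
#   mix test $(stage_flags 3)
```

This assumes the test modules carry `stage1` through `stage6` tags, as the `--only` commands in this guide already do.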
Coverage Analysis
# Generate detailed coverage report
mix test --cover --export-coverage default
mix test.coverage
Performance Regression Testing
# Run benchmarks and compare to baseline
mix test --include benchmark > current_benchmarks.txt
# Compare with previous results
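One simple way to do the comparison is a textual diff against a saved baseline file. The sketch below assumes a hypothetical `compare_benchmarks` helper and a baseline file name chosen for illustration; neither exists in the project as far as this guide states.

```shell
# Compare a fresh benchmark run against a saved baseline.
# Returns 0 when the files match, 1 when they differ,
# and 2 when the baseline is missing (or diff itself fails).
compare_benchmarks() {
  baseline="$1"
  current="$2"
  if [ ! -f "$baseline" ]; then
    echo "no baseline found at $baseline" >&2
    return 2
  fi
  diff -u "$baseline" "$current"
}

# Usage (baseline_benchmarks.txt is an assumed file name):
#   mix test --include benchmark > current_benchmarks.txt
#   compare_benchmarks baseline_benchmarks.txt current_benchmarks.txt
```

Raw timings vary from run to run, so a plain diff will usually report differences. It is useful for reviewing changes by eye; an automated regression gate would need to parse the numbers and apply a tolerance.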
Test Configuration
Environment Variables
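# Configure test concurrency (not a built-in Mix setting; it only takes effect if the project's test helper reads it)
export MIX_TEST_ASYNC=true
# Set property test iterations
export PROPCHECK_NUMTESTS=100
# Configure mock timeouts in milliseconds (also project-specific; must be read by the test setup)
export TEST_TIMEOUT=5000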
Test Tags and Filtering
# Run only fast tests (exclude slow end-to-end)
mix test --exclude end_to_end
# Run only property tests
mix test --only property
# Skip external API tests
mix test --exclude external_api
# Run specific stage groups
mix test --only stage1 # Foundation (current)
mix test --only stage2 # Client layer
mix test --only stage3 # Adapters
mix test --only stage4 # Prediction engine
mix test --only stage5 # Evaluation & optimization
mix test --only stage6 # Advanced features
Debugging Test Failures
Verbose Output
# Run tests with detailed output
mix test --trace
# Run specific failing test with debug
mix test test/unit/failing_test.exs:42 --trace
Mock Debugging
# Verify mock expectations
mix test --trace | grep "expected.*called"
# Debug mock setup
mix test --trace | grep "Mox"
Property Test Debugging
# Get property test counterexamples
PROPCHECK_VERBOSE=true mix test test/property/
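# Shrink failing property test cases (PropCheck shrinks counterexamples by default; PROPCHECK_SHRINK only matters if the project's test setup reads it)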
PROPCHECK_SHRINK=true mix test test/property/
Test Data Management
Example Datasets
# Generate test datasets
mix run scripts/generate_test_data.exs
# Validate test data quality
mix test test/test_data_validation.exs
Mock Management
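# Fail the run on compiler warnings in test code, such as unused variables or aliases left behind by stale mock setup
mix test --warnings-as-errors
# Run only the Credo checks whose names match "unused" (these flag discarded return values, not unused mocks)
mix credo --only unused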
Performance Monitoring
Memory Usage
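# Filter benchmark output for lines mentioning memory (assumes the benchmark tests print memory figures)
mix test --include benchmark --trace | grep memory
# Profile memory allocation (mix test has no built-in --profile option; this assumes a project-defined alias or custom task)
mix test --profile memory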
Test Execution Time
# Time individual test suites
time mix test test/unit/
time mix test test/integration/
# Profile slow tests
mix test --slowest 10
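To compare suite timings side by side, the `time` calls above can be wrapped so each suite prints one summary line. This is a sketch only; `time_suite` and the `MIX` override are hypothetical names, and `date +%s` is assumed to be available (it is on GNU, BSD, and BusyBox systems).

```shell
# Run one test directory and print its wall-clock time in whole seconds.
time_suite() {
  dir="$1"
  start=$(date +%s)
  ${MIX:-mix} test "$dir" >/dev/null 2>&1
  status=$?
  end=$(date +%s)
  echo "$dir: $((end - start))s (exit $status)"
  return "$status"
}

# Usage:
#   for d in test/unit/ test/integration/ test/property/; do time_suite "$d"; done
```

Whole-second resolution is coarse, which is enough to spot a suite that has become much slower; `mix test --slowest 10` remains the better tool for finding individual slow tests.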
This 6-stage testing strategy provides broad coverage while keeping feedback loops fast during development. Stage 1 (Foundation) is complete, and its tests run against the actual DSPEx.Signature and DSPEx.Example implementations.