Dialyzer Deep Architectural Analysis
Executive Summary
After reducing Dialyzer errors from 72 to 31 (57% reduction), we’ve encountered the core architectural misalignments in the V2 Pool implementation. The remaining 31 errors reveal fundamental design issues that require strategic architectural decisions.
Current Status: 31 Critical Errors Remaining
Error Categories Breakdown
- Unmatched Returns (8 errors) - Functions producing values that callers ignore
- Contract Supertype Errors (12 errors) - Type specifications too broad for actual implementations
- Pattern Match Coverage (4 errors) - Unreachable code patterns
- Extra Range/Missing Range (4 errors) - Type specifications mismatch with reality
- Guard Failures (2 errors) - Impossible guard conditions
- Function Call Failures (1 error) - Contract violations
Deep Architectural Issues Analysis
1. Error Structure Proliferation Problem
Core Issue: Multiple error structures (PoolErrorHandler.t()
, ErrorHandler.t()
) with overlapping responsibilities create type confusion.
Evidence:
error_recovery_orchestrator.ex:527
- Contract supertype forexecute_failover_recovery/2
- Different error types returned by different layers (DSPex.Adapters.ErrorHandler vs DSPex.PythonBridge.PoolErrorHandler)
Root Cause: The V2 Pool implementation introduced PoolErrorHandler
as a wrapper around base ErrorHandler
, but calling code expects different return types.
Architectural Decision Required:
- Option A: Unify error structures - Use single error type throughout
- Option B: Clear separation - Pool-specific vs adapter-specific errors with explicit conversion
- Option C: Polymorphic error handling - Common interface with different implementations
Recommendation: Option B - Clear separation with explicit conversion functions
2. Context Map Evolution Problem
Core Issue: Context maps have evolved organically, leading to optional field explosion and type safety loss.
Evidence:
- Multiple functions with generic
map()
context parameters - Circuit breaker calls failing due to context field mismatches
- Recovery orchestrator expecting specific fields that aren’t guaranteed
Root Cause: No formal context schema, leading to runtime discovery of required fields.
Architectural Decision Required:
- Option A: Formal context structs - Define specific context types per operation
- Option B: Context validation - Runtime validation with clear error messages
- Option C: Union types - Multiple context schemas with type guards
Recommendation: Option A - Formal context structs for type safety
3. Function Return Value Proliferation
Core Issue: Functions return increasingly complex nested tuples that callers struggle to handle.
Evidence:
worker_recovery.ex:160
-execute_recovery/3
returns 5+ different tuple structuressession_pool_v2.ex:289
- Missing return types inhandle_pool_error/2
- Multiple unmatched return values across error handling chains
Root Cause: Error handling complexity has outgrown simple tuple returns.
Architectural Decision Required:
- Option A: Result monad pattern - Consistent {:ok, result} | {:error, reason} with rich result types
- Option B: State machine returns - Explicit state transitions with metadata
- Option C: Tagged unions - Discriminated unions for different result categories
Recommendation: Option A - Result monad with structured result types
4. Alert Structure Rigidity Problem
Core Issue: add_to_queue
function expects all alert fields, but different alert types have different fields.
Evidence:
error_reporter.ex:408
- Circuit opened alerts missingerror_count
,error_rate
fieldserror_reporter.ex:493
- High error rate alerts missingcircuit
,open_count
fields- Contract violations on 3 different
add_to_queue
calls
Root Cause: Single function trying to handle multiple alert types with incompatible schemas.
Architectural Decision Required:
- Option A: Alert inheritance - Base alert type with specialized subtypes
- Option B: Protocol-based alerts - Common protocol implemented by different alert types
- Option C: Union alert types - Discriminated union with pattern matching
Recommendation: Option C - Union alert types with pattern matching
5. Recovery Strategy Type Safety Problem
Core Issue: Recovery strategies use atoms and generic maps, leading to runtime discovery of required fields.
Evidence:
error_recovery_orchestrator.ex:507
- Contract supertype forexecute_retry_recovery/2
- Generic
recovery_strategy()
type lacks field validation - Success typing reveals specific field requirements not captured in specs
Root Cause: Strategy pattern implementation using loose typing instead of formal interfaces.
Architectural Decision Required:
- Option A: Strategy structs - Define specific struct for each strategy type
- Option B: Behaviour protocols - Define recovery behaviour with callbacks
- Option C: Strategy registry - Runtime strategy lookup with validation
Recommendation: Option A - Strategy structs for compile-time validation
6. State Machine Integration Inconsistency
Core Issue: Worker state machine integration is inconsistent across the pool, leading to type mismatches.
Evidence:
- Pattern match coverage errors in
session_pool.ex:461
- Worker recovery returning different state machine structures
- Inconsistent worker state handling between enhanced and basic workers
Root Cause: Worker state machine was added incrementally without full integration.
Architectural Decision Required:
- Option A: Full state machine integration - All workers use state machine
- Option B: Optional state machine - Clear separation between enhanced/basic workers
- Option C: State machine abstraction - Common interface regardless of implementation
Recommendation: Option B - Clear separation with explicit worker types
Prioritized Fix Strategy
Phase 1: Critical Type Safety (High Priority)
Fix Alert Structure Rigidity
- Create union alert types for different alert categories
- Update
add_to_queue
to handle different alert schemas - Impact: Fixes 3 contract violation errors
Fix Function Return Proliferation
- Standardize recovery function returns using result monad pattern
- Update
execute_recovery/3
spec to match actual returns - Impact: Fixes 4 missing_range/extra_range errors
Fix Pattern Match Coverage
- Remove unreachable patterns in error categorization
- Update pattern guards to reflect actual type constraints
- Impact: Fixes 4 pattern_match_cov errors
Phase 2: Contract Precision (Medium Priority)
Fix Contract Supertypes
- Narrow function specs to match success typing
- Use specific return types instead of generic
term()
- Impact: Fixes 8 contract_supertype errors
Fix Guard Failures
- Remove impossible guard conditions
- Update context nil checks to handle actual map types
- Impact: Fixes 2 guard_fail errors
Phase 3: Return Value Handling (Low Priority)
- Fix Unmatched Returns
- Add proper pattern matching for all function calls
- Decide whether to handle or explicitly ignore return values
- Impact: Fixes 8 unmatched_return errors
Implementation Approach
Step 1: Create Union Alert Types
defmodule DSPex.PythonBridge.AlertTypes do
@type circuit_opened_alert :: %{
type: :circuit_opened,
message: binary(),
timestamp: integer(),
metadata: map(),
circuit: term()
}
@type high_error_rate_alert :: %{
type: :high_error_rate,
message: binary(),
timestamp: integer(),
metadata: map(),
error_count: non_neg_integer(),
error_rate: float(),
total_count: non_neg_integer()
}
@type alert :: circuit_opened_alert() | high_error_rate_alert() | multiple_circuits_alert()
end
Step 2: Standardize Recovery Returns
@type recovery_result ::
{:ok, :recovered, term()} |
{:ok, :failover, term()} |
{:retry, non_neg_integer()} |
{:remove, reason :: term(), metadata :: map()} |
{:error, reason :: term()}
Step 3: Create Formal Context Schemas
defmodule DSPex.PythonBridge.ContextSchemas do
@type pool_operation_context :: %{
worker_id: String.t(),
session_id: String.t() | nil,
operation: atom(),
attempt: non_neg_integer(),
adapter: module()
}
@type recovery_context :: %{
original_operation: function() | nil,
args: list() | map(),
user_facing: boolean()
}
end
Expected Outcomes
After Phase 1: ~20 errors remaining (35% additional reduction) After Phase 2: ~8 errors remaining (75% total reduction from original 72) After Phase 3: 0-3 errors remaining (96%+ total reduction)
Risk Assessment
Low Risk Fixes: Alert types, pattern coverage, guard failures
Medium Risk Fixes: Contract supertypes, return value standardization
High Risk Fixes: Major architectural changes (not recommended for current phase)
Success Metrics
- Dialyzer Error Count: Target <5 errors
- Type Safety: 100% compile-time type validation for error paths
- Code Maintainability: Clear error handling patterns
- Performance: No runtime overhead from type fixes
This analysis provides a roadmap for resolving the remaining architectural issues while maintaining system stability and improving long-term maintainability.