JIDO Architectural Recovery Plan
Executive Assessment: SALVAGEABLE
Verdict: The architecture is NOT fundamentally broken - it’s fundamentally misaligned. The dialyzer violations indicate disconnected layers, not unsalvageable code.
Scope Reality Check:
- 13 implementation files (~2000 lines)
- 7 files with Foundation dependencies
- Core Jido framework appears solid
- Issues concentrated in integration layers
Recommendation: SYSTEMATIC RECONSTRUCTION over abandonment.
Root Cause Analysis
This isn’t “bad code” - it’s evolutionary architecture without migration:
- Phase 1: Pure Jido agents (working)
- Phase 2: Foundation integration layer added (partially working)
- Phase 3: Advanced coordination grafted on (broken)
- Phase 4: Tests patched to pass (lying)
The 200+ dialyzer violations represent accumulated technical debt from architectural evolution without refactoring.
Fixability Assessment
✅ FIXABLE ISSUES (80% of violations)
- Phantom Dependencies - Replace or implement missing modules
- Type Contract Lies - Align specs with reality
- Bridge Result Mismatch - Fix return values
- Macro Hygiene - Fix __MODULE__ usage
- Queue Type Confusion - Consistent opaque types
⚠️ REQUIRES REDESIGN (20% of violations)
- Coordination Result Propagation - Simplify expectations
- Multi-agent Workflow - Accept current limitations
- Circuit Breaker Integration - Build or remove
Systematic Fix Strategy
Phase 1: Foundation Reality Check (2-3 hours)
Goal: Establish what Foundation modules actually exist and work
# Audit existing Foundation modules
find lib/foundation -name "*.ex" -exec grep -l "defmodule Foundation" {} \;
# Create missing critical modules or remove references
Actions:
- Foundation.Telemetry: Either implement or replace with :telemetry
- Foundation.CircuitBreaker: Either implement or remove protection
- Foundation.Cache: Either implement or remove caching
- Registry functions: Fix missing functions or use alternatives
Success Criteria: No more “Function does not exist” errors
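As one example of the replacement path, a thin shim over the :telemetry library keeps existing call sites compiling while the real module is implemented or retired. This is a sketch only; emit/3 is an assumed call-site signature, not a confirmed Foundation API:

defmodule Foundation.Telemetry do
  @moduledoc "Minimal shim that delegates to the :telemetry library."

  # Hypothetical signature; align it with whatever the JidoSystem call sites use.
  @spec emit([atom()], map(), map()) :: :ok
  def emit(event, measurements \\ %{}, metadata \\ %{}) when is_list(event) do
    :telemetry.execute(event, measurements, metadata)
  end
end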
Phase 2: Bridge Contract Honesty (3-4 hours)
Goal: Make Bridge.delegate_task return what coordinators expect
Current (broken):
def delegate_task(delegator, delegate, task) do
  send(delegate, {:mabeam_task, task.id, task})
  :ok # <-- Missing result propagation!
end
Fixed (honest):
def delegate_task(delegator, delegate, task, opts \\ []) do
  timeout = Keyword.get(opts, :timeout, 5_000)

  case GenServer.call(delegate, {:execute_task, task}, timeout) do
    {:ok, result} -> {:ok, result}
    _other -> {:error, :delegation_failed}
  end
rescue
  _exception -> {:error, :delegation_failed}
catch
  # GenServer.call exits (it does not raise) on timeout or a dead delegate
  :exit, _reason -> {:error, :delegation_failed}
end
Success Criteria: CoordinatorAgent patterns can match Bridge results
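For illustration, a coordinator call site can then match on the delegated result (handle_result/1 and retry_or_escalate/1 are hypothetical helpers):

case Bridge.delegate_task(self(), delegate_pid, task, timeout: 10_000) do
  {:ok, result} -> handle_result(result)
  {:error, :delegation_failed} -> retry_or_escalate(task)
end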
Phase 3: Type System Truth (2-3 hours)
Goal: Align all @spec with actual function behavior
Strategy:
- Run dialyzer on each file individually
- Update specs to match “Success typing” dialyzer reports
- Remove “extra types” that functions can never return
- Add missing error cases functions actually return
Example Fix:
# Before (lying):
@spec process_task(pid(), map(), keyword()) :: {:ok, result} | {:error, reason}
# After (honest):
@spec process_task(pid(), map(), keyword()) :: {:error, :server_not_found}
Success Criteria: Zero “Type specification” warnings
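Conversely, when the success typing shows error tuples the spec never mentioned, widen the spec to name them. A hypothetical illustration (enqueue_task/2 is not an actual JidoSystem function):

# Before: only the happy path is declared
@spec enqueue_task(pid(), map()) :: :ok
# After: the error reasons the function actually returns are listed
@spec enqueue_task(pid(), map()) :: :ok | {:error, :queue_full | :invalid_task}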
Phase 4: Macro Hygiene Fix (1 hour)
Goal: Fix get_default_capabilities/0 in FoundationAgent
Current (broken):
defp get_default_capabilities() do
  case __MODULE__ do # Does not resolve to the concrete agent module here
    JidoSystem.Agents.TaskAgent -> [:task_processing] # Never matches
    ...
  end
end
Fixed (working):
defmacro __using__(opts) do
  quote do
    # ... existing code ...
    defp get_default_capabilities() do
      unquote(opts[:capabilities] || [:general_purpose])
    end
  end
end
Success Criteria: Each agent gets correct capabilities
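With that change, each agent declares its capabilities at the use site. A sketch, assuming FoundationAgent lives under JidoSystem.Agents and accepts the capabilities option shown above:

defmodule JidoSystem.Agents.TaskAgent do
  use JidoSystem.Agents.FoundationAgent, capabilities: [:task_processing]
end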
Phase 5: Queue Type Consistency (1 hour)
Goal: Ensure all queue operations use proper opaque types
Actions:
- Initialize all queues with :queue.new()
- Never treat queues as plain tuples
- Add type guards where needed
Success Criteria: No “opaque type mismatch” errors
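A minimal sketch of the intended discipline, assuming agent state keeps its queue under a :task_queue key:

task = %{id: "t1"}
state = %{task_queue: :queue.new()}

# Go through the :queue API only; never pattern match on the raw tuple form
state = update_in(state.task_queue, &:queue.in(task, &1))

case :queue.out(state.task_queue) do
  {{:value, next}, rest} -> {next, put_in(state.task_queue, rest)}
  {:empty, _queue} -> {:empty, state}
end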
Phase 6: Dead Code Removal (1-2 hours)
Goal: Remove unreachable code paths and unused functions
Strategy:
- Remove unused aliases
- Delete functions dialyzer marks as “never called”
- Remove impossible pattern matches
- Simplify error handling to what’s actually possible
Success Criteria: Zero “unused” and “unreachable” warnings
Testing Strategy During Fix
Phase-by-Phase Verification
- After each phase: Run dialyzer on affected files
- Every 2 phases: Run full test suite
- Expect breakage: Tests will fail as we fix lies
- Fix tests: Update to verify actual behavior
Test Reconstruction Priorities
- Keep integration tests - They verify real behavior
- Fix unit tests - They often test wrong assumptions
- Add type tests - Verify dialyzer fixes work (see the sketch after this list)
- Remove impossible tests - Tests for unreachable code
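A hypothetical contract test of this kind, assuming the Bridge module lives at JidoSystem.Bridge and delegate_task behaves as sketched in Phase 2:

defmodule JidoSystem.BridgeContractTest do
  use ExUnit.Case, async: true

  test "delegate_task returns an error tuple for a dead delegate" do
    dead = spawn(fn -> :ok end)
    ref = Process.monitor(dead)
    assert_receive {:DOWN, ^ref, :process, _pid, _reason}

    assert {:error, :delegation_failed} =
             JidoSystem.Bridge.delegate_task(self(), dead, %{id: "t-1"})
  end
end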
Expected Timeline
Total Effort: 10-15 hours over 3-4 days
- Day 1: Phases 1-2 (Foundation + Bridge fixes)
- Day 2: Phases 3-4 (Types + Macros)
- Day 3: Phases 5-6 (Cleanup + Testing)
- Day 4: Integration verification
Risk Assessment
LOW RISK:
- Foundation module fixes (well-scoped)
- Type specification updates (non-breaking)
- Dead code removal (by definition safe)
MEDIUM RISK:
- Bridge contract changes (may break coordinators)
- Macro fixes (affects all agents)
HIGH RISK:
- None identified - scope is contained
Alternative: “Stable Broken” Approach
If systematic fixes seem too ambitious:
Minimal Viability Path (2-3 hours)
- Silence dialyzer with strategic @dialyzer annotations
- Document known issues in module docs
- Add defensive guards in critical paths
- Keep current test behavior (even if wrong)
This creates a “stable broken” system useful for study but not production.
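For reference, a minimal sketch of the suppression approach (module and function names are illustrative, not the actual JidoSystem layout):

defmodule JidoSystem.Bridge do
  # Known-bad contract documented here; suppress warnings for this function only
  @dialyzer {:nowarn_function, delegate_task: 3}

  def delegate_task(_delegator, delegate, task) do
    send(delegate, {:mabeam_task, task.id, task})
    :ok
  end
end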
Recommendation
GO FOR SYSTEMATIC RECONSTRUCTION
Rationale:
- Scope is manageable (13 files, focused issues)
- Root causes are architectural, not implementation
- Fixes will create genuinely robust system
- Alternative is perpetual technical debt
The dialyzer violations are a GIFT - they show exactly what needs fixing without guesswork.
Success Definition
Milestone 1: mix dialyzer runs clean (zero violations)
Milestone 2: Tests pass and verify correct behavior
Milestone 3: System is architecturally honest and reliable
This is absolutely achievable given the concentrated scope of issues.