CLAUDE CODE Implementation Prompts
Purpose: Self-contained prompts for completing remaining OTP hardening phases
Context: Foundation Jido System OTP compliance implementation
Date: 2025-07-02
PHASE 3 PROMPT: MonitorManager System Implementation
Context & Current State
You are implementing PHASE 3 of the Foundation Jido System OTP hardening project. PHASES 1 and 2 are COMPLETE:
- ✅ PHASE 1: SupervisedSend infrastructure with delivery monitoring and dead letter queue
- ✅ PHASE 2: All critical raw send() calls migrated to SupervisedSend
- 🔄 PHASE 3: YOUR TASK - Implement MonitorManager system with automatic cleanup and leak detection
Required Reading
MUST READ these files to understand context:
- CLAUDE_CODE_worklog.md - Implementation history and current status
- CLAUDECODE.md - Overall plan and phase definitions
- AUDIT_02_planSteps_03.md - Detailed Stage 3 requirements and MonitorManager specification
- test/OTP_ASYNC_APPROACHES_20250701.md - Testing patterns (avoid Process.sleep, use test sync messages)
Problem Statement
The Foundation codebase has monitor leaks - Process.monitor() calls without proper cleanup leading to:
- Memory leaks from uncleaned monitor references
- Process mailbox pollution from orphaned DOWN messages
- Difficulty debugging monitor-related issues
- No visibility into active monitors
Implementation Requirements
Core MonitorManager Module
Location: lib/foundation/monitor_manager.ex
Features Required:
- Centralized Monitor Tracking: Track all monitors with metadata (caller, target, creation time, stack trace)
- Automatic Cleanup: Clean up monitors when either monitored process or caller process dies
- Leak Detection: Find monitors older than threshold that might be leaked
- Telemetry Integration: Emit events for monitor creation, cleanup, and leak detection
- Stats Interface: Provide statistics for monitoring and debugging
API Requirements:
# Client API
{:ok, ref} = MonitorManager.monitor(pid, :my_feature)
:ok = MonitorManager.demonitor(ref)
monitors = MonitorManager.list_monitors()
stats = MonitorManager.get_stats()
leaks = MonitorManager.find_leaks(:timer.minutes(5))  # age_ms argument; defaults to :timer.minutes(5)
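To make the API above concrete, here is a minimal sketch of how such a manager could be structured. It is an illustration under stated assumptions, not the Foundation implementation: the module name MonitorManagerSketch, the ETS table name, and the metadata fields are hypothetical. It also assumes the manager process owns the monitors and forwards DOWN messages to the caller. A plain send/2 stands in for SupervisedSend, whose API is not shown in this document, and telemetry, stack traces, and stats are omitted.

```elixir
defmodule MonitorManagerSketch do
  @moduledoc "Illustrative sketch only; not the Foundation implementation."
  use GenServer

  # Hypothetical table name, chosen for this sketch.
  @table :monitor_manager_sketch

  ## Client API

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def monitor(pid, tag) when is_pid(pid),
    do: GenServer.call(__MODULE__, {:monitor, self(), pid, tag}, 5_000)

  def demonitor(ref) when is_reference(ref),
    do: GenServer.call(__MODULE__, {:demonitor, ref}, 5_000)

  # Reads go straight to ETS, so they never block on the server process.
  def list_monitors, do: :ets.tab2list(@table)

  def find_leaks(age_ms \\ :timer.minutes(5)) do
    cutoff = System.monotonic_time(:millisecond) - age_ms
    Enum.filter(list_monitors(), fn {_ref, meta} -> meta.created_at <= cutoff end)
  end

  ## Server callbacks

  @impl true
  def init(_opts) do
    :ets.new(@table, [:named_table, :protected, :set, read_concurrency: true])
    {:ok, %{}}
  end

  @impl true
  def handle_call({:monitor, caller, target, tag}, _from, state) do
    # The manager owns both monitors: one on the target, one on the caller.
    ref = Process.monitor(target)
    caller_ref = Process.monitor(caller)
    created_at = System.monotonic_time(:millisecond)
    meta = %{caller: caller, caller_ref: caller_ref, target: target, tag: tag, created_at: created_at}
    :ets.insert(@table, {ref, meta})
    {:reply, {:ok, ref}, state}
  end

  def handle_call({:demonitor, ref}, _from, state) do
    case :ets.take(@table, ref) do
      [{^ref, meta}] ->
        Process.demonitor(ref, [:flush])
        Process.demonitor(meta.caller_ref, [:flush])
        {:reply, :ok, state}

      [] ->
        {:reply, {:error, :not_found}, state}
    end
  end

  @impl true
  def handle_info({:DOWN, down_ref, :process, pid, reason}, state) do
    for {ref, meta} <- :ets.tab2list(@table) do
      cond do
        # Target died: drop the entry first, then forward DOWN to the caller.
        ref == down_ref ->
          :ets.delete(@table, ref)
          Process.demonitor(meta.caller_ref, [:flush])
          send(meta.caller, {:DOWN, ref, :process, pid, reason})

        # Caller died: nobody is left to notify, so just clean up.
        meta.caller_ref == down_ref ->
          :ets.delete(@table, ref)
          Process.demonitor(ref, [:flush])

        true ->
          :ok
      end
    end

    {:noreply, state}
  end

  def handle_info(_other, state), do: {:noreply, state}
end
```

Scanning the whole table on every DOWN keeps the sketch short. A real implementation would likely index entries by caller as well, so cleanup does not grow with the number of tracked monitors.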
Monitor Migration Helper
Location: lib/foundation/monitor_migration.ex
Features Required:
- Macro Helper: monitor_with_cleanup/3 for automatic cleanup
- GenServer Migration: Helper to migrate existing GenServer modules
- Code Analysis: Functions to identify and migrate Process.monitor calls
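One possible shape for the macro helper is sketched below. The argument order (pid, tag, do-block) is an assumption, since this document only names monitor_with_cleanup/3. To stay self-contained, the sketch calls Process.monitor/1 and Process.demonitor/2 directly rather than going through MonitorManager, so the tag is accepted but not recorded anywhere.

```elixir
defmodule MonitorMigrationSketch do
  @moduledoc "Illustrative sketch only; module and argument names are hypothetical."

  # Runs `block` while `pid` is monitored and always demonitors afterwards,
  # even if the block raises or throws.
  defmacro monitor_with_cleanup(pid, tag, do: block) do
    quote do
      pid = unquote(pid)
      ref = Process.monitor(pid)
      # A real helper would pass the tag to MonitorManager as metadata.
      _tag = unquote(tag)

      try do
        unquote(block)
      after
        # :flush also removes a DOWN message that already reached the mailbox.
        Process.demonitor(ref, [:flush])
      end
    end
  end
end

defmodule MonitorMigrationSketch.Example do
  require MonitorMigrationSketch

  # Sends a ping and waits for a reply, a crash notification, or a timeout.
  def ping(pid) do
    MonitorMigrationSketch.monitor_with_cleanup pid, :ping_example do
      send(pid, {:ping, self()})

      receive do
        :pong -> :ok
        {:DOWN, _ref, :process, ^pid, reason} -> {:error, reason}
      after
        1_000 -> {:error, :timeout}
      end
    end
  end
end
```

Because the cleanup sits in an `after` clause, the caller cannot forget to demonitor, which is the class of leak described in the problem statement.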
Integration Requirements
- Supervision: MonitorManager must be started under Foundation supervision tree
- Zero Production Overhead: No performance impact when not debugging
- Test Integration: Proper test patterns using sync messages (NOT Process.sleep)
- ETS Storage: Use ETS for fast lookups and concurrent access
Test Requirements
Location: test/foundation/monitor_manager_test.exs
Test Categories Required:
- Basic Operations: monitor/demonitor lifecycle
- Automatic Cleanup: Process death triggers cleanup
- Caller Cleanup: Monitor cleanup when caller dies
- Leak Detection: Find old monitors
- Statistics: Track creation/cleanup counts
- Concurrency: Multiple processes creating monitors
CRITICAL: Use proper OTP async patterns from test/OTP_ASYNC_APPROACHES_20250701.md:
- Test sync messages (Approach #1) for async operation completion
- NO Process.sleep() calls
- Deterministic completion detection
Reference Implementation Patterns
Based on existing Foundation patterns:
- Use SupervisedSend for any inter-process communication
- Follow Foundation.DeadLetterQueue patterns for ETS storage and GenServer design
- Use Application.get_env(:foundation, :test_pid) for test notifications
- Follow existing telemetry patterns from SupervisedSend
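The test-notification pattern referenced above can be sketched as follows. The helper module SyncNotifySketch and the message shape {:cleanup_done, tag} are hypothetical; only the use of Application.get_env(:foundation, :test_pid) comes from this document. The point is that the test waits on a message from the code under test instead of sleeping.

```elixir
ExUnit.start()

defmodule SyncNotifySketch do
  # Hypothetical helper: notifies the configured test process, if any.
  # When no test pid is configured this is a no-op.
  def notify(message) do
    case Application.get_env(:foundation, :test_pid) do
      pid when is_pid(pid) -> send(pid, message)
      _ -> :ok
    end
  end
end

defmodule SyncNotifySketchTest do
  # async: false because the test mutates the global application environment.
  use ExUnit.Case, async: false

  setup do
    Application.put_env(:foundation, :test_pid, self())
    on_exit(fn -> Application.delete_env(:foundation, :test_pid) end)
    :ok
  end

  test "asynchronous work signals completion instead of relying on sleep" do
    spawn(fn ->
      # Stand-in for asynchronous work such as monitor cleanup.
      SyncNotifySketch.notify({:cleanup_done, :my_feature})
    end)

    # Blocks only until the message arrives; the timeout is an upper bound.
    assert_receive {:cleanup_done, :my_feature}, 1_000
  end
end
```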
Expected Deliverables
- ✅ lib/foundation/monitor_manager.ex - Core monitor management
- ✅ lib/foundation/monitor_migration.ex - Migration helpers
- ✅ test/foundation/monitor_manager_test.exs - Comprehensive test suite
- ✅ All tests passing with zero warnings
- ✅ Integration with Foundation supervision tree
- ✅ Documentation and examples
Success Criteria
- MonitorManager tracks all monitors with full metadata
- Automatic cleanup prevents monitor leaks
- Comprehensive test coverage with proper OTP patterns
- Zero production performance overhead
- Clear debugging and monitoring capabilities
phase 3 in progress!
PHASE 4 PROMPT: Timeout Configuration & GenServer Call Migration
Context & Current State
You are implementing PHASE 4 of the Foundation Jido System OTP hardening project. PHASES 1, 2, and 3 are COMPLETE:
- ✅ PHASE 1: SupervisedSend infrastructure with delivery monitoring and dead letter queue
- ✅ PHASE 2: All critical raw send() calls migrated to SupervisedSend
- ✅ PHASE 3: MonitorManager system with automatic cleanup and leak detection
- 🔄 PHASE 4: YOUR TASK - Add timeout configuration and migrate all GenServer.call/2 to include timeouts
Required Reading
MUST READ these files to understand context:
- CLAUDE_CODE_worklog.md - Implementation history and phases 1-3 completion
- CLAUDECODE.md - Overall plan and current phase status
- AUDIT_02_planSteps_03.md - Section 3.4 GenServer Timeout Enforcement requirements
- test/OTP_ASYNC_APPROACHES_20250701.md - Testing patterns for infrastructure
Problem Statement
The Foundation codebase calls GenServer.call/2 without explicit timeouts, leading to:
- Callers blocked on slow or unresponsive GenServers until the implicit 5-second default fires, then exiting with a timeout
- No consistent timeout policies across the application
- Difficult debugging of timeout-related issues
- Default 5-second timeout may be inappropriate for different operations
Implementation Requirements
Timeout Configuration Module
Location: lib/foundation/timeout_config.ex
Features Required:
- Centralized Configuration: Service-specific and operation-specific timeouts
- Environment Overrides: Runtime configuration via Application environment
- Pattern Matching: Support for operation type patterns
- Macro Helper: Easy timeout application in GenServer calls
Configuration Structure:
@timeout_config %{
  # Service-specific timeouts
  "Foundation.ResourceManager" => @long_timeout,
  "Foundation.Services.ConnectionManager" => @long_timeout,
  # Operation-specific timeouts (written with => because keyword-style
  # entries may only appear at the end of a map literal)
  :batch_operation => @long_timeout,
  :health_check => 1_000,
  :sync_operation => @default_timeout,
  # Pattern-based timeouts
  {:data_processing, :*} => @long_timeout,
  {:network_call, :*} => @critical_timeout
}
API Requirements:
timeout = TimeoutConfig.get_timeout(MyServer)
timeout = TimeoutConfig.get_timeout(:batch_operation)
timeout = TimeoutConfig.get_timeout({:data_processing, :etl})
# Macro for easy use
TimeoutConfig.call_with_timeout(server, request, opts)  # opts is optional and defaults to []
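The lookup rules implied by the configuration and API above could be sketched like this. Several details are assumptions made for illustration: the module name TimeoutConfigSketch, the concrete values of the long and critical timeouts, the :timeout_overrides environment key, and matching a module against its string name via inspect/1. For simplicity call_with_timeout is written as a plain function rather than a macro.

```elixir
defmodule TimeoutConfigSketch do
  @moduledoc "Illustrative sketch only; values and names are assumptions."

  # Assumed values; the document does not specify them.
  @default_timeout 5_000
  @long_timeout 30_000
  @critical_timeout 60_000

  @timeouts %{
    "Foundation.ResourceManager" => @long_timeout,
    :batch_operation => @long_timeout,
    :health_check => 1_000,
    {:data_processing, :*} => @long_timeout,
    {:network_call, :*} => @critical_timeout
  }

  # Resolution order: runtime override, compiled-in table, default.
  def get_timeout(key) do
    overrides = Application.get_env(:foundation, :timeout_overrides, %{})
    lookup(overrides, key) || lookup(@timeouts, key) || @default_timeout
  end

  def call_with_timeout(server, request, opts \\ []) do
    timeout = Keyword.get(opts, :timeout) || get_timeout(server)
    GenServer.call(server, request, timeout)
  end

  # Tuples try an exact match first, then the {category, :*} wildcard.
  defp lookup(table, {category, _operation} = key) do
    Map.get(table, key) || Map.get(table, {category, :*})
  end

  # Atoms cover both operation names and modules; modules are also
  # matched against their string name, e.g. "Foundation.ResourceManager".
  defp lookup(table, key) when is_atom(key) do
    Map.get(table, key) || Map.get(table, inspect(key))
  end

  defp lookup(table, key), do: Map.get(table, key)
end
```

With these assumed values, TimeoutConfigSketch.get_timeout({:data_processing, :etl}) resolves through the wildcard entry, and an unknown key falls back to the default.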
Migration Script
Location: scripts/migrate_genserver_timeouts.exs
Features Required:
- Pattern Recognition: Find all GenServer.call/2 without timeouts
- Automatic Migration: Convert to GenServer.call/3 with appropriate timeouts
- Import Addition: Add TimeoutConfig imports where needed
- Report Generation: Summary of changes made
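The pattern-recognition step could be approached by walking the parsed AST rather than matching text. The sketch below is one way to do that under stated assumptions: the module and function names are hypothetical, and only the explicit GenServer.call(server, request) form is detected. Piped calls such as server |> GenServer.call(request) are not handled and would need extra cases.

```elixir
defmodule CallAuditSketch do
  @moduledoc "Illustrative sketch only; not the migration script itself."

  # Returns the line numbers of GenServer.call invocations that are written
  # with exactly two arguments, i.e. without an explicit timeout.
  def find_calls_without_timeout(source) when is_binary(source) do
    ast = Code.string_to_quoted!(source)

    {_ast, lines} =
      Macro.prewalk(ast, [], fn
        {{:., _, [{:__aliases__, _, [:GenServer]}, :call]}, meta, [_server, _request]} = node, acc ->
          {node, [Keyword.get(meta, :line) | acc]}

        node, acc ->
          {node, acc}
      end)

    Enum.reverse(lines)
  end
end

source = """
defmodule Demo do
  def ping(pid), do: GenServer.call(pid, :ping)
  def safe(pid), do: GenServer.call(pid, :ping, 1_000)
end
"""

# Only the two-argument call on line 2 is reported.
IO.inspect(CallAuditSketch.find_calls_without_timeout(source))
```

Working on the AST avoids false positives from comments and strings, which a grep-based search cannot distinguish from real calls.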
Current Violations
Find and fix GenServer.call/2 patterns in these areas:
- Foundation services and infrastructure
- Jido agents and coordination
- MABEAM coordination patterns
- Test files (use appropriate test timeouts)
Test Requirements
Location: test/foundation/timeout_config_test.exs
Test Categories Required:
- Configuration Loading: Service and operation timeouts
- Pattern Matching: Tuple pattern resolution
- Environment Overrides: Runtime configuration changes
- Macro Functionality: call_with_timeout behavior
- Default Handling: Fallback timeout behavior
Implementation Strategy
- Audit Current Usage: Find all GenServer.call/2 instances
- Categorize Operations: Group by service and operation type
- Define Timeout Policies: Appropriate timeouts for each category
- Create Migration Script: Automated conversion tool
- Test Integration: Ensure all changes work correctly
Reference Files
Current GenServer.call patterns to fix:
# Find current violations
grep -r "GenServer.call(" lib/ --include="*.ex" | grep -v ", [0-9]" | grep -v ":timer"
Expected Deliverables
- ✅ lib/foundation/timeout_config.ex - Centralized timeout configuration
- ✅ scripts/migrate_genserver_timeouts.exs - Migration automation
- ✅ test/foundation/timeout_config_test.exs - Comprehensive test suite
- ✅ All GenServer.call/2 converted to GenServer.call/3
- ✅ All tests passing with zero warnings
- ✅ Documentation and usage examples
Success Criteria
- All GenServer.call operations have explicit timeouts
- Consistent timeout policies across the application
- Easy configuration and runtime adjustment
- No hanging processes due to timeout issues
- Comprehensive test coverage
PHASE 5 PROMPT: Comprehensive Testing & Final OTP Compliance Verification
Context & Current State
You are implementing PHASE 5 of the Foundation Jido System OTP hardening project. PHASES 1-4 are COMPLETE:
- ✅ PHASE 1: SupervisedSend infrastructure with delivery monitoring and dead letter queue
- ✅ PHASE 2: All critical raw send() calls migrated to SupervisedSend
- ✅ PHASE 3: MonitorManager system with automatic cleanup and leak detection
- ✅ PHASE 4: Timeout configuration and all GenServer.call/2 migrated to include timeouts
- 🔄 PHASE 5: YOUR TASK - Comprehensive testing and final OTP compliance verification
Required Reading
MUST READ these files to understand context:
- CLAUDE_CODE_worklog.md - Complete implementation history of all phases
- CLAUDECODE.md - Overall plan and success criteria
- AUDIT_02_planSteps_03.md - Original audit findings and requirements
- test/OTP_ASYNC_APPROACHES_20250701.md - Testing methodology and patterns
- JULY_1_2025_PRE_PHASE_2_OTP_report_01_AUDIT_01.md - Original OTP violations audit
Problem Statement
Complete final verification that all OTP violations have been eliminated and the system is production-ready:
- Verify zero raw send() calls to other processes
- Verify zero monitor leaks
- Verify all GenServer calls have timeouts
- Comprehensive load testing
- Integration testing across all components
- Performance regression verification
Implementation Requirements
Integration Test Suite
Location: test/foundation/otp_compliance_test.exs
Test Categories Required:
- End-to-End OTP Compliance: Full system behavior under load
- SupervisedSend Integration: Cross-component message delivery
- MonitorManager Integration: Monitor lifecycle across components
- Timeout Compliance: All timeouts working under stress
- Performance Regression: Verify no significant slowdown
Load Testing Suite
Location: test/load/otp_compliance_load_test.exs
Load Test Scenarios:
- High Message Volume: 1000+ messages/second through SupervisedSend
- Monitor Creation/Cleanup: 100+ concurrent monitor operations
- Timeout Stress: GenServer calls under high load
- Dead Letter Queue: Verify DLQ performance under failure conditions
- Memory Stability: No leaks under sustained load
Audit Verification Script
Location: scripts/otp_final_audit.exs
Verification Checks:
- Raw Send Audit: Confirm zero send(pid, message) to other processes
- Monitor Audit: Confirm all Process.monitor use MonitorManager
- Timeout Audit: Confirm all GenServer.call/2 have been migrated
- Code Quality: Zero warnings, zero Credo violations
- Test Coverage: Comprehensive coverage metrics
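The first two checks above can be approximated with a line-based scan, sketched below. This is a heuristic under stated assumptions, not the audit script: the module name and check names are hypothetical, and the regexes will miss multi-line calls and flag matches inside comments or strings. Sends to self() are skipped because self-sends are allowed.

```elixir
defmodule OtpAuditSketch do
  @moduledoc "Illustrative sketch only; a regex heuristic, not a full audit."

  # Built inside a function so no compiled regex is stored in a module attribute.
  defp checks do
    [
      # A bare send( call whose first argument is not self().
      raw_send: ~r/(?<![\w.])send\((?!\s*self\(\))/,
      # Any direct Process.monitor( call.
      direct_monitor: ~r/Process\.monitor\(/
    ]
  end

  # Returns a list of {check_name, line_number} tuples for every match.
  def scan(source) when is_binary(source) do
    for {line, number} <- Enum.with_index(String.split(source, "\n"), 1),
        {check, regex} <- checks(),
        Regex.match?(regex, line),
        do: {check, number}
  end
end

source = """
send(pid, :hello)
send(self(), :tick)
ref = Process.monitor(pid)
"""

# Line 1 is a raw send, line 2 is an allowed self-send, line 3 a direct monitor.
IO.inspect(OtpAuditSketch.scan(source))
```

A script built on this idea would run scan/1 over each file under lib/ and fail when the result is non-empty.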
Final Verification Requirements
Code Quality Gates
All must pass before completion:
# All tests pass
mix test --cover
# Zero warnings
mix compile --warnings-as-errors
# Clean Dialyzer
mix dialyzer
# Zero Credo violations
mix credo --strict
# Clean format
mix format --check-formatted
Performance Benchmarks
Create benchmarks for:
- SupervisedSend vs Raw Send: Verify acceptable overhead
- MonitorManager vs Direct Monitor: Verify minimal overhead
- Timeout Config vs Hardcoded: Verify no performance loss
- End-to-End Latency: Measure cross-component communication
Integration Tests
Must Test:
- Coordinator → Agent Communication: Using SupervisedSend
- Signal Router → Handler Delivery: With failure handling
- Coordination Patterns Broadcasting: Multi-agent coordination
- Scheduler → Agent Delivery: Scheduled task execution
- Monitor Lifecycle: Creation, cleanup, leak detection
Test File Structure
test/
├── foundation/
│ ├── supervised_send_test.exs # ✅ Already complete
│ ├── dead_letter_queue_test.exs # ✅ Already complete
│ ├── monitor_manager_test.exs # ✅ From Phase 3
│ ├── timeout_config_test.exs # ✅ From Phase 4
│ └── otp_compliance_test.exs # 🔄 Your task
├── integration/
│ ├── cross_component_test.exs # 🔄 Your task
│ └── failure_scenarios_test.exs # 🔄 Your task
├── load/
│ └── otp_compliance_load_test.exs # 🔄 Your task
└── OTP_ASYNC_APPROACHES_20250701.md # ✅ Testing methodology
Reference Implementation Patterns
Based on existing Foundation test patterns:
- Use Test Sync Messages: Follow test/OTP_ASYNC_APPROACHES_20250701.md Approach #1
- Proper Setup/Teardown: Application.put_env for test configuration
- Deterministic Assertions: No timing races or Process.sleep
- Resource Cleanup: Proper test isolation and cleanup
- Comprehensive Coverage: Edge cases, error conditions, concurrency
Success Metrics
Functional Requirements ✅
- Zero raw send() calls to other processes (self-sends allowed)
- Zero monitor leaks under normal and stress conditions
- All GenServer.call operations have explicit timeouts
- All error conditions properly handled
- Comprehensive telemetry and logging
Performance Requirements ✅
- SupervisedSend overhead < 10% vs raw send
- MonitorManager overhead < 5% vs direct monitoring
- No memory leaks under sustained load
- Timeout configuration has zero runtime overhead
- Dead letter queue handles 1000+ failed messages efficiently
Quality Requirements ✅
- 281+ tests passing, 0 failures
- Zero compiler warnings
- Zero Dialyzer errors
- Zero Credo violations
- 95% test coverage on new infrastructure
Expected Deliverables
- ✅ test/foundation/otp_compliance_test.exs - Integration test suite
- ✅ test/integration/cross_component_test.exs - Cross-component integration
- ✅ test/integration/failure_scenarios_test.exs - Failure handling tests
- ✅ test/load/otp_compliance_load_test.exs - Load and stress tests
- ✅ scripts/otp_final_audit.exs - Automated compliance verification
- ✅ Performance benchmarks and reports
- ✅ Final compliance documentation
Final Success Criteria
MISSION COMPLETE when all criteria met:
✅ Zero OTP Violations
- No raw send() to other processes
- No monitor leaks
- No infinite timeouts
✅ Production Ready
- Comprehensive error handling
- Full observability (telemetry, logging)
- Performance acceptable
- Memory stable under load
✅ Test Coverage Complete
- All new infrastructure tested
- Integration scenarios covered
- Load testing completed
- Failure scenarios validated
✅ Code Quality Excellent
- Zero warnings/errors
- Clean Credo compliance
- Full Dialyzer type checking
- Comprehensive documentation
DELIVERABLE: Production-grade OTP-compliant Foundation Jido System ready for deployment.
General Instructions for All Phases
Working Method
- Read Required Files: Understand context before coding
- Follow Existing Patterns: Use Foundation SupervisedSend/DeadLetterQueue as examples
- Test-Driven Development: Write failing tests first
- Proper OTP Patterns: Use sync messages, not Process.sleep
- Update Worklog: Document progress in CLAUDE_CODE_worklog.md
Quality Standards
- All tests must pass before completion
- Zero compiler warnings
- Follow existing Foundation code style
- Comprehensive test coverage
- Proper documentation
Success Verification
Each phase complete when:
- Implementation working correctly
- All tests passing
- No regressions in existing functionality
- Worklog updated with progress
- Todo list updated
Commit Only When All Quality Gates Pass