SLEEP FUNCTIONS REMOVAL REPORT
CRITICAL ISSUE IDENTIFIED
Found 16 Process.sleep() calls in production lib/ code that need immediate removal and proper concurrent implementation.
ROOT CAUSE ANALYSIS
The sleep functions were introduced as LAZY HACKS instead of proper:
- Process synchronization using GenServer calls/casts
- OTP supervision and process linking
- Message passing with receive blocks
- Timer-based scheduling with Process.send_after
- Proper async/await patterns
SLEEP FUNCTIONS TO FIX (ONE AT A TIME)
1. /lib/foundation/service_registry.ex:484
Current Code:
Process.sleep(10)
Problem: Blocking wait in service lifecycle management
Fix Strategy: Use GenServer synchronization with proper state management
Test to Run: mix test test/foundation/service_registry_test.exs
2. /lib/foundation/utils.ex:582
Current Code:
Process.sleep(delay)
Problem: Blocking delay in retry mechanism
Fix Strategy: Use Process.send_after with receive block for non-blocking delays
Test to Run: mix test test/foundation/utils_test.exs
3. /lib/foundation/process_registry.ex:769
Current Code:
Process.sleep(50)
Problem: Race condition hack in process cleanup
Fix Strategy: Use proper process monitoring with Process.monitor and receive
Test to Run: mix test test/foundation/process_registry_test.exs
4. /lib/foundation/process_registry.ex:780
Current Code:
Process.sleep(200)
Problem: Arbitrary wait for process termination
Fix Strategy: Use Process.exit with trap_exit and proper shutdown handling
Test to Run: mix test test/foundation/process_registry_test.exs
5. /lib/foundation/mabeam/load_balancer.ex:292
Current Code:
:test_mode_agent -> spawn(fn -> Process.sleep(:infinity) end)
Problem: Infinite sleep for test agents
Fix Strategy: Use GenServer with proper state management and test hooks
Test to Run: mix test test/foundation/mabeam/load_balancer_test.exs
6. /lib/foundation/infrastructure/pool_workers/http_worker.ex:313
Current Code:
Process.sleep(delay_ms)
Problem: Blocking HTTP retry delays
Fix Strategy: Use :timer.apply_after for non-blocking scheduled retries
Test to Run: mix test test/foundation/infrastructure/pool_workers/http_worker_test.exs
7-11. /lib/foundation/infrastructure/pool_workers/http_worker.ex (Lines 327,340,359,379,384)
Current Code:
Process.sleep(Enum.random(10..100))
Problem: Random blocking delays in HTTP operations
Fix Strategy: Use Process.send_after with random intervals and message handling
Test to Run: mix test test/foundation/infrastructure/pool_workers/http_worker_test.exs
12. /lib/foundation/mabeam/coordination.ex:231
Current Code:
Process.sleep(50)
Problem: Race condition hack in coordination protocol
Fix Strategy: Use proper GenServer synchronization and state machines
Test to Run: mix test test/foundation/mabeam/coordination_test.exs
13. /lib/foundation/mabeam/coordination.ex:4096
Current Code:
Process.sleep(total_delay)
Problem: Blocking delay in coordination timing
Fix Strategy: Use GenServer timer with handle_info/2 for scheduled operations
Test to Run: mix test test/foundation/mabeam/coordination_test.exs
14. /lib/foundation/beam/processes.ex:194
Current Code:
Process.sleep(50)
Problem: Blocking wait in process lifecycle management
Fix Strategy: Use Process.monitor with proper message handling
Test to Run: mix test test/foundation/beam/processes_test.exs
15. /lib/foundation/beam/processes.ex:209
Current Code:
Process.sleep(10)
Problem: Race condition hack in process startup
Fix Strategy: Use GenServer.start_link with proper initialization callbacks
Test to Run: mix test test/foundation/beam/processes_test.exs
IMPLEMENTATION APPROACH (TEST-DRIVEN)
For Each Sleep Function:
- READ the specific file and understand the context
- IDENTIFY why the sleep was used (race condition, timing, retry, etc.)
- IMPLEMENT proper concurrent solution:
- GenServer calls for synchronization
- Process.monitor for process lifecycle
- Process.send_after for scheduling
- receive blocks for message handling
- TEST the specific module immediately after changes
- VERIFY no regressions in related functionality
Proper Concurrency Patterns to Use:
Instead of Process.sleep for retries:
# BAD
Process.sleep(delay)
retry_operation()
# GOOD
Process.send_after(self(), :retry, delay)
receive do
:retry -> retry_operation()
end
Instead of Process.sleep for synchronization:
# BAD
Process.sleep(50) # hope process is ready
# GOOD
GenServer.call(process, :wait_ready, timeout)
Instead of Process.sleep for process cleanup:
# BAD
Process.exit(pid, :kill)
Process.sleep(100) # hope it's dead
# GOOD
ref = Process.monitor(pid)
Process.exit(pid, :shutdown)
receive do
{:DOWN, ^ref, :process, ^pid, _reason} -> :ok
after 5000 ->
Process.exit(pid, :kill)
end
EXPECTED OUTCOMES
- Zero Process.sleep calls in production lib/ code
- Proper OTP-compliant concurrent patterns
- Faster test execution (no arbitrary waits)
- More reliable process coordination
- Better error handling and fault tolerance
NOTES
- The /lib/foundation/utils.ex:399 sleep is in a doctest example and can remain
- All test files will be addressed separately after lib/ fixes are complete
- Each fix must be tested immediately to prevent regressions
- No shortcuts, no placeholders - proper implementation required