# OTP Async Testing Approaches - Performance & Correctness Analysis

**Date**: 2025-07-01
**Context**: Foundation SupervisedSend & DeadLetterQueue infrastructure testing
**Problem**: Testing asynchronous GenServer operations without Process.sleep() anti-patterns
## The Community Problem

Everyone recommends Task.async/await (#4) because OTP async patterns are genuinely hard. But this creates a performance disaster when applied to hot-path infrastructure where microseconds matter.
## Approach Comparison

### 1. Test Synchronization Messages ⭐ BEST FOR HOT PATH
```elixir
# In GenServer
def handle_cast({:retry_messages, filter}, state) do
  # ... do work ...
  if test_mode?(), do: send(test_pid(), :retry_completed)
  {:noreply, state}
end

# In test
GenServer.cast(DeadLetterQueue, {:retry_messages, :all})
assert_receive :retry_completed, 1000
```
- **Performance**: ⭐⭐⭐⭐⭐ EXCELLENT - Zero overhead in production
- **Correctness**: ⭐⭐⭐⭐⭐ PERFECT - Actual completion notification
- **Complexity**: ⭐⭐⭐ MODERATE - Requires test-aware code
- **Maintenance**: ⭐⭐⭐⭐ GOOD - Clean separation of concerns
**Pros**:
- Zero production overhead
- Deterministic completion detection
- No timing races
- Scales to any load

**Cons**:
- Requires test-mode detection in production code (helper sketch below)
- Slight code complexity increase
- Need to remember to add test notifications

**Best For**: Hot-path infrastructure, message queues, routing systems
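The `test_mode?/0` and `test_pid/0` helpers used above are left undefined in this document; a minimal sketch, assuming the test suite registers itself under a hypothetical `:test_notify_pid` application-env key:

```elixir
# Inside the GenServer module. Hypothetical helpers: the test suite runs
# Application.put_env(:foundation, :test_notify_pid, self()) in setup;
# production never sets the key, so the notification send/2 is skipped.
defp test_mode?, do: Application.get_env(:foundation, :test_notify_pid) != nil

defp test_pid, do: Application.get_env(:foundation, :test_notify_pid)
```

The production cost is one application-env lookup per operation; if even that matters, a `:persistent_term` flag or a compile-time switch removes it.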
### 2. Synchronous Test-Only Operations ⭐ BEST FOR SIMPLICITY
```elixir
# Add test-only synchronous version
def retry_messages_sync(filter) do
  GenServer.call(__MODULE__, {:retry_messages_sync, filter})
end

def handle_call({:retry_messages_sync, filter}, _from, state) do
  result = do_retry_messages(filter, state)
  {:reply, result, state}
end

# In test
result = DeadLetterQueue.retry_messages_sync(:all)
assert length(result.successful) > 0
```
- **Performance**: ⭐⭐⭐⭐⭐ EXCELLENT - Zero production overhead
- **Correctness**: ⭐⭐⭐⭐⭐ PERFECT - Synchronous completion guarantee
- **Complexity**: ⭐⭐⭐⭐⭐ SIMPLE - Just add a sync version
- **Maintenance**: ⭐⭐⭐⭐⭐ EXCELLENT - Clean API separation
**Pros**:
- Zero production overhead
- Completely deterministic
- Clean test/prod API separation
- Simple to implement and understand

**Cons**:
- Requires maintaining parallel sync APIs (see the sketch after this list)
- Doesn’t test actual async behavior
- More code to maintain

**Best For**: Infrastructure with clear sync/async boundaries
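The parallel-API burden can be contained by compiling the sync version only for tests; a sketch, assuming the project is compiled with Mix so `Mix.env/0` is available at compile time:

```elixir
# Compile the sync helper only in the test environment, so the parallel
# API never ships in a production build. Mix.env/0 is evaluated when the
# module is compiled, not at runtime.
if Mix.env() == :test do
  def retry_messages_sync(filter) do
    GenServer.call(__MODULE__, {:retry_messages_sync, filter})
  end
end
```

The matching `handle_call` clause can be guarded the same way.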
### 3. State Polling with :sys.get_state/1 ⭐ MOST OTP-NATIVE
```elixir
# In test
GenServer.cast(DeadLetterQueue, {:retry_messages, :all})

# Poll until the queue drains, with a deadline so a stuck queue cannot
# hang the test. Each :sys.get_state/1 call is synchronous, so it also
# waits for messages queued ahead of it to be processed.
deadline = System.monotonic_time(:millisecond) + 1_000

Stream.repeatedly(fn ->
  _state = :sys.get_state(DeadLetterQueue)
  Process.sleep(10) # short poll interval, not a blind fixed sleep
  length(DeadLetterQueue.list_messages())
end)
|> Stream.take_while(fn count ->
  count > 0 and System.monotonic_time(:millisecond) < deadline
end)
|> Stream.run()

assert DeadLetterQueue.list_messages() == []
```
- **Performance**: ⭐⭐⭐ ACCEPTABLE - Polling overhead only in tests
- **Correctness**: ⭐⭐⭐⭐ VERY GOOD - Eventually consistent
- **Complexity**: ⭐⭐ COMPLEX - Requires understanding polling patterns
- **Maintenance**: ⭐⭐⭐ MODERATE - Can be flaky with timing
**Pros**:
- No production code changes
- Tests actual async behavior
- Uses standard OTP tools
- No test mode coupling

**Cons**:
- Polling creates CPU overhead
- Can be flaky on slow systems
- More complex test code (a reusable helper sketch follows)
- Potential for infinite loops

**Best For**: Complex state machines, when you can’t modify production code
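Because hand-rolled polling loops are easy to get wrong (no deadline, busy spinning), the pattern is worth extracting once into shared test support; a minimal sketch with hypothetical names:

```elixir
# Hypothetical test-support module: polls `fun` until it returns a truthy
# value or the deadline passes, replacing scattered ad-hoc sleep loops.
defmodule Foundation.TestWait do
  def wait_until(fun, timeout_ms \\ 1_000, interval_ms \\ 10) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    do_wait(fun, deadline, interval_ms)
  end

  defp do_wait(fun, deadline, interval_ms) do
    cond do
      fun.() -> :ok
      System.monotonic_time(:millisecond) > deadline -> {:error, :timeout}
      true ->
        Process.sleep(interval_ms)
        do_wait(fun, deadline, interval_ms)
    end
  end
end

# Usage in a test:
# assert :ok == Foundation.TestWait.wait_until(fn -> DeadLetterQueue.list_messages() == [] end)
```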
### 4. Task.async/await 💀 COMMUNITY FAVORITE (BUT WRONG FOR HOT PATH)
```elixir
# In test - What everyone recommends
task =
  Task.async(fn ->
    GenServer.cast(DeadLetterQueue, {:retry_messages, :all})
    Process.sleep(50) # Still need sleep!
    DeadLetterQueue.list_messages()
  end)

messages = Task.await(task, 5000)
```
- **Performance**: 💀 TERRIBLE - Spawns processes for every test operation
- **Correctness**: ⭐⭐ POOR - Still relies on sleep timing
- **Complexity**: ⭐⭐⭐⭐ SIMPLE - Easy to understand
- **Maintenance**: ⭐⭐ POOR - Hides the real timing problem
**Pros**:
- Easy to understand
- Community-recommended pattern
- Timeout handling built-in
- No production code changes

**Cons**:
- SPAWNS A TASK FOR EVERY TEST OPERATION 💀
- Still requires Process.sleep internally
- Massive overhead for infrastructure testing
- False sense of solving the async problem
- Resource waste (processes, schedulers)

**Why Everyone Recommends It**: OTP async is hard; Task.async is easy to understand.

**Why It’s Wrong For Infrastructure**: Hot-path code tested with Task.async becomes a performance disaster. If your infrastructure handles 10,000 messages/second, your tests spawn 10,000 tasks.
## Performance Impact Analysis

### Hot Path Infrastructure Characteristics

- **Message Rate**: 1,000-100,000 operations/second
- **Latency Requirements**: Sub-millisecond
- **Test Frequency**: Every change, CI/CD pipelines
- **Memory Constraints**: Must not leak under load
### Task.async Impact on Infrastructure Tests
```elixir
# Bad: Community recommendation for infrastructure
Enum.each(1..10_000, fn _ ->
  task =
    Task.async(fn ->
      SupervisedSend.send_supervised(pid, message)
      Process.sleep(1) # Still needed!
    end)

  Task.await(task)
end)
# Result: 10,000 spawned tasks, massive scheduler overhead
```
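To put a rough number on that overhead, a quick measurement sketch (results are machine-dependent; only the relative gap between a direct call and a Task round-trip matters):

```elixir
# Rough micro-benchmark: direct call vs. Task spawn + await round-trip
# for the same no-op work. :timer.tc/1 returns {microseconds, result}.
{direct_us, _} =
  :timer.tc(fn ->
    Enum.each(1..10_000, fn _ -> :ok end)
  end)

{task_us, _} =
  :timer.tc(fn ->
    Enum.each(1..10_000, fn _ ->
      Task.async(fn -> :ok end) |> Task.await()
    end)
  end)

IO.puts("direct: #{direct_us}µs, task round-trips: #{task_us}µs")
```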
### Proper Approach for Infrastructure
```elixir
# Good: Test synchronization messages
# In test setup: register the test process as the notification target
# (:test_notify_pid is an illustrative key name)
setup do
  Application.put_env(:foundation, :test_notify_pid, self())
  on_exit(fn -> Application.delete_env(:foundation, :test_notify_pid) end)
  :ok
end

# In SupervisedSend
defp maybe_notify_test_completion(result) do
  case Application.get_env(:foundation, :test_notify_pid) do
    nil -> :ok # production: key unset, notification skipped
    test_pid -> send(test_pid, {:send_completed, result})
  end
end

# In test
Enum.each(1..10_000, fn _ ->
  SupervisedSend.send_supervised(pid, message)
  assert_receive {:send_completed, _result}
end)
# Result: Zero additional processes, deterministic completion
# (An ETS completion table works too, but sending to the test process
# keeps the assertion deterministic via assert_receive.)
```
## Real-World Examples

### ❌ Wrong: Phoenix LiveView Testing (Common Mistake)
```elixir
# What many do - spawns a task for every DOM update
test "real-time updates", %{live_view: live_view} do
  Enum.each(1..1000, fn i ->
    task =
      Task.async(fn ->
        send(live_view.pid, {:update, i})
        Process.sleep(10)
        render(live_view) =~ "Item #{i}"
      end)

    assert Task.await(task)
  end)
end
# Performance: Disaster
```
### ✅ Right: Phoenix LiveView Testing
```elixir
# Proper approach - assert on a completion message instead of sleeping.
# Assumes the LiveView sends {:live_view_updated, i} to the test process
# when test-mode notifications (approach #1) are enabled.
test "real-time updates", %{live_view: live_view} do
  Enum.each(1..1000, fn i ->
    send(live_view.pid, {:update, i})
    assert_receive {:live_view_updated, ^i}, 100
  end)
end
# Performance: Excellent, tests actual async behavior
```
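When the LiveView cannot be instrumented with a test notification, a similar guarantee is available without polling: `render/1` in `Phoenix.LiveViewTest` is a synchronous call into the LiveView process, so it returns only after earlier messages have been handled. A sketch, assuming `live_view` is the view struct returned by `live/2`:

```elixir
# Alternative without test-mode hooks: render/1 is a synchronous call,
# so the {:update, i} sent above is guaranteed to have been processed
# before render/1 returns.
test "real-time updates", %{live_view: live_view} do
  Enum.each(1..1000, fn i ->
    send(live_view.pid, {:update, i})
    assert render(live_view) =~ "Item #{i}"
  end)
end
```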
## Recommendations by Use Case

**Hot-Path Infrastructure** (SupervisedSend, Routers, Queues): Use #1 (Test Sync Messages) - zero production overhead, perfect correctness.

**Business Logic** (User Operations, API Endpoints): Use #2 (Sync Test APIs) - clean separation, simple to understand.

**Complex State Machines** (GenStateMachine, Multi-step Workflows): Use #3 (State Polling) - when you can’t modify production code.

**Simple One-off Operations** (File Processing, Email Sending): Use #4 (Task.async) - only when performance doesn’t matter.
## The Community Problem Explained

**Why everyone says “use Task.async”**:
- OTP async patterns ARE genuinely difficult
- Task.async is easy to understand
- It “works” for simple cases
- People don’t measure the performance impact

**Why this hurts infrastructure**:
- Infrastructure tests become slower than the actual code
- CI/CD pipelines waste resources on test overhead
- False confidence in async handling
- Hides real timing issues with Process.sleep
## Conclusion

For Foundation SupervisedSend/DeadLetterQueue: use approach #1 (test synchronization messages) because:

- This is hot-path infrastructure
- Performance is critical
- We control the production code
- Zero overhead is achievable

**The Rule**: If your code processes more than 1,000 operations/second, Task.async in tests is an anti-pattern.
**Author**: Production OTP Infrastructure Analysis
**Review**: Critical path performance requirements
**Next**: Fix DeadLetterQueue test to use proper sync messages