F2 052 RISK ASSESSMENT MITIGATION STRATEGIES gemini

Documentation for F2_052_RISK_ASSESSMENT_MITIGATION_STRATEGIES_gemini from the Foundation repository.

Foundation 2.0: Risk Assessment & Mitigation Strategies

Purpose: To proactively identify and plan for the potential downsides and failure modes of our architectural choices.

Risk Category	Risk Description	Probability	Impact	Mitigation Strategy
Horde	Eventual Consistency Race Conditions: Developer calls `start_singleton` and immediately tries to `lookup_singleton` from another node before the registry has synced.	High	Medium	1. Documentation: Clearly document the “Lifecycle of a Distributed Process” and the sync delay. 2. Facade Helper: Add an optional `wait_for_sync: true` argument to `start_singleton` that polls the registry until the process appears, with a timeout. 3. Encourage Event-Driven Patterns: Guide users to react to `:service_up` events on the PubSub channel rather than polling the registry.
libcluster	Split-Brain Clusters: A network partition causes two or more independent clusters to form, leading to duplicate singletons and inconsistent state.	Medium	High	1. Quorum Configuration: For strategies that support it (like Gossip), Foundation’s “Apprentice Mode” will automatically configure a quorum based on expected node count. `quorum_size = div(expected_nodes, 2) + 1`. 2. HealthMonitor Detection: The `HealthMonitor` will detect when `Node.list()` is smaller than the configured quorum and raise a critical alert. 3. Static Seeds: For production setups, recommend using a static list of seed nodes (e.g., Kubernetes headless service DNS) in addition to dynamic discovery to anchor the cluster.
Leaky Abstraction	*The Abstraction is too* Leaky / Confusing**: Developers don’t understand when to use the facade vs. the raw tool, leading to inconsistent or incorrect usage.	Medium	Medium	1. Crystal-Clear Documentation: The `IMPL_PATTS_BEST_PRACTICES.md` is key. Every facade function docstring must explain what it does, what it wraps, and when to bypass it. 2. Excellent Logging: As shown in the patterns, the facades will log exactly which underlying functions they are calling with which arguments at the `DEBUG` level. This makes tracing behavior trivial. 3. Credo Checks: We can write custom Credo checks that suggest using a `Foundation` facade when raw tool usage is detected for a common pattern.
Performance	Facade Overhead: The convenience wrappers add unacceptable latency or CPU overhead compared to direct tool usage.	Low	Low	1. Stateless Facades: Our facades are designed to be stateless modules, not `GenServer`s, minimizing per-call overhead. 2. Benchmarking: The `Testing Strategy` includes performance benchmarks that will explicitly compare facade calls vs. direct calls to ensure overhead is negligible (<5%).
Configuration	Auto-Detection Failure: The “Mortal Mode” auto-detection picks the wrong strategy for a given environment (e.g., chooses Gossip in a K8s environment where the API is firewalled).	Medium	High	1. Verbose Logging: The `ClusterConfig` module will log every step of its detection logic (`Detected K8s env vars`, `mdns_lite not found`, `Falling back to Gossip`). 2. Clear Override Path: Documentation will prominently feature the “Apprentice Mode” (`config :foundation, cluster: :kubernetes`) as the one-line fix for any auto-detection failure.