Of course. Let’s re-imagine this vision from the ground up, pivoting from the deep, complex integration of Partisan to a more pragmatic, flexible, and ecosystem-friendly approach centered on `libcluster` and `mdns_lite`.
The new philosophy is not about replacing BEAM’s distribution but about mastering its orchestration. We will provide an incredibly simple “on-ramp” for clustering while allowing for deep customization, leveraging the best tools the community has already built.
Here is the revised architectural vision and battle plan.
Foundation 2.0: The Pragmatic Distribution Framework
Executive Summary
Foundation 2.0 will become the definitive framework for building clustered Elixir applications by embracing a layered, pragmatic approach to distribution. We will combine the power and flexibility of `libcluster` with the zero-configuration simplicity of `mdns_lite` to create a framework that is both accessible to mortals and powerful enough for experts.
Our core innovation is not in replacing the BEAM’s transport layer, but in providing an opinionated, batteries-included orchestration layer on top of it. This makes clustering a solved problem for 80% of use cases out of the box, while offering escape hatches for the other 20% to configure any topology imaginable.
Strategic Shift: Composition over Replacement
The previous vision centered on replacing Distributed Erlang with Partisan. This new vision focuses on composing and mastering existing, battle-tested tools:
| Aspect | Partisan-based Vision (Previous) | libcluster-based Vision (New) |
|---|---|---|
| Core Tech | Partisan. Replaces Distributed Erlang transport. | `libcluster`. Orchestrates standard Distributed Erlang. |
| Complexity | High. Requires deep understanding of a new network stack. | Low to high (layered). Simple defaults, powerful overrides. |
| Ecosystem | Niche. Bypasses a large part of the existing ecosystem. | Mainstream. Integrates with and leverages the broad toolset. |
| “Channels” | Built-in at the transport layer. Eliminates HOL blocking. | Simulated at the application layer using `Phoenix.PubSub`. |
| Flexibility | Limited to Partisan’s capabilities. | Nearly infinite via `libcluster`’s pluggable strategies. |
| On-ramp | Steep learning curve. | Gentle slope, from zero-config to full control. |
The Three Layers of Clustering
Foundation will offer three distinct, progressively powerful ways to configure clustering.
- Mortal Mode (Zero-Config): For local development and simple LAN-based deployments (e.g., Nerves).
- Apprentice Mode (Simple-Config): For common cloud deployments (e.g., Kubernetes, Fly.io) using simple, high-level keywords.
- Wizard Mode (Full-Config): For complex or bespoke scenarios, providing a direct pass-through to `libcluster`’s raw configuration.
This layered approach is our key to being both “batteries-included” and “ultimately configurable.”
Foundation 2.0 Unified Architecture (Libcluster Edition)
Configuration Deep Dive: Mortal, Apprentice, Wizard
This is the heart of the user experience.
1. Mortal Mode: Zero-Config Clustering
For the developer who just wants things to work on their local network.
User Configuration (`config/config.exs`):

```elixir
config :my_app, foundation: [
  cluster: true
]
```
Foundation’s Internal Logic (`Foundation.Distributed.Cluster`):
Foundation intelligently selects the best zero-config strategy, as sketched after this list.
- Check if `mdns_lite` is a dependency. If yes, use a custom `MdnsLite` strategy.
- If not, fall back to `Cluster.Strategy.Gossip` on a default port.
- This is perfect for Nerves devices or local developer machines to find each other automatically.
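As a rough illustration, the selection logic might look like this. It is a minimal sketch: the `Cluster.Strategy.MdnsLite` module is the custom strategy we build in Phase 1, while `Code.ensure_loaded?/1`, `Cluster.Strategy.Gossip` (default port 45892), and the `{Cluster.Supervisor, [topologies, opts]}` child-spec shape are standard.

```elixir
# Minimal sketch of zero-config strategy selection. Cluster.Strategy.MdnsLite
# is our hypothetical custom strategy (see Phase 1); Gossip ships with libcluster.
defp zero_config_topology do
  if Code.ensure_loaded?(MdnsLite) do
    [foundation: [strategy: Cluster.Strategy.MdnsLite, config: []]]
  else
    [foundation: [strategy: Cluster.Strategy.Gossip, config: [port: 45892]]]
  end
end

# Started in Foundation's supervision tree as:
#   {Cluster.Supervisor, [zero_config_topology(), [name: Foundation.ClusterSupervisor]]}
```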
2. Apprentice Mode: Simple, Declarative Config
For the developer deploying to a standard environment like Kubernetes.
User Configuration (`config/config.exs`):

```elixir
config :my_app, foundation: [
  cluster: [
    strategy: :kubernetes,
    app_name: "elixir-scope"
  ]
]

# Or for Fly.io / DNS A-record-based clustering
config :my_app, foundation: [
  cluster: [
    strategy: :dns,
    query: "top1.nearest.of.my-app.internal"
  ]
]
```
Foundation’s Internal Logic (`Foundation.Distributed.Cluster`):
Foundation provides a simple schema and translates it into the verbose `libcluster` topology.
```elixir
# Internal translation for the :kubernetes example above
defp translate_config(:kubernetes, opts) do
  app = opts[:app_name]

  [
    strategy: Cluster.Strategy.Kubernetes,
    config: [
      kubernetes_selector: "app=#{app}",
      kubernetes_node_basename: "my_app" # Or derived from the app name
    ]
  ]
end
```
This hides the complexity of `libcluster`’s config keys while exposing the essential knobs.
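The `:dns` translator would follow the same shape. A sketch: `Cluster.Strategy.DNSPoll` and its `:polling_interval`, `:query`, and `:node_basename` config keys come from libcluster, while the `opts` schema and the `"my_app"` fallback are our assumptions.

```elixir
# Sketch of the :dns translator for the Fly.io example above.
defp translate_config(:dns, opts) do
  [
    strategy: Cluster.Strategy.DNSPoll,
    config: [
      polling_interval: 5_000,
      query: opts[:query],
      node_basename: opts[:node_basename] || "my_app"
    ]
  ]
end
```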
3. Wizard Mode: Full `libcluster` Control
For the power user with a complex, multi-topology setup.
User Configuration (`config/config.exs`):

```elixir
config :libcluster,
  topologies: [
    data_plane: [
      strategy: Cluster.Strategy.Gossip,
      config: [port: 45890, if_addr: "0.0.0.0"]
    ],
    control_plane: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [kubernetes_selector: "role=control"]
    ]
  ]
```
Foundation’s Internal Logic (`Foundation.Distributed.Cluster`):
If `config :libcluster, :topologies` is present, Foundation steps back and simply starts `Cluster.Supervisor` with the user-provided config. It offers no magic but provides full power. This also makes migrating an existing `libcluster` app to Foundation trivial.
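The bypass itself can be a one-liner. A sketch (`Application.get_env/2` and `Cluster.Supervisor` are real; the surrounding function names are illustrative):

```elixir
# Wizard Mode bypass: prefer the user's raw libcluster topologies when present.
defp topologies do
  case Application.get_env(:libcluster, :topologies) do
    nil -> build_foundation_topologies() # Mortal/Apprentice translation (hypothetical)
    user_topologies -> user_topologies   # Wizard Mode: pass through untouched
  end
end
```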
Implementation Roadmap
Phase 1: Core Orchestration (Weeks 1-3)
Objective: Build the `libcluster` wrapper and the three-tiered configuration system.
- [ ] `Foundation.Distributed.Cluster`: Create a `GenServer` or `Supervisor` that is responsible for starting `Cluster.Supervisor`.
- [ ] Implement Configuration Logic:
  - Mortal Mode: Detect `mdns_lite` or default to `Gossip`.
  - Apprentice Mode: Build translators for `:kubernetes`, `:dns`, and `:gossip`.
  - Wizard Mode: Add the bypass to use `Application.get_env(:libcluster, :topologies)`.
- [ ] `Cluster.Strategy.MdnsLite` (Custom Strategy, skeleton below):
  - Implement the `Cluster.Strategy` behaviour.
  - In `start_link`, use `MdnsLite.subscribe/1` to listen for new hosts.
  - When a host is discovered, parse the `.local` name and use `Node.connect/1`.
  - Periodically query `MdnsLite.Info.dump_caches()` to find nodes and compare with `Node.list()` to handle disconnects.
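A skeleton for that strategy might look like the following. `Cluster.Strategy`, `Cluster.Strategy.State`, and `Cluster.Strategy.connect_nodes/4` are real libcluster APIs; the discovery step is left abstract because the exact `MdnsLite` calls depend on the `mdns_lite` version in use.

```elixir
# Skeleton of the custom strategy. libcluster starts it with a single
# %Cluster.Strategy.State{} carrying the topology's connect/list_nodes MFAs.
defmodule Cluster.Strategy.MdnsLite do
  use GenServer

  @behaviour Cluster.Strategy

  @poll_interval 5_000

  @impl Cluster.Strategy
  def start_link([%Cluster.Strategy.State{} = state]) do
    GenServer.start_link(__MODULE__, state)
  end

  @impl GenServer
  def init(state) do
    send(self(), :poll)
    {:ok, state}
  end

  @impl GenServer
  def handle_info(:poll, state) do
    # Ask libcluster to connect any newly discovered peers; it uses the
    # connect/list_nodes functions configured on the topology.
    Cluster.Strategy.connect_nodes(
      state.topology,
      state.connect,
      state.list_nodes,
      discover_nodes()
    )

    Process.send_after(self(), :poll, @poll_interval)
    {:noreply, state}
  end

  # Hypothetical: turn mDNS cache entries (e.g. "host.local") into node
  # names such as :"my_app@host.local".
  defp discover_nodes, do: []
end
```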
Phase 2: Application-Layer Channels (Weeks 4-5)
Objective: Mitigate Distributed Erlang’s head-of-line blocking by providing application-level channels using `Phoenix.PubSub`.
- [ ] Integrate `Phoenix.PubSub`: Add it as a dependency and start it within Foundation’s supervision tree. Use the `:pg` adapter for distributed operation.
- [ ] `Foundation.Distributed.PubSub`: Create a wrapper module (sketched after this list).
- [ ] Define Standard Channels:
  - `Foundation.Distributed.PubSub.broadcast(:control, message)`
  - `Foundation.Distributed.PubSub.broadcast(:telemetry, message)`
  - `Foundation.Distributed.PubSub.broadcast(:events, message)`
- [ ] Document the Pattern: Clearly explain that this is an application-level solution to HOL blocking. A large telemetry payload won’t block an urgent control message because they travel on different PubSub topics, even if they share the same underlying TCP connection.
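A sketch of the wrapper, assuming the PubSub server is registered as `Foundation.PubSub` (`Phoenix.PubSub.broadcast/3` and `subscribe/2` are real APIs; in phoenix_pubsub 2.x the default adapter is already built on `:pg`). The `"foundation:<channel>"` topic naming is our choice.

```elixir
# Channel wrapper sketch. Standard channels are :control, :telemetry,
# and :events; services may define additional ad-hoc channels.
defmodule Foundation.Distributed.PubSub do
  @pubsub Foundation.PubSub

  def broadcast(channel, message) when is_atom(channel) do
    Phoenix.PubSub.broadcast(@pubsub, topic(channel), message)
  end

  def subscribe(channel) when is_atom(channel) do
    Phoenix.PubSub.subscribe(@pubsub, topic(channel))
  end

  defp topic(channel), do: "foundation:#{channel}"
end

# Started in the supervision tree as:
#   {Phoenix.PubSub, name: Foundation.PubSub}
```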
Phase 3: Cluster-Aware Core Services (Weeks 6-8)
Objective: Integrate the new distribution layer into existing Foundation services.
- [ ] `Foundation.Config` 2.0:
  - On update, use `Foundation.Distributed.PubSub.broadcast(:config_update, {path, value})`.
  - Subscribing nodes receive the update and apply it locally.
  - For consensus, implement a simple two-phase commit over the `:control` channel.
- [ ] `Foundation.Events` 2.0:
  - `emit_distributed` simply becomes a `PubSub.broadcast` on the `:events` channel.
  - This is simpler and more robust than the Partisan model.
- [ ] `Foundation.ServiceRegistry` 2.0:
  - This is where we get opinionated. Instead of building our own from scratch, we recommend and provide seamless integration for `Horde` (see the sketch after this list).
  - `Foundation.Registry.register(name, module)` will use `Horde.Registry.register` under the hood.
  - This leverages another best-in-class, battle-tested library, fitting our composition philosophy.
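A sketch of the Horde-backed facade (`Horde.Registry.register/3` and `Horde.Registry.lookup/2` are real Horde APIs; `Foundation.HordeRegistry` is our assumed registry name):

```elixir
# Registry facade sketch: Foundation's API delegates to a Horde.Registry
# instance started elsewhere in the supervision tree.
defmodule Foundation.Registry do
  def register(name, value) do
    Horde.Registry.register(Foundation.HordeRegistry, name, value)
  end

  def lookup(name) do
    Horde.Registry.lookup(Foundation.HordeRegistry, name)
  end
end
```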
Phase 4: Intelligence and Polish (Weeks 9-12)
Objective: Add self-management features and prepare for launch.
- [ ] `Foundation.Intelligence.ClusterMonitor` (sketched after this list):
  - A process that subscribes to `libcluster`’s `:connected_nodes` and `:disconnected_nodes` events.
  - Emits `:telemetry` events for cluster topology changes.
  - Can trigger healing logic, like re-balancing work via `Horde`.
- [ ] Documentation Overhaul:
  - Create clear guides for Mortal, Apprentice, and Wizard modes.
  - Write a guide on migrating from `libcluster` to Foundation.
  - Explain the `Phoenix.PubSub` channel pattern and its trade-offs.
- [ ] ElixirScope & DS_Ex Integration:
  - Update the integration examples. Work distribution will now use `Horde.Supervisor` to distribute workers across the cluster.
  - Distributed debugging will use the `:control` PubSub channel to send commands.
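A monitor sketch. Note that it listens for the BEAM’s own node-up/down signals via `:net_kernel.monitor_nodes/2` (a real API) rather than libcluster internals; the telemetry event names are our assumptions.

```elixir
# Cluster monitor sketch: translates nodeup/nodedown messages into
# :telemetry events that healing logic can subscribe to.
defmodule Foundation.Intelligence.ClusterMonitor do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # Receive {:nodeup, node, info} / {:nodedown, node, info} messages.
    :ok = :net_kernel.monitor_nodes(true, node_type: :visible)
    {:ok, %{}}
  end

  @impl true
  def handle_info({:nodeup, node, _info}, state) do
    :telemetry.execute([:foundation, :cluster, :node_up], %{count: 1}, %{node: node})
    {:noreply, state}
  end

  def handle_info({:nodedown, node, _info}, state) do
    :telemetry.execute([:foundation, :cluster, :node_down], %{count: 1}, %{node: node})
    {:noreply, state}
  end
end
```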
Conclusion: The Best of All Worlds
This new vision for Foundation 2.0 is more pragmatic, robust, and aligned with the Elixir ecosystem. By standing on the shoulders of giants like `libcluster`, `Phoenix.PubSub`, and `Horde`, we can focus our efforts on what truly matters: providing a seamless, productive, and powerful developer experience for building distributed applications.
We offer:
- For Mortals: A single line of config to get a working cluster.
- For Apprentices: Simple, intuitive keywords for common deployment targets.
- For Wizards: The full, untamed power of `libcluster` for any scenario imaginable.
This approach is less about revolution and more about masterful synthesis. It’s a framework that respects the existing ecosystem, leverages its strengths, and delivers unparalleled usability. This is the path to making Foundation the definitive choice for distributed Elixir.