NATS.io networking fundamentals in Bacalhau
Starting from the v.1.3.0
to communicate with other nodes on the network Bacalhau uses NATS.io, a powerful open-source messaging system designed to streamline communication across complex network environments.
Our initial NATS integration focuses on simplifying communication between orchestrator and compute nodes. By embedding NATS within orchestrators, we streamline the network. Now, compute nodes need only connect to one or a few orchestrators and dynamically discover others at runtime, dramatically cutting down on configuration complexity.
Easier Setup: Compute nodes no longer need to be directly accessible by orchestrators, removing deployment barriers in diverse environments (on-premises, edge locations, etc.).
Increased Reliability: Network changes are less disruptive, as compute nodes can easily switch between orchestrators if needed.
Future-Proof: This sets the stage for more advanced NATS features like global clusters and multi-orchestrator setups.
The aim of integrating NATS into Bacalhau is to keep user experience with Bacalhau's HTTP APIs and CLIs for job submission and queries consistent. This ensures a smooth transition, allowing you to continue your work without any disruptions.
Start by creating a secure token. This token will be used for authentication between the orchestrator and compute nodes during their communications:
Make sure to securely store this token and share it only with authorized parties in your network.
With the authentication token set, launch your orchestrator node as follows:
This command sets up an orchestrator node with an embedded NATS server, using the given auth token to secure communications. It defaults to port 4222, but you can customize this using the --network-port flag if needed.
Compute nodes can authenticate using one of the following methods, depending on your preferred configuration setup:
authsecret
from the Config:This method assumes the authsecret
is already configured in the Bacalhau settings on the compute node, allowing for a seamless authentication process.
authsecret
Directly in the Orchestrator URI:Here, the authsecret
is directly included in the command line, providing an alternative for instances where it's preferable to specify the token explicitly rather than rely on the configuration file.
Both methods ensure that compute nodes, acting as NATS clients, securely authenticate with the orchestrator node(s), establishing a trusted communication channel within your Bacalhau network.
We're committed to providing a secure and flexible distributed computing environment. Future Bacalhau versions will expand authentication choices, including TLS certificates and JWT, catering to varied security needs and further strengthening network security.
Global Connectivity and Scalability: NATS opens avenues for Bacalhau to operate smoothly across all scales, from local deployments to international networks. Its self-healing capabilities and dynamic mesh networking form the foundation for a future of resilient and flexible distributed computing.
Unlocking New Possibilities: The integration heralds a new era of possibilities for Bacalhau, from global clusters to multiple orchestrator nodes, tackling the complexities of distributed computing with innovation and community collaboration.
The shift to NATS is a step toward making distributed computing more accessible, resilient, and scalable. As we start this new chapter, we're excited to explore the advanced features NATS brings to Bacalhau and welcome our community to join us on this transformative journey.
If you’re interested in learning more about distributed computing and how it can benefit your work, there are several ways to connect with us. Visit our website, sign up to our bi-weekly office hour, join our Slack or send us a message.
By default, Bacalhau jobs do not have any access to the internet. This is to keep both compute providers and users safe from malicious activities.
However, by using data volumes you can read and access your data from within jobs and write back results.
When you submit a Bacalhau job, you'll need to specify the internet locations to download data from and write results to. Both Docker and WebAssembly jobs support these features.
When submitting a Bacalhau job, you can specify the CID (Content IDentifier) or HTTP(S) URL to download data from. The data will be retrieved before the job starts and made available to the job as a directory on the filesystem. When running Bacalhau jobs, you can specify as many CIDs or URLs as needed using --input
which is accepted by both bacalhau docker run
and bacalhau wasm run
. See command line flags for more information.
You can write back results from your Bacalhau jobs to your public storage location. By default, jobs will write results to the storage provider using the --publisher
command line flag. See command line flags on how to configure this.
To use these features, the data to be downloaded has to be known before the job starts. For some workloads, the required data is computed as part of the job if the purpose of the job is to process web results. In these cases, networking may be possible during job execution.
To run Docker jobs on Bacalhau to access the internet, you'll need to specify one of the following:
full: unfiltered networking for any protocol --network=full
http: HTTP(S)-only networking to a specified list of domains --network=http
none: no networking at all, the default --network=none
Specifying none
will still allow Bacalhau to download and upload data before and after the job.
Jobs using http
must specify the domains they want to access when the job is submitted. When the job runs, only HTTP requests to those domains will be possible and data transfer will be rate limited to 10Mbit/sec in either direction to prevent ddos.
Jobs will be provided with http_proxy
and https_proxy
environment variables which contain a TCP address of an HTTP proxy to connect through. Most tools and libraries will use these environment variables by default. If not, they must be used by user code to configure HTTP proxy usage.
The required networking can be specified using the --network
flag. For http
networking, the required domains can be specified using the --domain
flag, multiple times for as many domains as required. Specifying a domain starting with a .
means that all sub-domains will be included. For example, specifying .example.com
will cover some.thing.example.com
as well as example.com
.
Bacalhau jobs are explicitly prevented from starting other Bacalhau jobs, even if a Bacalhau requester node is specified on the HTTP allowlist.
Bacalhau has support for describing jobs that can access the internet during job execution. The ability for compute nodes to run jobs that require internet access depends on what compute nodes are currently part of the network.
Compute nodes that join the Bacalhau network do not accept networked jobs by default (i.e. they only accept jobs that specify --network=none
, which is also the default).
The public compute nodes provided by the Bacalhau network will accept jobs that require HTTP networking as long as the domains are from this allowlist.
If you need to access a domain that isn't on the allowlist, you can make a request to the Bacalhau Project team to include your required domains. You can also set up your own compute node that implements the allowlist you need.