Running Nodes

Node Onboarding

Introduction

This tutorial describes how to add new nodes to an existing private network. Two basic scenarios will be covered:

Adding a machine as a new node.
Adding a cloud instance as a new node.

Prerequisites

You should have an established private network consisting of at least one requester node. See the guide to set one up.
You should have a new host (physical/virtual machine, cloud instance or docker container) with Bacalhau installed.

Add Host/Virtual Machine as a New Node

Let's assume that you already have a private network with at least one requester node. To add a new node to it, you will need to:

Set the token in the node.network.authsecret parameter
Execute bacalhau serve, specifying the node type and orchestrator address via flags. You can find an example of such a command in the logs of the requester node. Here is how it might look:

Remember that in this example you need to replace all 127.0.0.1 and 0.0.0.0 addresses with the actual public IP address of your node.

Add a Cloud Instance as a New Node

To automate the process using Terraform, follow these steps:

Determine the IP address of your requester node
Write a Terraform script that does the following:
1. Adds a new instance
2. Installs bacalhau on it
3. Launches a compute node
Execute the script

Support

GPU Installation

How to enable GPU support on your Bacalhau node

Bacalhau supports GPUs out of the box and defaults to allowing execution on all GPUs installed on the node.

Prerequisites

Bacalhau makes the assumption that you have installed all the necessary drivers and tools on your node host and have appropriately configured them for use by Docker.

In general for GPUs from any vendor, the Bacalhau client requires:

Nvidia

Verify installation by
nvidia-smi installed and functional

AMD

rocm-smi tool installed and functional

See the for guidance on how to run Docker workloads on AMD GPU.

Intel

xpu-smi tool installed and functional
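
As a quick pre-flight check, the vendor tools listed above can be looked up on the PATH before starting a node. The following is a minimal sketch, not part of Bacalhau: the helper names `find_gpu_tools` and `report` are hypothetical, and the check only confirms that each binary is present, not that the driver behind it is functional.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Vendor-to-tool mapping taken from the prerequisites listed above.
GPU_TOOLS = {"Nvidia": "nvidia-smi", "AMD": "rocm-smi", "Intel": "xpu-smi"}


def find_gpu_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Return the resolved path of each vendor tool, or None if it is not on PATH."""
    return {vendor: which(tool) for vendor, tool in GPU_TOOLS.items()}


def report(found: Dict[str, Optional[str]]) -> List[str]:
    """Format one human-readable line per vendor."""
    return [
        f"{vendor}: {path if path else 'not found on PATH'}"
        for vendor, path in found.items()
    ]


if __name__ == "__main__":
    for line in report(find_gpu_tools()):
        print(line)
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.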

GPU Node Configuration

Job selection policy

When running a node, you can choose which jobs you want to run by using configuration options, environment variables or flags to specify a job selection policy.

Job selection probes

If you want more control over making the decision to take on jobs, you can use the --job-selection-probe-exec and --job-selection-probe-http flags.

These are external programs that are passed the following data structure so that they can make a decision about whether or not to take on a job:

The exec probe is a script to run that will be given the job data on stdin, and must exit with status code 0 if the job should be run.
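
To illustrate the exec probe contract (job data on stdin, exit status 0 to accept), here is a minimal sketch of the decision logic. It treats the job data as opaque JSON because the exact structure is not reproduced here; the deny-list `BLOCKED_TERMS` and the function name `probe_exit_code` are hypothetical, and exiting with status 1 to reject is an assumption (the text only states that 0 means the job should be run).

```python
import json

# Hypothetical deny-list; replace it with whatever policy suits your node.
BLOCKED_TERMS = ("cryptominer",)


def probe_exit_code(raw_job: str, blocked=BLOCKED_TERMS) -> int:
    """Return 0 to accept the job, or 1 (assumed rejection status) otherwise."""
    try:
        json.loads(raw_job)  # reject anything that is not valid JSON
    except json.JSONDecodeError:
        return 1
    lowered = raw_job.lower()
    return 1 if any(term in lowered for term in blocked) else 0


# To use this as a probe script, end the file with:
#   import sys
#   sys.exit(probe_exit_code(sys.stdin.read()))
```

Keeping the decision in a pure function makes the policy easy to test without piping data through a process.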

The http probe is a URL to POST the job data to. The job will be rejected if the HTTP request returns an error status code (i.e. >= 400).

For example, the following response will reject the job:

If the HTTP response is not a JSON blob, the content is ignored and any non-error status code will accept the job.

Access Management

How to configure authentication and authorization on your Bacalhau node.

Bacalhau includes a flexible auth system that supports multiple methods of auth that are appropriate for different deployment environments.

By default

With no specific authentication configuration supplied, Bacalhau runs in "anonymous mode" – which allows unidentified users limited control over the system. "Anonymous mode" is only appropriate for testing or evaluation setups.

In anonymous mode, Bacalhau will allow:

Users identified by a self-generated private key to submit any job and cancel their own jobs.
Users not identified by any key to access other read-only endpoints, such as to read job lists, describe jobs, and query node or agent information.

Restricting anonymous access

Bacalhau auth is controlled by policies. Configuring the auth system is done by supplying a different policy file.

Restricting API access to only users that have authenticated requires specifying a new authorization policy. You can download a policy that restricts anonymous access and install it by using:

Once the node is restarted, accessing the node APIs will require the user to be authenticated, but by default will still allow users with a self-generated key to authenticate themselves.

Restricting the list of keys that can authenticate to only a known set requires specifying a new authentication policy. You can download a policy that restricts key-based access and install it by using:

Then, modify the allowed_clients variable in challenge_ns_no_anon.rego to include acceptable client IDs, which can be found by running bacalhau agent node.

Once the node is restarted, only keys in the allowed list will be able to access any API.

Username and password access

Users can authenticate using a username and password instead of specifying a private key for access. Again, this requires installation of an appropriate policy on the server.

Passwords are not stored in plaintext and are salted. The downloaded policy expects password hashes and salts generated by scrypt. To generate a salted password, the helper script in pkg/authn/ask/gen_password can be used:

This will ask for a password and generate a salt and hash to authenticate with it. Add the encoded username, salt and hash into the ask_ns_password.rego.
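
The salt-and-hash idea can be sketched with the scrypt implementation in Python's standard library. This is an illustration only and is not a substitute for the gen_password helper: the cost parameters in `SCRYPT_PARAMS`, the 16-byte salt length, and the function names are assumptions, and the encoding expected by the downloaded policy is not reproduced here.

```python
import hashlib
import hmac
import os

# Assumed cost parameters; the downloaded policy may expect different values.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32, "maxmem": 64 * 1024 * 1024}


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Because the salt is random, hashing the same password twice yields different digests, which is why the salt must be stored alongside the hash.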

Writing custom policies

In principle, Bacalhau can implement any auth scheme that can be described in a structured way by a policy file.

Custom authentication policies

Bacalhau will pass information pertinent to the current request into every authentication policy query as a field on the input variable. The exact information depends on the type of authentication used.

`challenge` authentication

challenge authentication identifies the user by the presence of a private key. The user is asked to sign an input phrase to prove they have the key they are identifying with.

Policies used for challenge authentication do not need to actually implement the challenge verification logic as this is handled by the core code. Instead, they will only be invoked if this verification passes.

Policies for this type will need to implement these rules:

bacalhau.authn.token: if the user should be authenticated, an access token they should use in subsequent requests. If the user should not be authenticated, this rule should be undefined.

They should expect as fields on the input variable:

clientId: an ID derived from the user's private key that identifies them uniquely
nodeId: the ID of the requester node that this user is authenticating with
signingKey: the private key (as a JWK) that should be used to sign any access tokens to be returned

The simplest possible policy might therefore be this policy that returns the same opaque token for all users:

`ask` authentication

ask authentication uses credentials supplied manually by the user as identification. For example, an ask policy could require a username and password as input and check these against a known list. ask policies do all the verification of the supplied credentials.

Policies for this type will need to implement these rules:

bacalhau.authn.token: if the user should be authenticated, an access token they should use in subsequent requests. If the user should not be authenticated, this rule should be undefined.
bacalhau.authn.schema: a static JSON schema that should be used to collect information about the user. The type of declared fields may be used to pick the input method, and if a field is marked as writeOnly then it will be collected in a secure way (e.g. not shown on screen). The schema rule does not receive any input data.

They should expect as fields on the input variable:

ask: a map of field names from the JSON schema to strings supplied by the user. The policy should validate these credentials.
nodeId: the ID of the requester node that this user is authenticating with
signingKey: the private key (as a JWK) that should be used to sign any access tokens to be returned

The simplest possible policy might therefore be one that asks for no data and returns the same opaque token for every user:

Custom authorization policies

Authorization policies do not vary depending on the type of authentication used – Bacalhau uses one authz policy for all API requests.

Authz policies are invoked for every API request. Authz policies should check the validity of any supplied access tokens and issue an authz decision for the requested API endpoint. It is not required that authz policies enforce that an access token is present; they may choose to grant access to unauthenticated users.

Policies will need to implement these rules:

bacalhau.authz.token_valid: true if the access token in the request is "valid" (but does not necessarily grant access for this request), or false if it is invalid for every request (e.g. because it has expired) and should be discarded.
bacalhau.authz.allow: true if the user should be permitted to carry out the input request, false otherwise.

They should expect as fields on the input variable for both rules:

http: details of the user's HTTP request:
- host: the hostname used in the HTTP request
- method: the HTTP method (e.g. GET, POST)
- path: the path requested, as an array of path components without slashes
- query: a map of URL query parameters to their values
- headers: a map of HTTP header names to arrays representing their values
- body: a blob of any content submitted as the body
constraints: details about the receiving node that should be used to validate any supplied tokens:
- cert: keys that the input token should have been signed with
- iss: the name of a node that this node will recognize as the issuer of any signed tokens
- aud: the name of this node that is receiving the request

Notably, the constraints data is appropriate to be passed directly to the Rego io.jwt.decode_verify method which will validate the access token as a JWT against the given constraints.
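
To make the shape of the input variable concrete, the sketch below assembles an equivalent structure from a request URL. It is an illustration, not Bacalhau's code: `build_authz_input` is a hypothetical helper, the example URL in the usage note is illustrative, and representing query values as lists and `host` without the port are assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def build_authz_input(method, url, headers=None, body="", constraints=None):
    """Build a dict mirroring the fields described above for authz policies."""
    parts = urlsplit(url)
    return {
        "http": {
            "host": parts.hostname,  # assumption: hostname without the port
            "method": method,
            # Path as an array of components without slashes.
            "path": [segment for segment in parts.path.split("/") if segment],
            # Assumption: each query parameter maps to a list of values.
            "query": parse_qs(parts.query),
            # Each header name maps to an array of values.
            "headers": {name: list(values) for name, values in (headers or {}).items()},
            "body": body,
        },
        # cert, iss and aud would be supplied by the receiving node.
        "constraints": constraints or {},
    }
```

For example, `build_authz_input("GET", "http://localhost:1234/api/v1/jobs?limit=10")` yields a path of `["api", "v1", "jobs"]`, which is the form a policy would match against.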

The simplest possible authz policy might be this one that allows all users to access all endpoints:

Node persistence

How to configure compute/requester persistence

Both compute nodes and requester nodes maintain state. How that state is maintained is configurable, although the defaults are likely adequate for most use cases. This page describes how to configure the persistence of compute and requester nodes should the defaults not be suitable.

Compute node persistence

Compute nodes maintain information about the work that has been allocated to them, including:

The current state of the execution, and
The original job that resulted in this allocation

This information is used by the compute and requester nodes to ensure allocated jobs are completed successfully. By default, compute nodes store their state in a BoltDB database located in the bacalhau repository along with configuration data. For a compute node whose ID is "abc", the database can be found in ~/.bacalhau/abc-compute/executions.db.

In some cases, it may be preferable to maintain the state in memory, with the caveat that should the node restart, all state will be lost. This can be configured using the environment variables in the table below.

Environment Variable | Flag alternative | Value | Effect

Requester node persistence

A requester node maintains state about the jobs it has been requested to orchestrate and schedule, the evaluation of those jobs, and the executions that have been allocated. By default, this state is stored in a BoltDB database that, for a node with an ID of "xyz", can be found in ~/.bacalhau/xyz-requester/jobs.db.

Environment Variable | Flag alternative | Value | Effect
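
The default database locations described in this section follow a simple naming pattern, sketched below. The helper `state_db_paths` is hypothetical and only reproduces the two paths given in the text (`<node ID>-compute/executions.db` and `<node ID>-requester/jobs.db`); it does not account for any persistence settings that override the defaults.

```python
from pathlib import Path


def state_db_paths(node_id, repo=None):
    """Return the default state database paths for a node with the given ID."""
    repo = Path.home() / ".bacalhau" if repo is None else Path(repo)
    return {
        "compute": repo / f"{node_id}-compute" / "executions.db",
        "requester": repo / f"{node_id}-requester" / "jobs.db",
    }
```

Calling `state_db_paths("abc")` returns the compute path `~/.bacalhau/abc-compute/executions.db` with the home directory expanded.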

Connect Storage

Bacalhau has two ways to make use of external storage providers: Sources and Publishers. Sources are storage resources consumed as inputs to jobs, and Publishers are storage resources created from the results of jobs.

Sources

Bacalhau allows you to use S3 or any S3-compatible storage service as an input source. Users can specify files or entire prefixes stored in S3 buckets to be fetched and mounted directly into the job execution environment. This capability ensures that your jobs have immediate access to the necessary data. See the S3 source documentation for more details.

To use the S3 source, you will have to specify the mandatory name of the S3 bucket and, optionally, the parameters Key, Filter, Region, Endpoint, VersionID and ChecksumSHA256.
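
The mandatory and optional parameters can be checked before a job description is written. The sketch below is a hypothetical validation helper, not part of Bacalhau; it assumes the bucket parameter is named `Bucket` (as it is for the S3 publisher) and uses the optional parameter names listed above.

```python
# Optional S3 source parameters as listed in the text.
OPTIONAL_S3_KEYS = {"Key", "Filter", "Region", "Endpoint", "VersionID", "ChecksumSHA256"}


def s3_source_params(bucket, **optional):
    """Return a parameter map for an S3 source, rejecting unknown parameters."""
    if not bucket:
        raise ValueError("the S3 bucket name is mandatory")
    unknown = set(optional) - OPTIONAL_S3_KEYS
    if unknown:
        raise ValueError(f"unsupported S3 parameters: {sorted(unknown)}")
    # Assumption: the bucket parameter is named "Bucket".
    return {"Bucket": bucket, **optional}
```

A call such as `s3_source_params("my-bucket", Key="data/", Region="us-east-1")` returns a map that can be serialized into the parameters of a job's input source.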

Below is an example of how to define an S3 input source in YAML format:

To start, you'll need to connect the Bacalhau node to an IPFS server so that you can run jobs that consume CIDs as inputs. You can either install IPFS and run it locally, or you can connect to a remote IPFS server.

In both cases, you should have a multiaddress for the IPFS server that should look something like this:

The multiaddress above is just an example - you'll need to get the multiaddress of the IPFS server you want to connect to.

You can then configure your Bacalhau node to use this IPFS server by passing the --ipfs-connect argument to the serve command:

Or, set the Node.IPFS.Connect property in the Bacalhau configuration file. See the Configuration Management section for more details.

Below is an example of how to define an IPFS input source in YAML format:

To use a local data source, you will have to:

Enable the use of local data when configuring the node itself by using the --allow-listed-local-paths flag for bacalhau serve, specifying the file path and access mode. For example:

In the job description, specify the parameters SourcePath (the absolute path on the compute node where your data is located) and ReadWrite (the access mode).

Below is an example of how to define a Local input source in YAML format:

To use a URL data source, you only have to specify the URL parameter, as in the part of the declarative job description below:

Publishers

Bacalhau's S3 Publisher provides users with a secure and efficient method to publish job results to any S3-compatible storage service. To use an S3 publisher, you will have to specify the required parameters Bucket and Key and, optionally, the parameters Region, Endpoint, VersionID and ChecksumSHA256. See the S3 publisher documentation for more details.

Here’s an example of the part of the declarative job description that outlines the process of using the S3 Publisher with Bacalhau:

The IPFS publisher works using the same setup as the IPFS source: you'll need to have an IPFS server running and a multiaddress for it. Then you'll pass that multiaddress using the --ipfs-connect argument to the serve command. If you are publishing to a public IPFS node, you can use bacalhau job get with no further arguments to download the results. However, you may experience a delay in results becoming available, as indexing of new data by public nodes takes time.

To use the IPFS publisher you will have to specify CID which can be used to access the published content. See the for more details.

To speed up the download or to retrieve results from a private IPFS node, pass the swarm multiaddress to bacalhau job get to download results.

Pass the swarm key to bacalhau job get if the IPFS swarm is a private swarm.

And part of the declarative job description with an IPFS publisher will look like this:

The Local Publisher should not be used in production, as it is not a reliable storage option. For production use, we recommend a more reliable option such as an S3-compatible storage service.

Here is an example of part of the declarative job description with a local publisher:

Configuration Management

How to configure your Bacalhau node.

Bacalhau employs the viper and cobra libraries for configuration management. Users can configure their Bacalhau node through a combination of command-line flags, environment variables, and the dedicated configuration file.

The Bacalhau Repo

Bacalhau manages its configuration, metadata, and internal state within a specialized repository named .bacalhau. Serving as the heart of the Bacalhau node, this repository holds the data and settings that determine node behavior. It's located on the filesystem, and by default, Bacalhau initializes this repository at $HOME/.bacalhau, where $HOME is the home directory of the user running the bacalhau process.

To customize this location, users can:

Set the BACALHAU_DIR environment variable to specify their desired path.
Utilize the --repo command line flag to specify their desired path.

Upon executing a Bacalhau command for the first time, the system will initialize the .bacalhau repository. If such a repository already exists, Bacalhau will seamlessly access its contents.
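
The way the repository location is chosen can be sketched as a small lookup. This is an illustration rather than Bacalhau's implementation: `resolve_repo_path` is a hypothetical helper, and giving the --repo flag priority over BACALHAU_DIR is an assumption based on the general rule that command-line flags supersede environment variables.

```python
import os
from pathlib import Path


def resolve_repo_path(repo_flag=None, environ=None):
    """Pick the repo path: --repo flag, then BACALHAU_DIR, then $HOME/.bacalhau."""
    environ = os.environ if environ is None else environ
    if repo_flag:
        return Path(repo_flag)
    if environ.get("BACALHAU_DIR"):
        return Path(environ["BACALHAU_DIR"])
    return Path(environ.get("HOME", str(Path.home()))) / ".bacalhau"
```

Passing `environ` explicitly makes the function easy to exercise without modifying the real process environment.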

Structure of a Newly Initialized .bacalhau Repository

Below is the structure of a freshly initialized `.bacalhau` repository:

$ tree ~/.bacalhau
├── QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-compute/
│   ├── executions.db
│   └── jobStats.json
├── QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-requester/
│   └── jobs.db
├── config.yaml
├── executor_storages/
├── libp2p_private_key
├── plugins/
├── repo.version
└── user_id.pem

This repository comprises four directories and seven files:

Files

user_id.pem:
- This file houses the Bacalhau node user's cryptographic private key, used for signing requests sent to a Requester Node.
- Format: PEM.
repo.version:
- Indicates the version of the Bacalhau node's repository.
- Format: JSON, e.g., {"Version":1}.
libp2p_private_key:
- Stores the Bacalhau node's libp2p private key, essential for its network identity. The NodeID of a Bacalhau node is derived from this key.
- Format: Base64 encoded RSA private key.
config.yaml:
- Contains configuration settings for the Bacalhau node.
- Format: YAML.
update.json:
- A file containing the date/time when the last version check was made.
- Format: JSON, e.g., {"LastCheck":"2024-01-24T11:06:14.631816Z"}
tokens.json:
- A file containing the tokens obtained through authenticating with bacalhau clusters.

Directories

QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-compute:
- Contains the BoltDB executions.db database, which aids the Compute node in state persistence. Additionally, the jobStats.json file records the Compute Node's completed jobs tally.
- Note: The segment QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv is a unique NodeID for each Bacalhau node, derived from the libp2p_private_key.
QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-requester:
- Contains the BoltDB jobs.db database for the Requester node's state persistence.
- Note: NodeID derivation is similar to the Compute directory.
executor_storages:
- Storage for data handled by Bacalhau storage drivers.
plugins:
- Houses binaries that allow the Compute node to execute specific tasks.
- Note: This feature is currently experimental and isn't active during standard node operations.

Configuring a Bacalhau Node

Within a .bacalhau repository, a config.yaml file may be present. This file serves as the configuration source for the bacalhau node and adheres to the YAML format.

Although the config.yaml file is optional, its presence allows Bacalhau to load custom configurations; otherwise, Bacalhau is configured with built-in default values, environment variables and command line flags.

Modifications to the config.yaml file will not be dynamically loaded by the Bacalhau node. A restart of the node is required for any changes to take effect. Bacalhau determines its configuration based on the following precedence order, with each item superseding the subsequent:

Command-line Flag
Environment Variable
Config File
Defaults
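
The precedence order above amounts to returning the first source that defines a key. A minimal sketch, with the hypothetical helper `resolve_setting` and each source modelled as a plain dictionary keyed by configuration key:

```python
def resolve_setting(key, flags, env, config_file, defaults):
    """Return the value from the highest-precedence source that defines the key."""
    # Order matters: flags > environment variables > config file > defaults.
    for source in (flags, env, config_file, defaults):
        if key in source:
            return source[key]
    raise KeyError(key)
```

For instance, a value present in both the config file and the environment resolves to the environment's value, and a flag overrides both.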

Relationship Between `config.yaml` and Bacalhau Environment Variables

Bacalhau establishes a direct relationship between the value-bearing keys within the config.yaml file and corresponding environment variables. For keys that have no further sub-keys, the environment variable name is constructed by converting each segment of the key to upper case, joining the segments with underscores, and prefixing the result with BACALHAU_.

For example, a YAML key with the path Node.IPFS.Connect translates to the environment variable BACALHAU_NODE_IPFS_CONNECT and is represented in a file like:

Node:
    IPFS:
        Connect: value

There is no corresponding environment variable for either Node or Node.IPFS. Config values may also have other environment variables that set them for simplicity or to maintain backwards compatibility.
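
The naming rule can be expressed in a few lines. The sketch below uses two hypothetical helpers: `env_var_name` applies the rule to a dotted key path, and `leaf_env_vars` walks a nested configuration and emits names only for value-bearing keys, which is why intermediate keys such as Node or Node.IPFS produce no variable. It does not cover the additional, alternative environment variables mentioned above.

```python
def env_var_name(key_path):
    """Map a dotted config key such as Node.IPFS.Connect to its environment variable."""
    return "BACALHAU_" + "_".join(segment.upper() for segment in key_path.split("."))


def leaf_env_vars(config, prefix=()):
    """Return {environment variable name: value} for every leaf key in a nested dict."""
    result = {}
    for key, value in config.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Intermediate keys have no environment variable of their own.
            result.update(leaf_env_vars(value, path))
        else:
            result[env_var_name(".".join(path))] = value
    return result
```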

Environments

Bacalhau leverages the BACALHAU_ENVIRONMENT environment variable to determine the specific environment configuration when initializing a repository. Notably, if a .bacalhau repository has already been initialized, the BACALHAU_ENVIRONMENT setting will be ignored.

By default, if the BACALHAU_ENVIRONMENT variable is not explicitly set by the user, Bacalhau will adopt the production environment settings.

Below is a breakdown of the configurations associated with each environment:

1. Production (public network)
- Environment Variable: BACALHAU_ENVIRONMENT=production
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.production.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
2. Staging (staging network)
- Environment Variable: BACALHAU_ENVIRONMENT=staging
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.staging.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
3. Development (development network)
- Environment Variable: BACALHAU_ENVIRONMENT=development
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.development.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
4. Local (private or local networks)
- Environment Variable: BACALHAU_ENVIRONMENT=local
- Configurations:
  - Node.ClientAPI.Host: "0.0.0.0"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...

Note: The configurations listed above for each environment are not exhaustive. Consult the specific environment documentation for a comprehensive list of configurations.

Usage Examples

How to initialize a Bacalhau Server for a local private network

$ env BACALHAU_ENVIRONMENT=local ./bin/darwin_arm64/bacalhau serve
INF pkg/repo/fs.go:187 > Initializing repo at '/Users/frrist/.bacalhau' for environment 'local'

How to initialize a Bacalhau Server with a custom repo path

$ bacalhau --repo=/path/to/repo serve
INF pkg/repo/fs.go:187 > Initializing repo at '/path/to/repo' for environment 'production'

$ export BACALHAU_DIR=/path/to/repo
$ bacalhau serve
INF pkg/repo/fs.go:187 > Initializing repo at '/path/to/repo' for environment 'production'

How to start a Bacalhau Server with DEBUG logs

$ env LOG_LEVEL=debug ./bin/darwin_arm64/bacalhau serve
DBG pkg/system/environment.go:53 > Defaulting to production environment: os.Args: [./bin/darwin_arm64/bacalhau serve]

Configuring Transport Layer Security

How to configure TLS for the requester node APIs

By default, the requester node APIs used by the Bacalhau CLI are accessible over HTTP, but they can be configured to use Transport Layer Security (TLS) so that they are accessible over HTTPS instead. There are several ways to obtain the necessary certificates and keys: Bacalhau supports obtaining them via ACME, from Certificate Authorities, or by self-signing them.

Once configured, you must use https://IP:PORT instead of http://IP:PORT to access the Bacalhau API.

Getting a certificate from Let's Encrypt with ACME

Automatic Certificate Management Environment (ACME) is a protocol that allows for automating the deployment of Public Key Infrastructure, and is the protocol used to obtain a free certificate from the Let's Encrypt Certificate Authority.

Using the --autocert [hostname] parameter to the CLI (in the serve and devstack commands), a certificate is obtained automatically from Let's Encrypt. The provided hostname should be a comma-separated list of hostnames, all of which must be publicly resolvable because Let's Encrypt will attempt to connect to the server to verify ownership (using an ACME challenge). The very first request can take a short time while the first certificate is issued; afterwards, certificates are cached in the bacalhau repository.

Alternatively, you may set this option via the BACALHAU_AUTO_TLS environment variable. If you are using a configuration file, you can set the value in Node.ServerAPI.TLS.AutoCert instead.

Because of the Let's Encrypt verification step, the server must be able to handle requests on port 443. This typically requires elevated privileges. Rather than obtaining these through a privileged account (such as root), you should instead use setcap to grant the executable the right to bind to ports below 1024.

A cache of ACME data is held in the config repository, by default ~/.bacalhau/autocert-cache, and this will be used to manage renewals to avoid rate limits.

Getting a certificate from a Certificate Authority

Obtaining a TLS certificate from a Certificate Authority (CA) without using the Automatic Certificate Management Environment (ACME) protocol involves a manual process that typically requires the following steps:

Choose a Certificate Authority: First, you need to select a trusted Certificate Authority that issues TLS certificates. Popular CAs include DigiCert, GlobalSign, Comodo (now Sectigo), and others. You may also consider whether you want a free or paid certificate, as CAs offer different pricing models.
Generate a Certificate Signing Request (CSR): A CSR is a text file containing information about your organization and the domain for which you need the certificate. You can generate a CSR using various tools or directly on your web server. Typically, this involves providing details such as your organization's name, common name (your domain name), location, and other relevant information.
Submit the CSR: Access your chosen CA's website and locate their certificate issuance or order page. You'll typically find an option to "Submit CSR" or a similar option. Paste the contents of your CSR into the provided text box.
Verify Domain Ownership: The CA will usually require you to verify that you own the domain for which you're requesting the certificate. They may send an email to one of the standard domain-related email addresses (e.g., admin@yourdomain.com, webmaster@yourdomain.com). Follow the instructions in the email to confirm domain ownership.
Complete Additional Verification: Depending on the CA's policies and the type of certificate you're requesting (e.g., Extended Validation or EV certificates), you may need to provide additional documentation to verify your organization's identity. This can include legal documents or phone calls from the CA to confirm your request.
Payment and Processing: If you're obtaining a paid certificate, you'll need to make the payment at this stage. Once the CA has received your payment and completed the verification process, they will issue the TLS certificate.

Once you have obtained your certificates, you will need to put two files in a location where bacalhau can read them. You need the server certificate, often called something like server.cert or server.cert.pem, and the server key, which is often called something like server.key or server.key.pem.

Once you have these two files available, you must start bacalhau serve with two new flags. These are the tlscert and tlskey flags, whose arguments should point to the relevant files. An example of how they are used is:

Alternatively, you may set these options via the environment variables BACALHAU_TLS_CERT and BACALHAU_TLS_KEY. If you are using a configuration file, you can set the values in Node.ServerAPI.TLS.ServerCertificate and Node.ServerAPI.TLS.ServerKey instead.

Self-signed certificates

Once you have generated the necessary files, the steps are much like those above: you must start bacalhau serve with two new flags. These are the tlscert and tlskey flags, whose arguments should point to the relevant files. An example of how they are used is:

Alternatively, you may set these options via the environment variables BACALHAU_TLS_CERT and BACALHAU_TLS_KEY. If you are using a configuration file, you can set the values in Node.ServerAPI.TLS.ServerCertificate and Node.ServerAPI.TLS.ServerKey instead.

If you use self-signed certificates, it is unlikely that any clients will be able to verify the certificate when connecting to the Bacalhau APIs. There are three options available to work around this problem:

Provide a CA certificate file of trusted certificate authorities, which many software libraries support in addition to system authorities.
Install the CA certificate file in the system keychain of each machine that needs access to the Bacalhau APIs.
Instruct the software library you are using not to verify HTTPS requests.
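
For a client written in Python, the first and third options can be sketched with the standard `ssl` module. This is an illustration under stated assumptions: `client_tls_context` is a hypothetical helper, the CA file path is whatever you generated for your node, and disabling verification removes protection against impersonation, so it is only suitable for testing.

```python
import ssl


def client_tls_context(ca_file=None, verify=True):
    """Build a client-side TLS context for talking to a Bacalhau API over HTTPS."""
    # Start from the system's trusted authorities.
    context = ssl.create_default_context()
    if ca_file:
        # Option 1: trust an extra CA certificate file in addition to system authorities.
        context.load_verify_locations(cafile=ca_file)
    if not verify:
        # Option 3: skip verification entirely (testing only).
        # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The resulting context can be passed to standard-library clients, for example as the `context` argument of `urllib.request.urlopen`.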

Limits and Timeouts

Resource Limits

These are the flags that control the capacity of the Bacalhau node, and the limits for jobs that might be run.

The --limit-total-* flags control the total system resources you want to give to the network. If left blank, the system will attempt to detect these values automatically.

The --limit-job-* flags control the maximum amount of resources a single job can consume for it to be selected for execution.
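
The interaction between the two groups of limits can be sketched as a selection check. This is a simplified model, not Bacalhau's scheduler: the function `job_fits` and the resource names used in the usage note are hypothetical, and subtracting resources already in use from the total is an assumption about how total capacity is consumed.

```python
def job_fits(requested, job_limit, total_limit, in_use):
    """Return True if a job's requested resources satisfy both kinds of limit."""
    unlimited = float("inf")
    for name, amount in requested.items():
        # Per-job limit: the most a single job may ask for.
        if amount > job_limit.get(name, unlimited):
            return False
        # Total limit: what the node offers overall, minus what is already allocated.
        available = total_limit.get(name, unlimited) - in_use.get(name, 0)
        if amount > available:
            return False
    return True
```

With a total of 8 CPUs, a per-job limit of 4 and 6 CPUs already in use, a job asking for 2 CPUs fits, while jobs asking for 3 or 5 do not.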

Resource limits are not supported for Docker jobs running on Windows. Resource limits will be applied at the job bid stage based on reported job requirements but will be silently unenforced. Jobs will be able to access as many resources as requested at runtime.

Windows Support

Running a Windows-based node is not officially supported, so your mileage may vary. Some features (like resource limits) are not present in Windows-based nodes.

Bacalhau currently makes the assumption that all containers are Linux-based. Users of the Docker executor will need to manually ensure that their Docker engine is running and configured to support Linux containers, e.g. using the WSL-based backend.

Timeouts

Bacalhau can limit the total time a job spends executing. A job that spends too long executing will be cancelled, and no results will be published.

By default, a Bacalhau node does not enforce any limit on job execution time. Both node operators and job submitters can supply a maximum execution time limit. If a job submitter asks for a longer execution time than permitted by a node operator, their job will be rejected.

Configuring Execution Time Limits

Job submitters can pass the --timeout flag to any Bacalhau job submission CLI to set a maximum job execution time. The supplied value should be a whole number of seconds with no unit.

The timeout can also be added to an existing job spec by adding the Timeout property to the Spec.

Node operators can pass the --max-job-execution-timeout flag to bacalhau serve to configure the maximum job time limit. The supplied value should be a numeric value followed by a time unit (one of s for seconds, m for minutes or h for hours).
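
The two value formats and the rejection rule can be sketched as follows. The helpers `parse_operator_limit` and `accepts_job` are hypothetical, and the parser is a simplification that handles only a single number followed by one of the units s, m or h.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_operator_limit(value):
    """Convert an operator limit such as '30m' into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smh])", value.strip())
    if not match:
        raise ValueError(f"expected a number followed by s, m or h, got {value!r}")
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def accepts_job(requested_seconds, operator_limit=None):
    """Return True if the submitter's timeout (whole seconds) is within the operator's limit."""
    if operator_limit is None:
        # By default a node does not enforce any execution time limit.
        return True
    return requested_seconds <= parse_operator_limit(operator_limit)
```

A job submitted with a timeout of 600 is accepted by a node limited to 30m, whereas a job asking for 7200 is rejected by a node limited to 1h.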

Node operators can also use configuration properties to configure execution limits.

Compute nodes will use the properties:

Requester nodes will use the properties:

Test Network Locally

Before you join the main Bacalhau network, you can test locally.

To test, you can use the bacalhau devstack command, which offers a way to get a 3-node cluster running locally.

By setting PREDICTABLE_API_PORT=1, the first node of our 3-node cluster will always listen on port 20000.

In another window, export the following environment variables so that the Bacalhau client binary connects to our local development cluster:

You can now interact with Bacalhau; all jobs are run by the local devstack cluster.

Bacalhau WebUI

How to run the WebUI.

Overview

The Bacalhau WebUI offers an intuitive interface for interacting with the Bacalhau network. This guide provides comprehensive instructions for setting up, deploying, and utilizing the WebUI.

For contributing to the WebUI's development, please refer to the Bacalhau WebUI GitHub Repository.

Spinning Up the WebUI Locally

Prerequisites

Ensure you have Bacalhau v1.1.7 or later installed.

Running the WebUI

To launch the WebUI locally, execute the following command:

bacalhau serve --node-type=requester,compute --web-ui

This command initializes a requester and compute node, configured to listen on HOST=0.0.0.0 and PORT=1234.

Accessing the Local WebUI

Once started, the WebUI is accessible at http://127.0.0.1/. This local instance allows you to interact with your local Bacalhau network setup.

Accessing the WebUI from the Browser

For observational purposes, a development version of the WebUI is available at bootstrap.development.bacalhau.org. This instance displays jobs from the development server.

N.b. The development version of the WebUI is for observation only and may not reflect the latest changes or features available in the local setup.

Configuration Management

How to configure your Bacalhau node.

The Bacalhau Repo

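Bacalhau keeps its configuration and related data in a repository. By default, the repository is located at ~/.bacalhau. To customize this location, users can: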

Set the BACALHAU_DIR environment variable to specify their desired path.
Utilize the --repo command line flag to specify their desired path.

Upon executing a Bacalhau command for the first time, the system will initialize the .bacalhau repository. If such a repository already exists, Bacalhau will seamlessly access its contents.

Structure of a Newly Initialized .bacalhau Repository

Below is the structure of a freshly initialized `.bacalhau` repository:

$ tree ~/.bacalhau
├── QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-compute/
│   ├── executions.db
│   └── jobStats.json
├── QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-requester/
│   └── jobs.db
├── config.yaml
├── executor_storages/
├── libp2p_private_key
├── plugins/
├── repo.version
└── user_id.pem

This repository comprises four directories and seven files:

Files

user_id.pem:
- This file houses the Bacalhau node user's cryptographic private key, used for signing requests sent to a Requester Node.
- Format: PEM.
repo.version:
- Indicates the version of the Bacalhau node's repository.
- Format: JSON, e.g., {"Version":1}.
libp2p_private_key:
- Stores the Bacalhau node's libp2p private key, essential for its network identity. The NodeID of a Bacalhau node is derived from this key.
- Format: Base64 encoded RSA private key.
config.yaml:
- Contains configuration settings for the Bacalhau node.
- Format: YAML.
update.json:
- A file containing the date/time when the last version check was made.
- Format: JSON, e.g., {"LastCheck":"2024-01-24T11:06:14.631816Z"}
tokens.json:
- A file containing the tokens obtained through authenticating with bacalhau clusters.

Directories

QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-compute:
- Contains the BoltDB executions.db database, which aids the Compute node in state persistence. Additionally, the jobStats.json file records the Compute Node's completed jobs tally.
- Note: The segment QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv is a unique NodeID for each Bacalhau node, derived from the libp2p_private_key.
QmdGUjsMHEgtAfdtw7U62yPEcAZFtA33tKMsczLToegZtv-requester:
- Contains the BoltDB jobs.db database for the Requester node's state persistence.
- Note: NodeID derivation is similar to the Compute directory.
executor_storages:
- Storage for data handled by Bacalhau storage drivers.
plugins:
- Houses binaries that allow the Compute node to execute specific tasks.
- Note: This feature is currently experimental and isn't active during standard node operations.

Configuring a Bacalhau Node

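Within a .bacalhau repository, a config.yaml file may be present. This file serves as the configuration source for the Bacalhau node and adheres to the YAML format.

Bacalhau determines its configuration from the following sources, in order of precedence, with each item superseding the next: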

Command-line Flag
Environment Variable
Config File
Defaults
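Reading the four sources above as ordered from highest to lowest precedence, the lookup can be pictured as follows. The function name, the sample key, and the sample values are hypothetical and serve only to illustrate the ordering; this is a sketch, not Bacalhau's own resolution logic.

```python
from typing import Any, Mapping, Optional


def resolve_setting(
    key: str,
    flags: Mapping[str, Any],
    env: Mapping[str, Any],
    config_file: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Optional[Any]:
    """Return the value from the first source that defines the key."""
    for source in (flags, env, config_file, defaults):
        if key in source:
            return source[key]
    return None


# Illustrative values only: the config file overrides the default,
# and a flag (if given) would override both.
defaults = {"Node.IPFS.Connect": ""}
config_file = {"Node.IPFS.Connect": "/ip4/127.0.0.1/tcp/5001"}
print(resolve_setting("Node.IPFS.Connect", {}, {}, config_file, defaults))
```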

Relationship Between `config.yaml` and Bacalhau Environment Variables

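Each value-bearing key in config.yaml corresponds to an environment variable: the segments of the key path are upper-cased, joined with underscores, and prefixed with BACALHAU_. For example, a YAML key with the path Node.IPFS.Connect translates to the environment variable BACALHAU_NODE_IPFS_CONNECT and is represented in a file like: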

Node:
    IPFS:
        Connect: value
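The naming pattern in this example can be expressed as a small helper. The rule is generalized here from the single example above, so treat it as an assumption rather than a specification; config_key_to_env_var is a hypothetical name.

```python
def config_key_to_env_var(key_path: str) -> str:
    """Map a dotted config key such as 'Node.IPFS.Connect' to an env var name."""
    segments = key_path.split(".")
    # Upper-case every segment, join with underscores, add the BACALHAU_ prefix.
    return "BACALHAU_" + "_".join(segment.upper() for segment in segments)


print(config_key_to_env_var("Node.IPFS.Connect"))  # BACALHAU_NODE_IPFS_CONNECT
```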

Environments

Bacalhau leverages the BACALHAU_ENVIRONMENT environment variable to determine the specific environment configuration when initializing a repository. Notably, if a .bacalhau repository has already been initialized, the BACALHAU_ENVIRONMENT setting will be ignored.
By default, if the BACALHAU_ENVIRONMENT variable is not explicitly set by the user, Bacalhau will adopt the production environment settings.
Below is a breakdown of the configurations associated with each environment:
1. Production (public network)
- Environment Variable: BACALHAU_ENVIRONMENT=production
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.production.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
2. Staging (staging network)
- Environment Variable: BACALHAU_ENVIRONMENT=staging
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.staging.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
3. Development (development network)
- Environment Variable: BACALHAU_ENVIRONMENT=development
- Configurations:
  - Node.ClientAPI.Host: "bootstrap.development.bacalhau.org"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
4. Local (private or local networks)
- Environment Variable: BACALHAU_ENVIRONMENT=local
- Configurations:
  - Node.ClientAPI.Host: "0.0.0.0"
  - Node.ClientAPI.Port: 1234
  - ...other configurations specific to this environment...
Note: The above configurations provided for each environment are not exhaustive. Consult the specific environment documentation for a comprehensive list of configurations.
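The selection rules described in this section (the variable is honored only when a repository is first initialized, and production is used when it is unset) can be summarized in a short sketch. The function name is hypothetical, the host table simply restates the values listed above, and unknown environment names are not handled.

```python
from typing import Optional

# Client API hosts per environment, as listed above.
ENVIRONMENT_HOSTS = {
    "production": "bootstrap.production.bacalhau.org",
    "staging": "bootstrap.staging.bacalhau.org",
    "development": "bootstrap.development.bacalhau.org",
    "local": "0.0.0.0",
}


def environment_for_new_repo(env_value: Optional[str], repo_exists: bool) -> Optional[str]:
    """Return the environment applied at repo initialization, or None if ignored."""
    if repo_exists:
        # An already initialized repository keeps its settings.
        return None
    return env_value or "production"
```

For instance, environment_for_new_repo(None, False) yields "production", while environment_for_new_repo("local", True) yields None because the existing repository takes priority.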

Usage Examples

How to initialize a Bacalhau Server for a local private network

$ env BACALHAU_ENVIRONMENT=local ./bin/darwin_arm64/bacalhau serve
INF pkg/repo/fs.go:187 > Initializing repo at '/Users/frrist/.bacalhau' for environment 'local'

How to initialize a Bacalhau Server with a custom repo path

$ bacalhau --repo=/path/to/repo serve
INF pkg/repo/fs.go:187 > Initializing repo at '/path/to/repo' for environment 'production'

$ export BACALHAU_DIR=/path/to/repo
$ bacalhau serve
INF pkg/repo/fs.go:187 > Initializing repo at '/path/to/repo' for environment 'production'

How to start a Bacalhau Server with DEBUG logs

$ env LOG_LEVEL=debug ./bin/darwin_arm64/bacalhau serve
DBG pkg/system/environment.go:53 > Defaulting to production environment: os.Args: [./bin/darwin_arm64/bacalhau serve]

Access Management

How to configure authentication and authorization on your Bacalhau node.

Access Management

Bacalhau includes a flexible auth system that supports multiple methods of auth that are appropriate for different deployment environments.

By default

By default, Bacalhau runs in anonymous mode. In anonymous mode, Bacalhau will allow:

Users identified by a self-generated private key to submit any job and cancel their own jobs.
Users not identified by any key to access other read-only endpoints, such as to read job lists, describe jobs, and query node or agent information.

Restricting anonymous access

Bacalhau auth is controlled by policies. Configuring the auth system is done by supplying a different policy file.

Restricting API access to only users that have authenticated requires specifying a new authorization policy. You can download a policy that restricts anonymous access and install it by using:

Once the node is restarted, accessing the node APIs will require the user to be authenticated, but by default will still allow users with a self-generated key to authenticate themselves.

Then, modify the allowed_clients variable in challenge_ns_no_anon.rego to include acceptable client IDs, found by running bacalhau agent node.

bacalhau agent node | jq -rc .ClientID

Once the node is restarted, only keys in the allowed list will be able to access any API.

Username and password access

Users can authenticate using a username and password instead of specifying a private key for access. Again, this requires installation of an appropriate policy on the server.

curl -sL https://raw.githubusercontent.com/bacalhau-project/bacalhau/main/pkg/authn/ask/ask_ns_password.rego -o ~/.bacalhau/ask_ns_password.rego
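bacalhau config set Auth.Methods '{Method: Password, Policy: {Type: ask, PolicyPath: ~/.bacalhau/ask_ns_password.rego}}'

To generate a salted password hash, run the helper in pkg/authn/ask/gen_password from a checkout of the Bacalhau source repository:

cd pkg/authn/ask/gen_password && go run .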

This will ask for a password and generate a salt and hash to authenticate with it. Add the encoded username, salt and hash into the ask_ns_password.rego.

Writing custom policies

In principle, Bacalhau can implement any auth scheme that can be described in a structured way by a policy file.

Policies are written in a language called Rego, also used by Kubernetes. Users who want to write their own policies should get familiar with the Rego language.

Custom authentication policies

`challenge` authentication

challenge authentication identifies the user by the presence of a private key. The user is asked to sign an input phrase to prove they have the key they are identifying with.

Policies for this type will need to implement these rules:

bacalhau.authn.token: if the user should be authenticated, an access token they should use in subsequent requests. If the user should not be authenticated, should be undefined.

They should expect as fields on the input variable:

clientId: an ID derived from the user's private key that identifies them uniquely
nodeId: the ID of the requester node that this user is authenticating with
signingKey: the private key (as a JWK) that should be used to sign any access tokens to be returned

The simplest possible policy might therefore be this policy that returns the same opaque token for all users:

package bacalhau.authn

token := "anything"

A more realistic example that returns a signed JWT is in .

`ask` authentication

Policies for this type will need to implement these rules:

bacalhau.authn.token: if the user should be authenticated, an access token they should use in subsequent requests. If the user should not be authenticated, should be undefined.
bacalhau.authn.schema: a static JSON schema that should be used to collect information about the user. The type of declared fields may be used to pick the input method, and if a field is marked as writeOnly then it will be collected in a secure way (e.g. not shown on screen). The schema rule does not receive any input data.

They should expect as fields on the input variable:

ask: a map of field names from the JSON schema to strings supplied by the user. The policy should validate these credentials.
nodeId: the ID of the requester node that this user is authenticating with
signingKey: the private key (as a JWK) that should be used to sign any access tokens to be returned
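To make the schema rule more concrete, the sketch below shows what a schema asking for a username and a password might look like, written as a Python dictionary. The field names username and password and the helper secure_fields are hypothetical; only the use of the standard JSON Schema writeOnly keyword to mark securely collected fields comes from the description above.

```python
import json

# Hypothetical schema: a username shown on screen and a password collected securely.
ASK_SCHEMA = {
    "type": "object",
    "properties": {
        "username": {"type": "string"},
        "password": {"type": "string", "writeOnly": True},
    },
    "required": ["username", "password"],
}


def secure_fields(schema: dict) -> list:
    """Return the names of fields marked writeOnly, in declaration order."""
    properties = schema.get("properties", {})
    return [name for name, spec in properties.items() if spec.get("writeOnly")]


print(json.dumps(ASK_SCHEMA, indent=2))
print(secure_fields(ASK_SCHEMA))  # ['password']
```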

The simplest possible policy might therefore be one that asks for no data and returns the same opaque token for every user:

package bacalhau.authn

schema := {}
token := "anything"

A more realistic example that returns a signed JWT is in .

Custom authorization policies

Authorization policies do not vary depending on the type of authentication used – Bacalhau uses one authz policy for all API requests.

Policies will need to implement these rules:

bacalhau.authz.token_valid: true if the access token in the request is "valid" (but does not necessarily grant access for this request), or false if it is invalid for every request (e.g. because it has expired) and should be discarded.
bacalhau.authz.allow: true if the user should be permitted to carry out the input request, false otherwise.

They should expect as fields on the input variable for both rules:

http: details of the user's HTTP request:
- host: the hostname used in the HTTP request
- method: the HTTP method (e.g. GET, POST)
- path: the path requested, as an array of path components without slashes
- query: a map of URL query parameters to their values
- headers: a map of HTTP header names to arrays representing their values
- body: a blob of any content submitted as the body
constraints: details about the receiving node that should be used to validate any supplied tokens:
- cert: keys that the input token should have been signed with
- iss: the name of a node that this node will recognize as the issuer of any signed tokens
- aud: the name of this node that is receiving the request

Notably, the constraints data is appropriate to be passed directly to the Rego io.jwt.decode_verify method which will validate the access token as a JWT against the given constraints.
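The shape of the input document can be pictured with an example written as a Python dictionary. The field names follow the list above; every value, the exact typing of each field, and the toy_allow decision are illustrative assumptions and do not reproduce Bacalhau's policy.

```python
from typing import Any, Dict

# Example input document; every value here is made up for illustration.
example_input: Dict[str, Any] = {
    "http": {
        "host": "requester.example.com",
        "method": "GET",
        "path": ["api", "v1", "orchestrator", "jobs"],
        "query": {"limit": "10"},
        "headers": {"Authorization": ["Bearer <token>"]},
        "body": "",
    },
    "constraints": {
        "cert": ["<signing key>"],
        "iss": "node-a",
        "aud": "node-a",
    },
}

READ_ONLY_METHODS = {"GET", "HEAD"}


def toy_allow(policy_input: Dict[str, Any]) -> bool:
    """Toy decision: permit read-only HTTP methods and refuse everything else."""
    return policy_input["http"]["method"] in READ_ONLY_METHODS
```

A real policy would express the same kind of check in Rego and would also validate the access token against the constraints.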

The simplest possible authz policy might be this one that allows all users to access all endpoints:

package bacalhau.authz

allow := true
token_valid := true

A more realistic example (which is the Bacalhau "anonymous mode" default) is in .

Private IPFS Network Setup

Set up private IPFS network

Note that Bacalhau v1.4.0 currently supports IPFS v0.27 and below. Support for later versions of IPFS will be added in future releases.

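Introduction

Support for the embedded IPFS node was discontinued in v1.4.0 to streamline communication and reduce overhead. To use a private IPFS network, you therefore need to create it yourself and then connect your Bacalhau nodes to it. This manual describes how to: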

Install and configure IPFS
Create Private IPFS network
Configure your Bacalhau network to use the private IPFS network
Pin your data to private IPFS network

TL;DR

Install Go on all nodes
Install Kubo (IPFS)
Initialize Private IPFS network
Connect all nodes to the same private network
Connect Bacalhau network to use private IPFS network

Download and Install

This manual uses Kubo (the earliest and most widely used implementation of IPFS), so first of all, Go should be installed.

See the Go downloads page for the latest Go version.

wget https://go.dev/dl/go1.23.0.linux-amd64.tar.gz

Remove any previous Go installation by deleting the /usr/local/go folder (if it exists), then extract the archive you downloaded into /usr/local, creating a fresh Go tree in /usr/local/go:

rm -rf /usr/local/go && tar -C /usr/local -xzf go1.23.0.linux-amd64.tar.gz

Add /usr/local/go/bin to the PATH environment variable. You can do this by adding the following line to your $HOME/.profile or /etc/profile (for a system-wide installation):

export PATH=$PATH:/usr/local/go/bin

Verify that Go is installed correctly by checking its version:

go version

The next step is to download and install Kubo. Select the appropriate build for your system and use a version supported by your Bacalhau release (v0.27 and below for Bacalhau v1.4.0).

wget https://dist.ipfs.tech/kubo/v0.27.0/kubo_v0.27.0_linux-amd64.tar.gz
tar -xvzf kubo_v0.27.0_linux-amd64.tar.gz
sudo bash kubo/install.sh

Verify that IPFS is installed correctly by checking its version:

ipfs --version

Configure Bootstrap IPFS Node

A bootstrap node is used by client nodes to connect to the private IPFS network. The bootstrap connects clients to other nodes available on the network.

Execute the ipfs init command to initialize an IPFS node:

ipfs init

# example output

generating ED25519 keypair...done
peer identity: 12D3KooWQqr8BLHDUaZvYG59KnrfYJ1PbbzCq3pzfpQ6QrKP5yz7
initializing IPFS node at /home/username/.ipfs

The next step is to generate the swarm key, a cryptographic key that is used to control access to an IPFS network, and export the key into a swarm.key file located in the ~/.ipfs folder.

echo -e "/key/swarm/psk/1.0.0/\n/base16/" > ~/.ipfs/swarm.key
ipfs key gen swarmkey >> ~/.ipfs/swarm.key

# example swarm.key content:

/key/swarm/psk/1.0.0/
/base16/
k51qzi5uqu5dli3yce3powa8pme8yc2mcwc3gpfwh7hzkzrvp5c6l0um99kiw2
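For reference, the libp2p pre-shared key format (version 1.0.0) consists of the two header lines shown above followed by the key itself; with the /base16/ encoding, the key is expected to be 32 random bytes written as 64 hexadecimal characters. If the daemon rejects a key file, a body in that shape can be produced as sketched below. The helper make_swarm_key is hypothetical, and writing the result to ~/.ipfs/swarm.key is left to the reader.

```python
import secrets


def make_swarm_key() -> str:
    """Build the text of a swarm.key file with a fresh 32-byte key in hex."""
    key_hex = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    return f"/key/swarm/psk/1.0.0/\n/base16/\n{key_hex}\n"


print(make_swarm_key())
```

The same file must be copied to every node that should join the private network.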

Now the default entries of bootstrap nodes should be removed. Execute the command on all nodes:

ipfs bootstrap rm --all

Check that bootstrap config does not contain default values:

ipfs config show | grep Bootstrap

# expected output:

  "Bootstrap": null,

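Configure IPFS to listen for incoming connections on specific network addresses and ports, making the IPFS Gateway and API services accessible. Consider changing addresses and ports depending on the specifics of your network.

ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080

ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001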

Start the IPFS daemon:

ipfs daemon

Configure Client Nodes

Copy the swarm.key file from the bootstrap node to client nodes into the ~/.ipfs/ folder and initialize IPFS:

ipfs init

Apply the same configuration as on the bootstrap node and start the daemon:

ipfs bootstrap rm --all

ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080

ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001

ipfs daemon

Done! Now you can check that the private IPFS network works properly:

List peers on the bootstrap node. It should list all connected nodes:

ipfs swarm peers

# example output for single connected node

/ip4/10.0.2.15/tcp/4001/p2p/12D3KooWQqr8BLHDUaZvYG59KnrfYJ1PbbzCq3pzfpQ6QrKP5yz7

Pin some files and check their availability across the network:

# Create a sample text file and pin it
echo "Hello from the private IPFS network!" > sample.txt

# Pin file:
ipfs add sample.txt

# example output:

added QmWQeYip3JuwhDFmkDkx9mXG3p83a3zMFfiMfhjS2Zvnms sample.txt
 25 B / 25 B [=========================================] 100.00%

# Retrieve and display the content of a pinned file
# Execute this on any node of your private network
ipfs cat QmWQeYip3JuwhDFmkDkx9mXG3p83a3zMFfiMfhjS2Zvnms

# expected output:

Hello from the private IPFS network!

Configure the IPFS Daemon as `systemd` Service

Finally, make the IPFS daemon run at system startup. To do this:

Create a new service unit file in /etc/systemd/system/:

sudo nano /etc/systemd/system/ipfs.service

Add the following content to the file, replacing /path/to/your/ipfs/executable with the actual path:

[Unit]
Description=IPFS Daemon
After=network.target

[Service]
User=username
ExecStart=/path/to/your/ipfs/executable daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target

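Use the which ipfs command to locate the executable. Usually, the path to the executable is /usr/local/bin/ipfs.

For security purposes, consider creating a separate user to run the service. In this case, specify its name in the User= line. If no user is specified, the ipfs service is launched as root, which means that you will need to copy the ipfs binary to the /root directory.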

Reload and enable the service

sudo systemctl daemon-reload
sudo systemctl enable ipfs

Done! Now reboot the machine to ensure that the daemon starts correctly. Use the systemctl status ipfs command to check that the service is running:

sudo systemctl status ipfs

#example output

● ipfs.service - IPFS Daemon
     Loaded: loaded (/etc/systemd/system/ipfs.service; enabled; preset: enabled)
     Active: active (running) since Wed 2024-09-10 13:24:09 CEST; 16min ago

Configure Bacalhau Nodes

Now to connect your private Bacalhau network to the private IPFS network, the IPFS API address should be specified using the --ipfs-connect flag. It can be found in the ~/.ipfs/api file:

# add any other flags your node needs to this command
bacalhau serve \
--ipfs-connect /ip4/0.0.0.0/tcp/5001
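The value stored in ~/.ipfs/api is a multiaddr. As a minimal sketch, the helper below splits an address of the form /ip4/<host>/tcp/<port> into its host and port, which can be handy when checking that the API endpoint is reachable. The function name is hypothetical and other multiaddr forms (for example /dns4/ or /ip6/) are deliberately not handled.

```python
from typing import Tuple


def parse_ip4_tcp_multiaddr(addr: str) -> Tuple[str, int]:
    """Split a multiaddr of the form /ip4/<host>/tcp/<port> into host and port."""
    parts = addr.strip().split("/")
    # A leading slash yields an empty first element: ['', 'ip4', host, 'tcp', port]
    if len(parts) != 5 or parts[0] != "" or parts[1] != "ip4" or parts[3] != "tcp":
        raise ValueError(f"unsupported multiaddr: {addr!r}")
    return parts[2], int(parts[4])


print(parse_ip4_tcp_multiaddr("/ip4/0.0.0.0/tcp/5001"))  # ('0.0.0.0', 5001)
```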

Done! Now your private Bacalhau network is connected to the private IPFS network!

Test Configured Networks

To verify that everything works correctly:

Pin the file to the private IPFS network
Run the job, which takes the pinned file as input and publishes result to the private IPFS network
View and download job results

Create and Pin Sample File

Create any file and pin it. Use the ipfs add command:

# create file
echo "Hello from private IPFS network!" > file.txt

# pin the file
ipfs add file.txt

# example output:

added QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ file.txt
 33 B / 33 B

Run a Bacalhau Job

Run a simple job, which fetches the pinned file via its CID, lists its content and publishes results back into the private IPFS network:

bacalhau docker run \
-i ipfs://QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ \
--publisher ipfs \
alpine cat inputs

# example output

Job successfully submitted. Job ID: j-0402f760-70e3-404a-99a9-c87e200f9dde
Checking job status... (Enter Ctrl+C to exit at any time, your job will continue running):

	Communicating with the network  ................  done ✅  0.0s
	   Creating job for submission  ................  done ✅  0.5s
	               Job in progress  ................  done ✅  1.0s

To get more details about the run, execute:
	bacalhau job describe j-0402f760-70e3-404a-99a9-c87e200f9dde

To get more details about the run executions, execute:
	bacalhau job executions j-0402f760-70e3-404a-99a9-c87e200f9dde

To download the results, execute:
	bacalhau job get j-0402f760-70e3-404a-99a9-c87e200f9dde

View and Download Job Results

Use the bacalhau job describe command to view job execution results:

bacalhau job describe j-0402f760-70e3-404a-99a9-c87e200f9dde

# example output (was truncated for brevity)

...
Standard Output
Hello from private IPFS network!

Use the bacalhau job get command to download job results. In this particular case, the ipfs publisher was used, so the get command prints the CID of the job results:

bacalhau job get j-0402f760-70e3-404a-99a9-c87e200f9dde

# example output

Fetching results of job 'j-0402f760-70e3-404a-99a9-c87e200f9dde'...
No supported downloader found for the published results. You will have to download the results differently.
[
    {
        "Type": "ipfs",
        "Params": {
            "CID": "QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj"
        }
    }
]

Use the ipfs ls command to view the results:

ipfs ls QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj

# example output

QmS6mcrMTFsZnT3wAptqEb8NpBPnv1H6WwZBMzEjT8SSDv 1  exitCode
QmbFMke1KXqnYyBBWxB74N4c5SBnJMVAiMNRcGu6x1AwQH 0  stderr
QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ 33 stdout

Use the ipfs cat command to view the file content. In our case, the file of interest is the stdout:

ipfs cat QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ

# example output

Hello from private IPFS network!

Use the ipfs get command to download the file using its CID:

ipfs get --output stdout QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ
Saving file(s) to stdout
 33 B / 33 B [===============================================] 100.00% 0s

Need Support?

For questions and feedback, please reach out in our
