Well done on deploying your Bacalhau cluster! Now that the deployment is finished, this document will help with the next steps. It explains the outputs from the deployment, how to set up and connect a Bacalhau client, and how to authorize and connect a Bacalhau compute node to the cluster. In short, it gives you everything needed to start using your Bacalhau setup.
After completing the deployment, several outputs will be presented. Below is a description of each output and instructions on how to configure your Bacalhau node using them.
Description: The IP address of the Requester node for the deployment and the endpoint where the Bacalhau API is served.
Usage: Configure the Bacalhau Client to connect to this IP address in the following ways:
Setting the --api-host CLI flag
Setting the BACALHAU_API_HOST environment variable
Modifying the Bacalhau configuration file
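For illustration, assuming the Requester IP from the deployment output is 203.0.113.10 (a placeholder), the three options look something like this (the exact config key name may differ between Bacalhau versions):

```shell
# 1. CLI flag
bacalhau --api-host 203.0.113.10 node list

# 2. Environment variable
export BACALHAU_API_HOST=203.0.113.10
bacalhau node list

# 3. Configuration file
bacalhau config set API.Host 203.0.113.10
```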
Description: The token used to authorize a client when accessing the Bacalhau API.
Usage: The Bacalhau client prompts for this token when a command is first issued to the Bacalhau API. For example:
Description: The token used to authorize a Bacalhau Compute node to connect to the Requester Node.
Usage: A Bacalhau Compute node can be connected to the Requester Node using the following command:
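As a sketch (flag and config key names vary between Bacalhau versions, so check bacalhau serve --help on yours), the connect command typically looks like:

```shell
# Placeholders: use the Requester IP and compute token from your deployment outputs
bacalhau serve \
  --compute \
  -c Compute.Orchestrators=nats://203.0.113.10:4222 \
  -c Compute.Auth.Token="<compute-auth-token>"
```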
Welcome to the guide for setting up your own Bacalhau cluster across multiple Google Cloud Platform (GCP) regions! This guide will walk you through creating a robust, distributed compute cluster that's perfect for running your Bacalhau workloads.
Think of this as building your own distributed supercomputer! Your cluster will provision compute nodes spread across different GCP regions for global coverage.
You'll need a few things ready:
Terraform (version 1.0.0 or newer)
A running Bacalhau orchestrator node
Google Cloud SDK installed and set up
An active GCP billing account
Your organization ID handy
An SSH key pair for securely accessing your nodes
Make sure you are logged in with GCP. This could involve both of the following commands:
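Typically that means logging in both interactively and for application-default credentials (which Terraform uses):

```shell
# Log in for interactive gcloud use
gcloud auth login

# Log in for application-default credentials, used by Terraform
gcloud auth application-default login
```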
Clone the examples repo to your machine and go into the GCP directory.
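Assuming the examples live in the bacalhau-project/examples repository (the exact subdirectory name may differ):

```shell
git clone https://github.com/bacalhau-project/examples.git
cd examples/<gcp-terraform-example-directory>   # replace with the actual directory name
```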
Now, make a copy of the example environment file:
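Assuming the example file is named env.json.example:

```shell
cp env.json.example env.json
```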
Open up env.json and fill in your GCP details (more on this below!)
Update your config/config.yaml with your orchestrator information. Specifically, these lines:
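The relevant lines look something like the sketch below. The placeholders (orchestrator address and token) come from your orchestrator, and the exact key names may vary slightly by Bacalhau version:

```yaml
Compute:
  Orchestrators:
    - nats://<orchestrator-ip>:4222
  Auth:
    Token: "<your-network-token>"
```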
Let Terraform get everything ready:
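That is the standard init step:

```shell
terraform init
```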
Launch your cluster:
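Review the plan and confirm when prompted:

```shell
terraform apply
```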
The entire process takes about 8 minutes, but should end with something like the below:
You're good to go!
The env.json file is where all the magic happens. Here's what you'll need to fill in:
bootstrap_project_id: Your existing GCP project (just used for setup)
base_project_name: What you want to call your new project
gcp_billing_account_id: Where the charges should go
gcp_user_email: Your GCP email address
org_id: Your organization's ID
app_tag: A friendly name for your resources (like "bacalhau-demo")
bacalhau_data_dir: Where job data should be stored
bacalhau_node_dir: Where node configs should live
username: Your SSH username
public_key: Path to your SSH public key
You can set up nodes in different regions with custom configurations:
Once everything's up and running, let's make sure it works!
First, make sure you have the Bacalhau CLI installed. You can read more about installing the CLI here.
Next, configure the CLI to use your cluster:
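One way to do this is via the environment variable, using the requester IP from your Terraform outputs (203.0.113.10 is a placeholder):

```shell
export BACALHAU_API_HOST=203.0.113.10
```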
Check on the health of your nodes:
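With the client configured, list the nodes that have joined:

```shell
bacalhau node list
```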
If you're using the Expanso Cloud hosted orchestrator (Recommended!), you can look at your nodes on the Expanso Cloud dashboard in real-time.
Run a simple test job:
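For example, a minimal Docker job:

```shell
bacalhau docker run ubuntu echo "Hello from my cluster"
```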
Check on your jobs:
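List recent jobs (subcommand names may differ slightly on older Bacalhau versions):

```shell
bacalhau job list
```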
Get your results:
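Download the output of a finished job by its ID (on older Bacalhau versions this is bacalhau get):

```shell
bacalhau job get <job-id>
```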
Having issues? Here are some common solutions:
Double-check your GCP permissions
Make sure your billing account is active
Verify that all needed APIs are turned on in GCP
Look at the logs on a node: journalctl -u bacalhau-startup.service
Check Docker logs on a node: docker logs <container-id>
Make sure that port 4222 isn't blocked
Verify your NATS connection settings
Check if nodes are properly registered
Make sure compute is enabled in your config
When you're done, clean everything up with:
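From the same directory you deployed from:

```shell
terraform destroy
```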
If you need to peek under the hood, here's how:
Find your node IPs:
SSH into a node:
Check on Docker:
Go into the container on the node:
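A sketch of those steps (IPs, usernames, key paths, and container IDs all come from your own deployment):

```shell
# Find your node IPs from the Terraform outputs
terraform output

# SSH into a node (username and key from your env.json)
ssh -i ~/.ssh/id_rsa <username>@<node-ip>

# On the node: check the Bacalhau container
docker ps
docker logs <container-id>

# Open a shell inside the container
docker exec -it <container-id> /bin/bash
```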
Here's what each important file does in your setup:
main.tf: Your main Terraform configuration
variables.tf: Where input variables are defined
outputs.tf: What information Terraform will show you
config/config.yaml: How your Bacalhau nodes are configured
scripts/startup.sh: Gets your nodes ready to run
scripts/bacalhau-startup.service: Manages the Bacalhau service
cloud-init/init-vm.yml: Sets up your VM environment, installs packages, and gets services running
config/docker-compose.yml: Runs Bacalhau in a privileged container with all the right volumes and health checks
The neat thing is that most of your configuration happens in just one file: env.json. Though if you want to get fancy, there's lots more you can customize!
If you get stuck or have questions:
Open an issue in our GitHub repository
Join our Slack
We're here to help you get your cluster running smoothly! 🌟
Welcome to the guide for setting up your own Bacalhau cluster across multiple Azure regions! This guide will walk you through creating a robust, distributed compute cluster that's perfect for running your Bacalhau workloads.
Think of this as building your own distributed supercomputer! Your cluster will provision compute nodes spread across different Azure regions for global coverage.
You'll need a few things ready:
Terraform (version 1.0.0 or newer)
A running Bacalhau orchestrator node
Azure CLI installed and set up
An active Azure subscription
Your subscription ID handy
An SSH key pair for securely accessing your nodes
First, create a terraform.tfvars.json file with your Azure details:
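Assuming the repo ships an example file to copy from (the exact file name may differ):

```shell
cp terraform.tfvars.example.json terraform.tfvars.json
```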
Open up terraform.tfvars.json and fill in your Azure details:
Update your config/config.yaml with your orchestrator information. Specifically, these lines:
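The relevant lines look something like the sketch below. The placeholders (orchestrator address and token) come from your orchestrator, and the exact key names may vary slightly by Bacalhau version:

```yaml
Compute:
  Orchestrators:
    - nats://<orchestrator-ip>:4222
  Auth:
    Token: "<your-network-token>"
```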
Let Terraform get everything ready:
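That is the standard init step:

```shell
terraform init
```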
Launch your cluster:
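Review the plan and confirm when prompted:

```shell
terraform apply
```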
The infrastructure is organized into modules:
Network: Creates VNets and subnets in each region
Security Group: Sets up NSGs with rules for SSH, HTTP, and NATS
Instance: Provisions VMs with cloud-init configuration
Once everything's up and running, let's make sure it works!
Set up your configuration to point at your orchestrator node:
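One way to do this is via the environment variable, using your orchestrator's IP (203.0.113.10 is a placeholder):

```shell
export BACALHAU_API_HOST=203.0.113.10
```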
Check on the health of your nodes:
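List the nodes that have joined:

```shell
bacalhau node list
```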
Run a simple test job:
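For example, a minimal Docker job:

```shell
bacalhau docker run ubuntu echo "Hello from my cluster"
```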
Check on your jobs:
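List recent jobs (subcommand names may differ slightly on older Bacalhau versions):

```shell
bacalhau job list
```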
Get your results:
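Download the output of a finished job by its ID (on older Bacalhau versions this is bacalhau get):

```shell
bacalhau job get <job-id>
```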
Having issues? Here are some common solutions:
Double-check your Azure permissions
Make sure your subscription is active
Verify that all needed resource providers are registered
Look at the logs on a node: journalctl -u bacalhau-startup.service
Check Docker logs on a node: docker logs <container-id>
Make sure that port 4222 isn't blocked
Verify your NATS connection settings
Check if nodes are properly registered
Make sure compute is enabled in your config
When you're done, clean everything up with:
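From the same directory you deployed from:

```shell
terraform destroy
```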
If you need to peek under the hood, here's how:
Find your node IPs:
SSH into a node:
Check on Docker:
Go into the container on the node:
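A sketch of those steps (usernames, IPs, and container IDs come from your own deployment):

```shell
# Find the public IPs of your VMs
az vm list-ip-addresses --output table

# SSH into a node
ssh <username>@<node-ip>

# On the node: check the Bacalhau container
docker ps
docker logs <container-id>

# Open a shell inside the container
docker exec -it <container-id> /bin/bash
```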
Here's what each important file does in your setup:
main.tf: Your main Terraform configuration
variables.tf: Where input variables are defined
outputs.tf: What information Terraform will show you
modules/network: Handles VNet and subnet creation
modules/securityGroup: Manages network security groups
modules/instance: Provisions VMs with cloud-init
cloud-init/init-vm.yml: Sets up your VM environment, installs packages, and gets services running
config/docker-compose.yml: Runs Bacalhau in a privileged container with all the right volumes and health checks
To check that your Azure CLI is configured correctly, here are some commands you can use:
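For example:

```shell
# Show the subscription you are currently logged in to
az account show

# List all subscriptions available to your account
az account list --output table
```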
If you get stuck or have questions:
Open an issue in our GitHub repository
Join our Slack
We're here to help you get your cluster running smoothly! 🌟
Welcome to the guide for setting up your own Bacalhau cluster across multiple AWS regions! This guide will walk you through creating a robust, distributed compute cluster that's perfect for running your Bacalhau workloads.
Think of this as building your own distributed supercomputer! Your cluster will provision compute nodes spread across different AWS regions for global coverage.
You'll need a few things ready:
Terraform (version 1.0.0 or newer)
AWS CLI installed and configured
An active AWS account with appropriate permissions
Your AWS credentials configured
An SSH key pair for securely accessing your nodes
A Bacalhau network
First, set up an orchestrator node. We recommend using Expanso Cloud for this! But you can always set up your own.
Create your environment configuration file:
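Assuming the repo ships an example file to copy from (the exact file name may differ):

```shell
cp env.tfvars.example.json env.tfvars.json
```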
Fill in your AWS details in env.tfvars.json:
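A filled-in env.tfvars.json might look like this (all values below are illustrative placeholders; substitute your own):

```json
{
  "app_name": "bacalhau-cluster",
  "app_tag": "bacalhau-demo",
  "bacalhau_installation_id": "my-unique-cluster-id",
  "username": "bacalhau-runner",
  "public_key_path": "~/.ssh/id_rsa.pub",
  "private_key_path": "~/.ssh/id_rsa",
  "bacalhau_config_file_path": "config/config.yaml"
}
```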
Configure your desired regions in locations.yaml. Here's an example (we have a full list of these in all_locations.yaml):
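A sketch of what an entry looks like, based on the required fields listed below (match the exact structure of the entries shipped in all_locations.yaml):

```yaml
us-west-2:
  region: us-west-2
  zone: us-west-2a
  instance_type: t3.medium
  instance_ami: <ami-id-for-this-region>
  node_count: 2
```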
Make sure the AMI exists in the region you need it to! You can confirm this by executing the following command:
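Substitute the AMI ID and region from your locations.yaml:

```shell
aws ec2 describe-images --image-ids <ami-id> --region us-west-2
```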
Update your Bacalhau config/config.yaml (the defaults are mostly fine; just update the Orchestrator and Token lines):
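Those lines look something like the sketch below (placeholders for your orchestrator address and token; exact key names may vary slightly by Bacalhau version):

```yaml
Compute:
  Orchestrators:
    - nats://<orchestrator-ip>:4222
  Auth:
    Token: "<your-network-token>"
```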
Deploy your cluster using the Python deployment script:
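Assuming the script follows a create/destroy convention (check python deploy.py --help for the actual subcommands):

```shell
python deploy.py create
```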
Terraform on AWS requires switching to a different workspace for each availability zone you deploy to. To make this easier, we set up a separate deploy.py script that switches to each workspace for you under the hood.
env.tfvars.json: Your main configuration file containing AWS-specific settings
locations.yaml: Defines which regions to deploy to and instance configurations
config/config.yaml: Bacalhau node configuration
app_name: Name for your cluster resources
app_tag: Tag for resource management
bacalhau_installation_id: Unique identifier for your cluster
username: SSH username for instances
public_key_path: Path to your SSH public key
private_key_path: Path to your SSH private key
bacalhau_config_file_path: Path to the config file for this compute node (should point at the orchestrator and have the right token)
Each region entry requires:
region: AWS region (e.g., us-west-2)
zone: Availability zone (e.g., us-west-2a)
instance_type: EC2 instance type (e.g., t3.medium)
instance_ami: AMI ID for the region
node_count: Number of instances to deploy
Once everything's up and running, let's make sure it works!
Configure your Bacalhau client:
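One way to do this is via the environment variable, using your orchestrator's IP (203.0.113.10 is a placeholder):

```shell
export BACALHAU_API_HOST=203.0.113.10
```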
List your compute nodes:
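List the nodes that have joined:

```shell
bacalhau node list
```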
Run a test job:
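For example, a minimal Docker job:

```shell
bacalhau docker run ubuntu echo "Hello from my cluster"
```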
Check job status:
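Describe a job by its ID (subcommand names may differ slightly on older Bacalhau versions):

```shell
bacalhau job describe <job-id>
```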
Verify AWS credentials are properly configured:
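This prints the account and identity your CLI is using:

```shell
aws sts get-caller-identity
```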
Check IAM permissions
Ensure you have quota available in target regions
SSH into a node:
Check Bacalhau service logs:
Check Docker container status:
Verify security group rules (ports 22, 80, and 4222 should be open)
Check VPC and subnet configurations
Ensure internet gateway is properly attached
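The node-level checks above can be sketched as follows (usernames, key paths, IPs, and container IDs come from your own deployment):

```shell
# SSH into a node (key and username from your env.tfvars.json)
ssh -i ~/.ssh/id_rsa <username>@<node-ip>

# On the node: check the Bacalhau startup service logs
journalctl -u bacalhau-startup.service

# Check the Docker container status
docker ps
docker logs <container-id>
```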
If nodes aren't joining the network:
Check NATS connection string in config.yaml
Verify security group allows port 4222
Ensure nodes can reach the orchestrator
If jobs aren't running:
Check compute is enabled in node config
Verify Docker is running properly
Check available disk space
If deployment fails:
Look for errors in Terraform output
Check AWS service quotas
Verify AMI availability in chosen regions
Remove all resources:
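Assuming deploy.py mirrors creation with a destroy subcommand (check python deploy.py --help):

```shell
python deploy.py destroy
```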
Check node health:
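For example:

```shell
bacalhau node list
```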
If you get stuck or have questions:
Open an issue in our GitHub repository
Join our Slack
We're here to help you get your cluster running smoothly! 🌟