Setting Up a Cluster on Azure with Terraform
Expanso (2024). All Rights Reserved.
Welcome to the guide for setting up your own Bacalhau cluster across multiple Azure regions! This guide will walk you through creating a robust, distributed compute cluster that's perfect for running your Bacalhau workloads.
Think of this as building your own distributed supercomputer! Your cluster will provision compute nodes spread across different Azure regions for global coverage.
You'll need a few things ready:
Terraform (version 1.0.0 or newer)
A running Bacalhau orchestrator node
Azure CLI installed and set up
An active Azure subscription
Your subscription ID handy
An SSH key pair for securely accessing your nodes
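Before going further, it can help to confirm that the command-line tools from the list above are on your PATH. The sketch below is one way to do that. The function name missing_tools is made up for this example, and the binary names (terraform, az, ssh) are the usual ones but may differ on your system.

```shell
# Hypothetical helper: print each named tool that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the CLI prerequisites from the list above.
missing=$(missing_tools terraform az ssh)
if [ -n "$missing" ]; then
  # $missing is left unquoted so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All prerequisite tools found."
  terraform version   # should report 1.0.0 or newer
fi
```

This only checks that the tools exist. It does not verify your Azure subscription or your orchestrator node.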
First, create a terraform.tfvars.json file, then open it and fill in your Azure details:
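As a rough sketch of what such a file might look like, the snippet below writes a starter terraform.tfvars.json. Every key name in it (subscription_id, username, public_key, locations, machine_type, node_count) is an assumption made for illustration. The authoritative names are whatever variables.tf in your checkout declares, so adjust accordingly.

```shell
# Write a starter terraform.tfvars.json (this overwrites an existing file).
# Every key below is an ASSUMED name for illustration; use the names that
# variables.tf in your checkout actually declares.
cat > terraform.tfvars.json <<'EOF'
{
  "subscription_id": "<your-subscription-id>",
  "username": "<vm-admin-username>",
  "public_key": "~/.ssh/id_rsa.pub",
  "locations": {
    "eastus": { "machine_type": "Standard_B2s", "node_count": 1 },
    "westeurope": { "machine_type": "Standard_B2s", "node_count": 1 }
  }
}
EOF
```

The two regions and the VM size are arbitrary examples. Pick the regions and sizes that fit your workload and quota.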
Update your config/config.yaml with your orchestrator information. Specifically, these lines:
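The lines in question are the ones that tell a compute node where its orchestrator is and how to authenticate. A minimal sketch follows, assuming the key names used by recent Bacalhau releases (Compute.Enabled, Compute.Orchestrators, Compute.Auth.Token) and the NATS port 4222 mentioned later in this guide. The helper render_compute_config is hypothetical. If your config/config.yaml already uses different keys, keep those.

```shell
# Hypothetical helper: print the compute-node settings for a given
# orchestrator host ($1) and auth token ($2). Key names are assumed from
# recent Bacalhau releases; keep whatever keys your config.yaml already uses.
render_compute_config() {
  host="$1"
  token="$2"
  cat <<EOF
Compute:
  Enabled: true
  Orchestrators:
    - nats://${host}:4222
  Auth:
    Token: "${token}"
EOF
}

# Example with placeholder values; copy the output into config/config.yaml.
render_compute_config "<orchestrator-ip>" "<your-token>"
```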
Let Terraform get everything ready:
Launch your cluster:
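The two steps above correspond to the standard Terraform commands terraform init and terraform apply. If you expect to repeat them, one option is a small wrapper like the sketch below. The function name deploy_cluster is made up for this example.

```shell
# Hypothetical wrapper around the two steps above.
deploy_cluster() {
  # Download the providers and modules the configuration needs.
  terraform init || return 1
  # Create the resources; Terraform shows the plan and asks for confirmation.
  terraform apply
}
```

Run deploy_cluster from the directory that contains main.tf and your terraform.tfvars.json. It stops early if initialization fails, so apply never runs against a half-initialized directory.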
The infrastructure is organized into modules:
Network: Creates VNets and subnets in each region
Security Group: Sets up NSGs with rules for SSH, HTTP, and NATS
Instance: Provisions VMs with cloud-init configuration
Once everything's up and running, let's make sure it works!
First, make sure you have the Bacalhau CLI installed. The Bacalhau documentation covers how to install it.
Set up your configuration to point at your orchestrator node:
Check on the health of your nodes:
Run a simple test job:
Check on your jobs:
Get your results:
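The checks above can be strung together as a quick smoke test. The sketch below assumes your CLI version honors the BACALHAU_API_HOST environment variable (setting the API host through bacalhau config is an alternative), and the function name verify_cluster and the alpine test job are illustrative choices, not part of the original setup.

```shell
# Hypothetical smoke test for the steps above.
# $1 is the address of your orchestrator node.
verify_cluster() {
  # BACALHAU_API_HOST tells the CLI which orchestrator to talk to.
  export BACALHAU_API_HOST="$1"
  # List the nodes the orchestrator knows about.
  bacalhau node list || return 1
  # Any small container job will do; alpine + echo is an arbitrary choice.
  bacalhau docker run alpine echo "Hello from Azure" || return 1
  # Show recent jobs, including the one just submitted.
  bacalhau job list
}
```

Call it as verify_cluster <orchestrator-ip>. Once the job has finished, bacalhau job get <job-id> downloads its results, using the job ID printed when the job was submitted.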
Having issues? Here are some common solutions:
Double-check your Azure permissions
Make sure your subscription is active
Verify that all needed resource providers are registered
Look at the logs on a node: journalctl -u bacalhau-startup.service
Check Docker logs on a node: docker logs <container-id>
Make sure that port 4222 isn't blocked
Verify your NATS connection settings
Check if nodes are properly registered
Make sure compute is enabled in your config
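For the port check in particular, a quick reachability test from your workstation or from a node can narrow things down. This is a minimal sketch, assuming a netcat (nc) build that supports the -z and -w options. The helper name port_open is hypothetical.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# $1 = host, $2 = port.
# Assumes a netcat (nc) that supports -z (connect without sending data)
# and -w (timeout in seconds).
port_open() {
  nc -z -w 5 "$1" "$2"
}

# Usage sketch (replace the placeholder with your orchestrator address):
#   port_open <orchestrator-ip> 4222 && echo "NATS port reachable"
```

A failure here points at the NSG rules or a firewall in between rather than at Bacalhau itself.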
When you're done, clean everything up with:
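The standard Terraform command for this is terraform destroy. Because it removes every resource the configuration created, you may want to preview the result first. The wrapper below is a sketch of that approach, with the made-up name teardown_cluster.

```shell
# Hypothetical wrapper: preview what will be removed, then destroy it.
teardown_cluster() {
  # Show the destroy plan without changing anything.
  terraform plan -destroy || return 1
  # Remove the resources; Terraform asks for confirmation first.
  terraform destroy
}
```

Run it from the same directory you deployed from, so Terraform finds the matching state.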
If you need to peek under the hood, here's how:
Find your node IPs:
SSH into a node:
Check on Docker:
Go into the container on the node:
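The steps above can be sketched as the helpers below. The function names are hypothetical, the private key path ~/.ssh/id_rsa is an assumption, and the names of the Terraform outputs that hold the node IPs depend on outputs.tf, so the sketch simply prints all outputs.

```shell
# Hypothetical helpers for the steps above.

# Show every Terraform output; the node IPs should be among them
# (the exact output names depend on outputs.tf).
# Run this from the Terraform directory.
show_outputs() {
  terraform output
}

# Open a shell on a node. $1 = SSH username, $2 = node public IP.
# Assumes the private half of your key pair is at ~/.ssh/id_rsa.
ssh_node() {
  ssh -i "$HOME/.ssh/id_rsa" "$1@$2"
}

# Once on the node:
#   docker ps                                  # list running containers
#   docker exec -it <container-id> /bin/bash   # shell inside the container
#                                              # (try /bin/sh if bash is absent)
```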
Here's what each important file does in your setup:
main.tf: Your main Terraform configuration
variables.tf: Where input variables are defined
outputs.tf: What information Terraform will show you
modules/network: Handles VNet and subnet creation
modules/securityGroup: Manages network security groups
modules/instance: Provisions VMs with cloud-init
cloud-init/init-vm.yml: Sets up your VM environment, installs packages, and gets services running
config/docker-compose.yml: Runs Bacalhau in a privileged container with all the right volumes and health checks
To check that your Azure CLI is configured correctly, here are some commands you can use:
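One possible set of checks is sketched below. It assumes you have already run az login, and that Microsoft.Compute and Microsoft.Network are the resource providers this setup relies on, which is an inference from the VMs, VNets, and NSGs it creates. The function name check_azure_cli is made up for this example.

```shell
# Hypothetical helper: select a subscription and report provider registration.
# $1 = your Azure subscription ID. Run "az login" first.
check_azure_cli() {
  # Make the given subscription the active one, then show it.
  az account set --subscription "$1" || return 1
  az account show --output table || return 1
  # Microsoft.Compute and Microsoft.Network are ASSUMED to be the providers
  # this setup needs, since it creates VMs, VNets, and NSGs.
  for ns in Microsoft.Compute Microsoft.Network; do
    state=$(az provider show --namespace "$ns" --query registrationState --output tsv)
    echo "$ns: $state"
  done
}
```

If a provider reports NotRegistered, az provider register --namespace <namespace> registers it for the active subscription.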
If you get stuck or have questions:
Check out the official Bacalhau Documentation
Open an issue in our GitHub repository
Join our Slack
We're here to help you get your cluster running smoothly! 🌟
We recommend using Expanso Cloud to create your network! But if you'd like to set up a cluster on your own, you can use our tool Andaime to do this too.