Getting Started with Bacalhau
In this tutorial, you'll learn how to install and run a job with the Bacalhau client using the Bacalhau CLI or Docker.
The Bacalhau Client
The Bacalhau client is a command-line interface (CLI) that allows you to submit jobs to the Bacalhau network. It is available for Linux, macOS, and Windows, and you can also run it in a Docker container.
Install the Bacalhau CLI
You can install or update the Bacalhau CLI or pull a Docker image by running the commands in a terminal.
You may need sudo or the root password to install the local Bacalhau binary to /usr/local/bin.
Using the CLI: Windows users can download the latest release tarball from GitHub and extract bacalhau.exe to any directory on the PATH.
To run a specific version of Bacalhau using Docker, use the command docker run -it ghcr.io/bacalhau-project/bacalhau:v0.3.23, where "v0.3.23" is the version you want to run. Note that if an older image tagged "latest" is already present locally, Docker will not download a newer one automatically; remove the old image or pull it again to update. For more information on running the Docker image, check out the Bacalhau docker image example.
- CLI:
curl -sL https://get.bacalhau.org/install.sh | bash
- Docker:
docker image rm -f ghcr.io/bacalhau-project/bacalhau:latest # Remove old image if it exists
docker pull ghcr.io/bacalhau-project/bacalhau:latest
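Before running the installer, you can check whether the install directory is writable by your user, which hints at whether sudo will be needed. This is only a minimal sketch: the needs_sudo helper is hypothetical and is not part of the Bacalhau installer, and the check assumes the binary goes to /usr/local/bin as stated above.

```shell
# Hypothetical helper: succeeds (exit 0) when the directory is NOT writable
# by the current user, i.e. when elevated privileges would likely be needed.
needs_sudo() {
  [ ! -w "$1" ]
}

if needs_sudo /usr/local/bin; then
  echo "Installing to /usr/local/bin will likely require sudo."
else
  echo "/usr/local/bin is writable; sudo should not be needed."
fi
```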
Verify the Installation
To run any Bacalhau client command with Docker, prefix it with docker run ghcr.io/bacalhau-project/bacalhau:latest.
To verify the installation and check the version of the client and server, run the version command:
- CLI:
bacalhau version
- Docker:
docker run -it ghcr.io/bacalhau-project/bacalhau:latest version
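If you use the Docker image regularly, the prefix described above can be wrapped in a small shell function so that the remaining arguments read like ordinary client commands. The name bacalhau_docker is an assumption made for this sketch; it is not something Bacalhau ships.

```shell
# Hypothetical wrapper: forwards all arguments to the Bacalhau client
# inside the container image used throughout this tutorial.
bacalhau_docker() {
  docker run ghcr.io/bacalhau-project/bacalhau:latest "$@"
}

# Usage (requires Docker and network access):
#   bacalhau_docker version
```

Quoting "$@" keeps each argument intact, so commands containing spaces are passed to the container unchanged.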
If you're wondering which server is being used, the Bacalhau Project runs a public Bacalhau server network that's shared with the community. This network allows you to launch jobs from your computer without maintaining a compute cluster of your own.
Let's submit a Hello World job
To submit a job in Bacalhau, we will use the bacalhau docker run command. Let's take a quick look at its syntax:
bacalhau docker run [FLAGS] IMAGE[:TAG] [COMMAND]
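To make the three slots in this syntax concrete, the sketch below prints how your local shell splits an invocation into arguments, without contacting the network. The print_args helper is hypothetical and only echoes its arguments; the flags --id-only and --wait and the ubuntu:latest image are the ones used in the Docker example in this tutorial.

```shell
# Hypothetical helper: print each argument on its own line, numbered,
# so you can see what the bacalhau binary would receive.
print_args() {
  i=1
  for arg in "$@"; do
    printf '%d: %s\n' "$i" "$arg"
    i=$((i + 1))
  done
}

# FLAGS       -> --id-only --wait
# IMAGE[:TAG] -> ubuntu:latest
# [COMMAND]   -> sh -c '...'
print_args docker run --id-only --wait ubuntu:latest -- \
  sh -c 'uname -a && echo "Hello from Docker Bacalhau!"'
```

The single-quoted script arrives as one argument, so the && is evaluated by sh inside the job rather than by your local shell.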
The command below submits a Hello World job that runs an echo program within an Ubuntu container:
- CLI:
bacalhau docker run ubuntu echo Hello World
- Docker:
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
docker run \
--id-only \
--wait \
ubuntu:latest -- \
sh -c 'uname -a && echo "Hello from Docker Bacalhau!"'
While this command is designed to resemble Docker's run command, which you may already be familiar with, Bacalhau introduces a whole new set of flags (see CLI Reference) to support its computing model.
After you run the command above, the job is submitted to the public network, which processes it, and Bacalhau prints out the related job ID:
Job successfully submitted. Job ID: 3b39baee-5714-4f17-aa71-1f5824665ad6
Checking job status...
The job_id above is shown in its full form. For convenience, you can use the shortened version, in this case 3b39baee. For ease, we store the job_id in an environment variable.
$ export JOB_ID=3b39baee # make sure to use the right job id from the docker run command
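Instead of copying the ID by hand, you could extract it from the submission message. The sketch below assumes the message format shown above ("Job successfully submitted. Job ID: ...") and assumes the short form is the first dash-separated segment of the full ID, as in the 3b39baee example; both helper names are hypothetical.

```shell
# Hypothetical helper: read text on stdin and print the job ID that follows
# "Job ID: ". Assumes the submission message format shown in this tutorial.
extract_job_id() {
  sed -n 's/.*Job ID: \([0-9a-f-]*\).*/\1/p'
}

# Hypothetical helper: shorten a full job ID to its first dash-separated segment.
short_job_id() {
  printf '%s\n' "${1%%-*}"
}

# Sample message copied from the output above; in practice you would pipe
# the real output of the submit command into extract_job_id.
message='Job successfully submitted. Job ID: 3b39baee-5714-4f17-aa71-1f5824665ad6'
full_id=$(printf '%s\n' "$message" | extract_job_id)
JOB_ID=$(short_job_id "$full_id")
export JOB_ID
echo "$JOB_ID"
```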
Checking the State of your Jobs
- Job status: You can check the status of the job using bacalhau list.
bacalhau list --id-filter ${JOB_ID}
When the job state reads Completed (the sample output below reports Published), the job is done, and we can get the results.
CREATED   ID        JOB                      STATE      VERIFIED  COMPLETED
07:20:32  3b39baee  Docker ubuntu echo H...  Published            /ipfs/bafybeidu4zm6w...
For a comprehensive list of flags you can pass to the list command, check out the related CLI Reference page.
- Job information: You can find out more information about your job by using bacalhau describe.
bacalhau describe ${JOB_ID}
- Job download: You can download your job results directly by using bacalhau get. Alternatively, you can create a directory to store your results. In the commands below, we create a directory called myfolder and download our job output into it.
$ mkdir -p /tmp/myfolder
$ cd /tmp/myfolder
$ bacalhau get $JOB_ID
After the download has finished, you should see the job's output files in the results directory.
Viewing your Job Output
Each job creates three subfolders: combined_results, per_shard, and raw. To view the job's standard output, run the following command:
$ cat /tmp/myfolder/job-id/stdout
That should print out the string Hello World.
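If you would rather not type the job directory name, a small helper can locate the output file for you. This is a sketch that assumes the job-id path segment varies from job to job and that the file of interest is named stdout, as in the command above; show_job_stdout is a hypothetical name.

```shell
# Hypothetical helper: print the contents of every file named "stdout"
# found anywhere under the given results directory.
show_job_stdout() {
  find "$1" -type f -name stdout -exec cat {} +
}

# Usage, assuming results were downloaded to /tmp/myfolder as above:
#   show_job_stdout /tmp/myfolder
```

If the directory holds results from several jobs, the helper prints all of their stdout files one after another, so point it at a single job's directory when you need only one.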
With that, you have just successfully run a job on the Bacalhau network! 🐟
Where to go next?
Here are a few resources that provide a deeper dive into running jobs with Bacalhau:
- How to run an existing workload on Bacalhau
- Walk through a more data intensive demo
- Check out the Bacalhau CLI Reference page
Need Support?
If you have questions or need support or guidance, please reach out to the Bacalhau team via Slack (#bacalhau channel).