Bacalhau Docker Image
This example shows you how to run some common client-side Bacalhau tasks using the Bacalhau Docker image.
TL;DR
Running Docker image on Bacalhau
Prerequisites
To get started, you need to install the Bacalhau client. See the Bacalhau installation instructions for more information.
!command -v bacalhau >/dev/null 2>&1 || (export BACALHAU_INSTALL_DIR=.; curl -sL https://get.bacalhau.org/install.sh | bash)
path=!echo $PATH
%env PATH=./:{path[0]}
Pull the Docker image
The first step is to pull the Bacalhau Docker image from the GitHub container registry.
%%bash
docker pull ghcr.io/bacalhau-project/bacalhau:latest
latest: Pulling from bacalhau-project/bacalhau
Digest: sha256:d80f1fe751886a29e0d5ae265a88abbfcd5c59e36247473b66aba93ea24f45aa
Status: Image is up to date for ghcr.io/bacalhau-project/bacalhau:latest
ghcr.io/bacalhau-project/bacalhau:latest
You can also pull a specific version of the image, e.g.:
docker pull ghcr.io/bacalhau-project/bacalhau:v0.3.16
Remember that "latest" is just a tag. It doesn't necessarily refer to the newest version of the Bacalhau client; it refers to whichever image currently carries the "latest" tag. If your machine already has an image with that tag, docker run reuses the local copy instead of downloading it again. To refresh it, run docker pull again, or pass the --pull always flag to docker run.
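Because a tag such as "latest" is only a label, scripts tend to be more reproducible when they pin an explicit version. A minimal sketch, assuming a helper named bacalhau_image (hypothetical, not part of Bacalhau or Docker) that builds the image reference from a tag:

```shell
# Hypothetical helper: build the Bacalhau image reference for a given tag.
# Falls back to "latest" when no tag is supplied.
bacalhau_image() {
  tag="${1:-latest}"
  printf 'ghcr.io/bacalhau-project/bacalhau:%s\n' "$tag"
}

# Usage (requires Docker, so it is left commented out here):
#   docker pull "$(bacalhau_image v0.3.16)"
```

Keeping the tag in one place makes it easier to upgrade every command in a script at the same time.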
Check version
Check the version of the Bacalhau client you are using.
%%bash
docker run -t ghcr.io/bacalhau-project/bacalhau:latest version
Client Version: v0.3.29
Server Version: v0.3.29
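If you want a script to confirm that the client and server versions agree, the output above can be parsed. This is only a sketch: versions_match is a hypothetical helper, and it assumes the output keeps the "Client Version:" and "Server Version:" labels shown above.

```shell
# Hypothetical helper: read `version` output on stdin and succeed only
# when the client and server versions are present and identical.
versions_match() {
  # `docker run -t` allocates a TTY, so lines may end in \r; strip it.
  out="$(tr -d '\r')"
  client="$(printf '%s\n' "$out" | sed -n 's/^Client Version: *//p')"
  server="$(printf '%s\n' "$out" | sed -n 's/^Server Version: *//p')"
  [ -n "$client" ] && [ "$client" = "$server" ]
}

# Usage (requires Docker):
#   docker run -t ghcr.io/bacalhau-project/bacalhau:latest version \
#     | versions_match && echo "client and server agree"
```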
Running a Bacalhau Job
To submit a job to Bacalhau, we use the bacalhau docker run command.
%%bash --out job_id
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
docker run \
--id-only \
--wait \
ubuntu:latest -- \
sh -c 'uname -a && echo "Hello from Docker Bacalhau!"'
In this example, we run an Ubuntu-based job that prints the system information and a short greeting.
Structure of the command
--id-only
: Output only the job ID
ubuntu:latest
: Ubuntu container
ghcr.io/bacalhau-project/bacalhau:latest
: Name of the Bacalhau Docker image
When a job is submitted, Bacalhau prints out the related job_id. We store that in an environment variable so that we can reuse it later on.
env: JOB_ID=738e0b39-8f73-4f01-ab46-245e8479ad65
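The %%bash --out and %env mechanisms only exist inside a notebook. In a plain shell, the same idea can be sketched with command substitution. Note that docker run -t allocates a TTY, so the captured text may end in a carriage return. The clean_job_id helper below is hypothetical, and it assumes job IDs have the UUID shape shown above.

```shell
# Hypothetical helper: strip whitespace and carriage returns from a job ID
# and check that it has the UUID shape of the IDs shown above.
clean_job_id() {
  id="$(printf '%s' "$1" | tr -d '[:space:]')"
  if printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    printf '%s\n' "$id"
  else
    echo "unexpected job ID: $id" >&2
    return 1
  fi
}

# Usage (requires Docker):
#   raw="$(docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
#            docker run --id-only --wait ubuntu:latest -- echo hello)"
#   JOB_ID="$(clean_job_id "$raw")" && export JOB_ID
```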
To print the output of the job, run the following command:
%%bash
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
describe $JOB_ID \
| grep -A 2 "stdout: |"
stdout: |
Linux 914f42609298 5.19.0-1022-gcp #24~22.04.1-Ubuntu SMP Sun Apr 23 09:51:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Hello from Docker Bacalhau!
Submit a Job With Output Files
One inconvenience is that you need to mount directories into the container to access files on your host machine. This is because the container runs in an environment that is separate from the host. Let's take a look at the example below:
The first part of the example should look familiar, except for the Docker commands.
%%bash --out job_id
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
docker run \
--id-only \
--wait \
--gpu 1 \
ghcr.io/bacalhau-project/examples/stable-diffusion-gpu:0.0.1 -- \
python main.py --o ./outputs --p "A Docker whale and a cod having a conversation about the state of the ocean"
When a job is submitted, Bacalhau prints out the related job_id. We store that in an environment variable so that we can reuse it later on.
env: JOB_ID=bd141e1a-0f68-4a20-886f-c2b30c01b614
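The directory mount mentioned at the start of this section can be sketched as follows. This is an untested assumption rather than a documented workflow: bacalhau_get_to_dir is a hypothetical helper, /outputs is an arbitrary path inside the container, and the DOCKER variable exists only so that the command can be previewed with DOCKER=echo.

```shell
# Hypothetical helper: download job results through the Bacalhau image by
# bind-mounting a host directory into the container.
# Set DOCKER=echo to print the command instead of running it.
bacalhau_get_to_dir() {
  job_id="$1"
  out_dir="$2"
  mkdir -p "$out_dir" || return 1
  # Docker bind mounts need an absolute host path.
  abs_dir="$(cd "$out_dir" && pwd)"
  "${DOCKER:-docker}" run -t \
    -v "${abs_dir}:/outputs" \
    ghcr.io/bacalhau-project/bacalhau:latest \
    get "$job_id" --output-dir /outputs
}

# Usage (requires Docker):
#   bacalhau_get_to_dir "$JOB_ID" ./result
```

Files written through a bind mount may end up owned by the user that the container runs as, so you may need to adjust permissions afterwards.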
Checking the State of your Jobs
- Job status: You can check the status of the job using bacalhau list.
%%bash
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
list $JOB_ID
When it says Completed, that means the job is done, and we can get the results.
- Job information: You can find out more information about your job by using bacalhau describe.
%%bash
docker run -t ghcr.io/bacalhau-project/bacalhau:latest \
describe $JOB_ID
- Job download: You can download your job results directly by using bacalhau get. Alternatively, you can choose a directory to store your results. In the command below, we create a directory and download the job output into it.
%%bash
bacalhau get ${JOB_ID} --output-dir result
After the download has finished, the job's output files are in the result directory.
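For jobs submitted without --wait, the status check described above can be scripted as a polling loop. A minimal sketch, assuming that the list output contains the word Completed once the job is done; job_is_completed and wait_until_completed are hypothetical helpers, not Bacalhau commands.

```shell
# Hypothetical helper: succeed when the status text on stdin reports Completed.
job_is_completed() {
  grep 'Completed' >/dev/null
}

# Hypothetical polling loop: run the given command until its output reports
# Completed, or give up after the requested number of attempts.
# Arguments: <attempts> <delay-seconds> <command...>
wait_until_completed() {
  attempts="$1"
  delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@" | job_is_completed; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}

# Usage (requires Docker):
#   wait_until_completed 30 10 \
#     docker run -t ghcr.io/bacalhau-project/bacalhau:latest list "$JOB_ID"
```

The loop does not distinguish a failed job from one that is still running; both simply exhaust the attempts and return a non-zero status.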
Need Support?
If you have questions or need support or guidance, please reach out to the Bacalhau team via Slack (#bacalhau channel).