Bacalhau Python SDK

This is the official Python SDK for Bacalhau, named bacalhau-sdk.

The SDK changed with Bacalhau v1.4.0: some functionality was added and some existing functionality changed.

Introduction

It is a high-level SDK that ships the client-side logic (e.g. signing requests) needed to query the endpoints. Please take a look at the examples for snippets to create, list and inspect jobs. Under the hood, this SDK uses bacalhau-apiclient (autogenerated via Swagger/OpenAPI) to interact with the API.

Please make sure to use this SDK library in your Python projects, instead of the lower level bacalhau-apiclient. The latter is listed as a dependency of this SDK and will be installed automatically when you follow the installation instructions below.

Features

  1. List, create and inspect Bacalhau jobs using Python objects

  2. Use the production network, or set the following environment variables to target any Bacalhau network out there:

    1. BACALHAU_API_HOST

    2. BACALHAU_API_PORT

  3. Generate a key pair for signing requests, stored in the path specified by the BACALHAU_DIR env var (default: ~/.bacalhau)
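The environment variables in item 2 can be pictured as a lookup with fallbacks. The sketch below is not the SDK's own code: `resolve_api_url` is a hypothetical helper, and the default host and port are copied from the debug log shown later in this document, on the assumption that they point at the production network. The port 20000 in the second call is a placeholder.

```python
import os

# Assumed defaults, copied from the SDK debug log shown in this README.
DEFAULT_HOST = "bootstrap.production.bacalhau.org"
DEFAULT_PORT = "1234"


def resolve_api_url(env=None):
    """Hypothetical helper: build the API base URL from environment variables."""
    env = os.environ if env is None else env
    host = env.get("BACALHAU_API_HOST") or DEFAULT_HOST
    port = env.get("BACALHAU_API_PORT") or DEFAULT_PORT
    # Prefix a scheme only when the host does not already carry one.
    if not host.startswith(("http://", "https://")):
        host = "http://" + host
    return f"{host}:{port}"


# No variables set: fall back to the assumed defaults.
print(resolve_api_url({}))
# Both variables set: target another network (placeholder values).
print(resolve_api_url({"BACALHAU_API_HOST": "127.0.0.1", "BACALHAU_API_PORT": "20000"}))
```

Passing the environment as a plain mapping keeps the helper easy to test without touching the real process environment.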

Install

pip install bacalhau-sdk

Initialize

Like the Bacalhau CLI, this SDK signs requests with a key pair stored in BACALHAU_DIR. If no key pair is found there, the SDK creates one for you.
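To check which directory will be used before the SDK touches it, the lookup can be approximated as follows. This is a minimal sketch, assuming the only rule is "use BACALHAU_DIR if set, otherwise ~/.bacalhau"; `resolve_config_dir` is a hypothetical name, not an SDK function, and the sketch deliberately does not guess the names of the key files.

```python
import os
from pathlib import Path


def resolve_config_dir(env=None):
    """Hypothetical helper: return the directory where the key pair is expected."""
    env = os.environ if env is None else env
    # Fall back to the documented default when BACALHAU_DIR is unset or empty.
    raw = env.get("BACALHAU_DIR") or "~/.bacalhau"
    return Path(raw).expanduser()


config_dir = resolve_config_dir()
print(f"Config dir: {config_dir} (exists: {config_dir.is_dir()})")
```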

Example Use

Let's submit a Hello World job and then fetch the CID of its output data. We start by importing this SDK, bacalhau_sdk, which is used to create and submit a job create request. Then we import bacalhau_apiclient (installed automatically with this SDK), which provides the object models that make up a job create request. These models are used to build the request that is passed to the put method.

Before running the script, set the host and port of the requester node's API in the following environment variables:

"BACALHAU_API_HOST" = ...
"BACALHAU_API_PORT" = ...

import pprint
from bacalhau_apiclient.models.job import Job
from bacalhau_apiclient.models.task import Task
from bacalhau_apiclient.models.all_of_execution_published_result import SpecConfig
from bacalhau_apiclient.models.api_put_job_request import (
    ApiPutJobRequest as PutJobRequest,
)
from bacalhau_sdk.jobs import Jobs

# Define the task
task = Task(
    name="My Main task",
    engine=SpecConfig(
        type="docker",
        params=dict(
            Image="ubuntu:latest",
            Entrypoint=["/bin/bash"],
            Parameters=["-c", "echo Hello World"],
        ),
    ),
    publisher=SpecConfig(type="IPFS", params=dict()),
)

# Define the job
job = Job(
    name="A Simple Docker Job",
    type="batch",
    count=1,
    tasks=[task]
)

# Create the job request
put_job_request = PutJobRequest(job=job)

# Instantiate the Jobs client
jobs = Jobs()

# Submit the job
put_job_response = jobs.put(put_job_request)

# Print the response
pprint.pprint(put_job_response)

The script above prints the following object. The job_id value is the ID of our newly created job.

{'evaluation_id': '03e89a4d-ee70-4a85-92fc-bbde753ef4d1',
 'job_id': 'j-868c1aee-1d6c-43c6-aeda-78ccf9e894a4',
 'warnings': None}

We can then use the results method to fetch, among other fields, the CID of the output data. Take the job_id from the response above and pass it to the results method.

# Take the job ID from the submit response above
job_id = put_job_response.job_id

jobs_instance = Jobs()
results_response = jobs_instance.results(job_id=job_id)
print(results_response)

The line above prints the following dictionary:

{'items': [{'params': {'CID': 'QmSjnM3vNcD34jrwTDTcg2B8oZAHrZ5iAupJKuEcD9AURE'},
            'type': 'ipfs'}],
 'next_token': ''}
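To pull just the CIDs out of a response shaped like the one above, a small helper is enough. The sketch below works on a plain dictionary. `extract_cids` is a hypothetical name, and converting the real response object to a dictionary (for instance with a `to_dict()` method, which Swagger-generated models typically provide) is an assumption rather than something this document confirms.

```python
def extract_cids(results):
    """Hypothetical helper: collect every CID found in a results dictionary."""
    cids = []
    # Tolerate a missing or empty "items" list.
    for item in results.get("items") or []:
        params = item.get("params") or {}
        cid = params.get("CID")
        if cid:
            cids.append(cid)
    return cids


# Sample data copied from the output shown above.
sample = {
    "items": [
        {
            "params": {"CID": "QmSjnM3vNcD34jrwTDTcg2B8oZAHrZ5iAupJKuEcD9AURE"},
            "type": "ipfs",
        }
    ],
    "next_token": "",
}

print(extract_cids(sample))
```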

Congrats, that was a good start! Please find more code snippets in the examples folder.

If some configuration values are not set, the SDK falls back to defaults and reports them in debug log messages such as the following:

DEBUG:bacalhau_sdk.config:BACALHAU_DIR not set, using default of ~/.bacalhau 
DEBUG:bacalhau_sdk.config:Using config dir: /root/.bacalhau 
DEBUG:bacalhau_sdk.config:config_dir: /root/.bacalhau 
DEBUG:bacalhau_sdk.config:Host is set to: http://bootstrap.production.bacalhau.org:1234
DEBUG:bacalhau_sdk.config:init config done

Available Functions

Devstack

You can set the environment variables BACALHAU_API_HOST and BACALHAU_API_PORT to point this SDK to your local Bacalhau devstack.
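When switching between a devstack and another network from the same Python process, it can help to set these variables only temporarily. The following is a sketch using a context manager. `bacalhau_endpoint` is a hypothetical helper, and the host and port are placeholders for whatever your devstack reports. This document does not state when the SDK reads the variables, so the sketch assumes the SDK is imported and used inside the `with` block.

```python
import os
from contextlib import contextmanager


@contextmanager
def bacalhau_endpoint(host, port):
    """Hypothetical helper: temporarily point BACALHAU_API_* at another network."""
    names = ("BACALHAU_API_HOST", "BACALHAU_API_PORT")
    previous = {name: os.environ.get(name) for name in names}
    os.environ["BACALHAU_API_HOST"] = host
    os.environ["BACALHAU_API_PORT"] = str(port)
    try:
        yield
    finally:
        # Restore the earlier values, removing variables that were absent before.
        for name, value in previous.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value


# Placeholder values: use the host and port that your devstack reports.
with bacalhau_endpoint("127.0.0.1", 20000):
    # Import and use the SDK inside this block.
    print(os.environ["BACALHAU_API_HOST"], os.environ["BACALHAU_API_PORT"])
```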

Developers guide

We use Poetry to manage this package; see the official Poetry docs for installation instructions. Note that all targets in the Makefile use Poetry as well.

To develop this SDK locally, create a dedicated poetry virtual environment and install the root package (i.e. bacalhau_sdk) and its dependencies:

poetry install --no-interaction --with test,dev -vvv

This outputs the following:

Creating virtualenv bacalhau-sdk-9mIcLX8U-py3.9 in /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs
Using virtualenv: /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs/bacalhau-sdk-9mIcLX8U-py3.9
Installing dependencies from lock file
...

Note that the command above installs the root package (i.e. bacalhau_sdk) in editable mode: any change to its source code takes effect immediately, without re-packaging or re-installing it.

Then install the pre-commit hooks and run them:

make install-pre-commit
make pre-commit