Bacalhau Python SDK 🐍

This is the official Python SDK for Bacalhau, named bacalhau-sdk.

It is a high-level SDK that ships the client-side logic (e.g. signing requests) needed to query the endpoints. Please take a look at the examples for snippets to create, list and inspect jobs. Under the hood, this SDK uses bacalhau-apiclient (autogenerated via Swagger/OpenAPI) to interact with the API.

Please make sure to use this SDK library in your Python projects, instead of the lower level bacalhau-apiclient. The latter is listed as a dependency of this SDK and will be installed automatically when you follow the installation instructions below.

Features

  • List, create and inspect Bacalhau jobs using Python objects 🎈
  • Use the production network, or set the following environment variables to target any Bacalhau network out there:
    • BACALHAU_API_HOST
    • BACALHAU_API_PORT
  • Generate a key pair for signing requests, stored in the path specified by the BACALHAU_DIR env var (default: ~/.bacalhau)

Install

From PyPI:

$ pip install bacalhau-sdk

From source:

Clone the public repository:

$ git clone https://github.com/bacalhau-project/bacalhau/

Once you have a copy of the source, you can install it with:

$ cd bacalhau/python/
$ pip install .

Initialize

Like the Bacalhau CLI, this SDK signs requests with a key pair stored in BACALHAU_DIR. If no key pair is found there, the SDK creates one for you.
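To check which directory a script will use for the key pair, the rule described above can be sketched with the standard library alone. The helper name resolve_bacalhau_dir is hypothetical and is not part of bacalhau-sdk; it only mirrors the documented behaviour (BACALHAU_DIR if set, otherwise ~/.bacalhau) and may differ from the SDK's internal lookup.

```python
import os
from pathlib import Path


def resolve_bacalhau_dir(env=None):
    """Return the directory where the key pair is expected to live.

    Hypothetical helper, not an SDK function: it mirrors the documented
    rule (BACALHAU_DIR if set, otherwise ~/.bacalhau).
    """
    env = os.environ if env is None else env
    # An unset or empty BACALHAU_DIR falls back to the documented default.
    return Path(env.get("BACALHAU_DIR") or "~/.bacalhau").expanduser()


print(resolve_bacalhau_dir())
```

Passing a dictionary instead of relying on os.environ makes the helper easy to exercise without touching the real environment.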

Example Use

Let's submit a Hello World job and then fetch the CID of its output data. We start by importing this SDK, bacalhau_sdk, which is used to create and submit a job create request. Then we import bacalhau_apiclient (installed automatically with this SDK), which provides the object models that make up a job create request. These are used to populate a simple Python dictionary that is passed to the submit utility method.

import pprint

from bacalhau_sdk.api import submit
from bacalhau_sdk.config import get_client_id
from bacalhau_apiclient.models.storage_spec import StorageSpec
from bacalhau_apiclient.models.spec import Spec
from bacalhau_apiclient.models.job_spec_language import JobSpecLanguage
from bacalhau_apiclient.models.job_spec_docker import JobSpecDocker
from bacalhau_apiclient.models.job_sharding_config import JobShardingConfig
from bacalhau_apiclient.models.job_execution_plan import JobExecutionPlan
from bacalhau_apiclient.models.publisher_spec import PublisherSpec
from bacalhau_apiclient.models.deal import Deal


data = dict(
    APIVersion='V1beta1',
    ClientID=get_client_id(),
    Spec=Spec(
        engine="Docker",
        verifier="Noop",
        publisher_spec=PublisherSpec(type="IPFS"),
        docker=JobSpecDocker(
            image="ubuntu",
            entrypoint=["echo", "Hello World!"],
        ),
        language=JobSpecLanguage(job_context=None),
        wasm=None,
        resources=None,
        timeout=1800,
        outputs=[
            StorageSpec(
                storage_source="IPFS",
                name="outputs",
                path="/outputs",
            )
        ],
        sharding=JobShardingConfig(
            batch_size=1,
            glob_pattern_base_path="/inputs",
        ),
        execution_plan=JobExecutionPlan(shards_total=0),
        deal=Deal(concurrency=1, confidence=0, min_bids=0),
        do_not_track=False,
    ),
)

pprint.pprint(submit(data))

The script above prints the following object. The job.metadata.id value is our newly created job id!

{'job': {'api_version': 'V1beta1',
         'metadata': {'client_id': 'bae9c3b2adfa04cc647a2457e8c0c605cef8ed93bdea5ac5f19f94219f722dfe',
                      'created_at': '2023-02-01T19:30:21.405209538Z',
                      'id': '710a0bc2-81d1-4025-8f80-5327ca3ce170'},
         'spec': {'Deal': {'Concurrency': 1},
                  'Docker': {'Entrypoint': ['echo', 'Hello World!'],
                             'Image': 'ubuntu'},
                  'Engine': 'Docker',
                  'ExecutionPlan': {'ShardsTotal': 1},
                  'Language': {'JobContext': {}},
                  'Network': {'Type': 'None'},
                  'Publisher': 'IPFS',
                  'Resources': {'GPU': ''},
                  'Sharding': {'BatchSize': 1,
                               'GlobPatternBasePath': '/inputs'},
                  'Timeout': 1800,
                  'Verifier': 'Noop',
                  'Wasm': {'EntryModule': {}},
                  'outputs': [{'Name': 'outputs',
                               'StorageSource': 'IPFS',
                               'path': '/outputs'}]},
         'status': {'JobState': {},
                    'Requester': {'RequesterNodeID': 'QmdZQ7ZbhnvWY1J12XYKGHApJ6aufKyLNSvf8jZBrBaAVL',
                                  'RequesterPublicKey': 'CAASpgIwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDVRKPgCfY2fgfrkHkFjeWcqno+MDpmp8DgVaY672BqJl/dZFNU9lBg2P8Znh8OTtHPPBUBk566vU3KchjW7m3uK4OudXrYEfSfEPnCGmL6GuLiZjLf+eXGEez7qPaoYqo06gD8ROdD8VVse27E96LlrpD1xKshHhqQTxKoq1y6Rx4DpbkSt966BumovWJ70w+Nt9ZkPPydRCxVnyWS1khECFQxp5Ep3NbbKtxHNX5HeULzXN5q0EQO39UN6iBhiI34eZkH7PoAm3Vk5xns//FjTAvQw6wZUu8LwvZTaihs+upx2zZysq6CEBKoeNZqed9+Tf+qHow0P5pxmiu+or+DAgMBAAE='}}}}
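To reuse the job id in later calls, it can be pulled out of the response. The sketch below works on a plain dictionary shaped like the printout above, trimmed to the relevant keys. Whether submit returns a dictionary directly or a model object that first needs converting is an assumption to check against your SDK version, and extract_job_id is a hypothetical helper, not an SDK function.

```python
def extract_job_id(response):
    """Return job.metadata.id from a dict shaped like the printout above."""
    try:
        return response["job"]["metadata"]["id"]
    except (KeyError, TypeError) as exc:
        # Raise one clear error instead of a bare KeyError deep in the lookup.
        raise ValueError("response has no job.metadata.id field") from exc


# Trimmed copy of the response shown above (assumed to be a plain dict).
sample_response = {
    "job": {
        "api_version": "V1beta1",
        "metadata": {"id": "710a0bc2-81d1-4025-8f80-5327ca3ce170"},
    }
}

job_id = extract_job_id(sample_response)
print(job_id)
```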

We can then use the results method to fetch, among other fields, the output data's CID.

from bacalhau_sdk.api import results

print(results(job_id="710a0bc2-81d1-4025-8f80-5327ca3ce170"))

The line above prints the following dictionary:

{'results': [{'data': {'cid': 'QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw',
                       'metadata': None,
                       'name': 'job-710a0bc2-81d1-4025-8f80-5327ca3ce170-shard-0-host-QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
                       'path': None,
                       'source_path': None,
                       'storage_source': 'IPFS',
                       'url': None},
              'node_id': 'QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
              'shard_index': None}]}
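Since a job may produce more than one result entry, it can be handy to collect every CID in one pass. This is a minimal sketch that assumes the response is a plain dictionary shaped like the one printed above; output_cids is a hypothetical helper, not part of the SDK, and the sample data is a trimmed copy of that printout.

```python
def output_cids(response):
    """Collect the CID of every result entry that carries one."""
    cids = []
    for result in response.get("results") or []:
        data = result.get("data") or {}
        if data.get("cid"):
            cids.append(data["cid"])
    return cids


# Trimmed copy of the dictionary shown above.
sample_results = {
    "results": [
        {
            "data": {
                "cid": "QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw",
                "storage_source": "IPFS",
            },
            "node_id": "QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3",
        }
    ]
}

print(output_cids(sample_results))
```

The helper skips entries without a CID rather than failing, which keeps it usable while a job is still running and has not published all of its results.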

Congrats, that was a good start! 🎈 Please find more code snippets in the examples folder (more examples will be published in the near future).

Devstack

You can set the environment variables BACALHAU_API_HOST and BACALHAU_API_PORT to point this SDK at the API of your local Bacalhau devstack.
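The two variables can also be set from Python at the top of a script. This is a minimal sketch, assuming the SDK reads them when it is imported, so they are set first. The helper name point_sdk_at is hypothetical, and the host and port values are placeholders: use the ones your devstack reports on startup.

```python
import os


def point_sdk_at(host, port):
    """Set the two variables this README documents for targeting a network."""
    os.environ["BACALHAU_API_HOST"] = host
    os.environ["BACALHAU_API_PORT"] = str(port)  # environment values must be strings


# Placeholder values: replace them with your devstack's host and port.
point_sdk_at("127.0.0.1", 20000)

# Import bacalhau_sdk only after this point (assumption: the SDK picks the
# variables up at import time).
print(os.environ["BACALHAU_API_HOST"], os.environ["BACALHAU_API_PORT"])
```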

Developers guide

We use Poetry to manage this package. Take a look at the official Poetry docs to install it. Note that all targets in the Makefile use poetry as well!

To develop this SDK locally, create a dedicated poetry virtual environment and install the root package (i.e. bacalhau_sdk) and its dependencies:

$ poetry install --no-interaction --with test,dev -vvv
Creating virtualenv bacalhau-sdk-9mIcLX8U-py3.9 in /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs
Using virtualenv: /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs/bacalhau-sdk-9mIcLX8U-py3.9
Installing dependencies from lock file
...

Note that the command above installs the root package (i.e. bacalhau_sdk) in editable mode: any change to its source code takes effect immediately, without re-packaging or re-installing it. Easy peasy!

Then install the pre-commit hooks and test them:

$ make install-pre-commit

$ make pre-commit