This is the official Python SDK for Bacalhau, named bacalhau-sdk.
Introduction
It is a high-level SDK that ships the client-side logic (e.g. request signing) needed to query the endpoints. Please take a look at the examples for snippets to create, list and inspect jobs. Under the hood, this SDK uses bacalhau-apiclient (autogenerated via Swagger/OpenAPI) to interact with the API.
Please use this SDK library in your Python projects rather than the lower-level bacalhau-apiclient. The latter is listed as a dependency of this SDK and will be installed automatically when you follow the installation instructions below.
- List, create and inspect Bacalhau jobs using Python objects
- Use the production network, or set the following environment variables to target any Bacalhau network out there
- Generate a key pair used to sign requests, stored in the path specified by the BACALHAU_DIR env var (default: ~/.bacalhau)
```shell
pip install bacalhau-sdk
```
Clone the public repository:

```shell
git clone https://github.com/bacalhau-project/bacalhau/
```

Once you have a copy of the source, you can install it with:

```shell
cd python/
pip install .
```
Like the Bacalhau CLI, this SDK uses a key pair stored in BACALHAU_DIR to sign requests. If a key pair is not found there, one will be created for you.
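The directory resolution described above can be mirrored with a stdlib-only sketch; the actual logic lives inside bacalhau_sdk.config, so treat this as an illustration, not the SDK's implementation:

```python
import os
from pathlib import Path

# Resolve the key-pair directory the way the docs describe: honour the
# BACALHAU_DIR env var if set, otherwise fall back to ~/.bacalhau.
env_dir = os.environ.get("BACALHAU_DIR")
bacalhau_dir = Path(env_dir) if env_dir else Path.home() / ".bacalhau"
print(bacalhau_dir)
```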
Let's submit a Hello World job and then fetch its output data's CID. We start by importing this SDK, namely bacalhau_sdk, used to create and submit a job create request. Then we import bacalhau_apiclient (installed automatically with this SDK); it provides the object models that compose a job create request. These are used to populate a simple Python dictionary that is passed to the submit util method.
```python
import pprint

from bacalhau_sdk.api import submit
from bacalhau_sdk.config import get_client_id
from bacalhau_apiclient.models.storage_spec import StorageSpec
from bacalhau_apiclient.models.spec import Spec
from bacalhau_apiclient.models.job_spec_language import JobSpecLanguage
from bacalhau_apiclient.models.job_spec_docker import JobSpecDocker
from bacalhau_apiclient.models.job_sharding_config import JobShardingConfig
from bacalhau_apiclient.models.job_execution_plan import JobExecutionPlan
from bacalhau_apiclient.models.publisher_spec import PublisherSpec
from bacalhau_apiclient.models.deal import Deal

data = dict(
    APIVersion='V1beta1',
    ClientID=get_client_id(),
    Spec=Spec(
        engine="Docker",
        verifier="Noop",
        publisher_spec=PublisherSpec(type="IPFS"),
        docker=JobSpecDocker(
            image="ubuntu",
            entrypoint=["echo", "Hello World!"],
        ),
        language=JobSpecLanguage(job_context=None),
        wasm=None,
        resources=None,
        timeout=1800,
        outputs=[
            StorageSpec(
                storage_source="IPFS",
                name="outputs",
                path="/outputs",
            )
        ],
        sharding=JobShardingConfig(
            batch_size=1,
            glob_pattern_base_path="/inputs",
        ),
        execution_plan=JobExecutionPlan(shards_total=0),
        deal=Deal(concurrency=1, confidence=0, min_bids=0),
        do_not_track=False,
    ),
)

pprint.pprint(submit(data))
```
The script above prints the following object; the job.metadata.id value is our newly created job id!
```python
{'job': {'api_version': 'V1beta1',
         'metadata': {'client_id': 'bae9c3b2adfa04cc647a2457e8c0c605cef8ed93bdea5ac5f19f94219f722dfe',
                      'created_at': '2023-02-01T19:30:21.405209538Z',
                      'id': '710a0bc2-81d1-4025-8f80-5327ca3ce170'},
         'spec': {'Deal': {'Concurrency': 1},
                  'Docker': {'Entrypoint': ['echo', 'Hello World!'],
                             'Image': 'ubuntu'},
                  'Engine': 'Docker',
                  'ExecutionPlan': {'ShardsTotal': 1},
                  'Language': {'JobContext': {}},
                  'Network': {'Type': 'None'},
                  'Publisher': 'IPFS',
                  'Resources': {'GPU': ''},
                  'Sharding': {'BatchSize': 1,
                               'GlobPatternBasePath': '/inputs'},
                  'Timeout': 1800,
                  'Verifier': 'Noop',
                  'Wasm': {'EntryModule': {}},
                  'outputs': [{'Name': 'outputs',
                               'StorageSource': 'IPFS',
                               'path': '/outputs'}]},
         'status': {'JobState': {},
                    'Requester': {'RequesterNodeID': 'QmdZQ7ZbhnvWY1J12XYKGHApJ6aufKyLNSvf8jZBrBaAVL',
                                  'RequesterPublicKey': 'CAASpgIwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDVRKPgCfY2fgfrkHkFjeWcqno+MDpmp8DgVaY672BqJl/dZFNU9lBg2P8Znh8OTtHPPBUBk566vU3KchjW7m3uK4OudXrYEfSfEPnCGmL6GuLiZjLf+eXGEez7qPaoYqo06gD8ROdD8VVse27E96LlrpD1xKshHhqQTxKoq1y6Rx4DpbkSt966BumovWJ70w+Nt9ZkPPydRCxVnyWS1khECFQxp5Ep3NbbKtxHNX5HeULzXN5q0EQO39UN6iBhiI34eZkH7PoAm3Vk5xns//FjTAvQw6wZUu8LwvZTaihs+upx2zZysq6CEBKoeNZqed9+Tf+qHow0P5pxmiu+or+DAgMBAAE='}}}}
```
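To grab that id programmatically, plain dictionary access works once the response has been rendered to the dict form shown above; this is a sketch under that assumption, since the live return value may be a model object, in which case attribute access like job.metadata.id applies instead:

```python
# Sketch: extract the new job id from a submit() response, assuming the
# plain-dict form shown above (values abbreviated for illustration).
response = {
    'job': {
        'api_version': 'V1beta1',
        'metadata': {
            'id': '710a0bc2-81d1-4025-8f80-5327ca3ce170',
        },
    },
}
job_id = response['job']['metadata']['id']
print(job_id)  # -> 710a0bc2-81d1-4025-8f80-5327ca3ce170
```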
We can then use the results method to fetch, among other fields, the output data's CID.

```python
from bacalhau_sdk.api import results

print(results(job_id="710a0bc2-81d1-4025-8f80-5327ca3ce170"))
```
The line above prints the following dictionary:
```python
{'results': [{'data': {'cid': 'QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw',
                       'metadata': None,
                       'name': 'job-710a0bc2-81d1-4025-8f80-5327ca3ce170-shard-0-host-QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
                       'path': None,
                       'source_path': None,
                       'storage_source': 'IPFS',
                       'url': None},
              'node_id': 'QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
              'shard_index': None}]}
```
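Pulling the CID out of that structure is then a matter of indexing into the first result; a sketch assuming the response has been converted to the plain dict shown above (fields trimmed for brevity):

```python
# Sketch: extract the output data's CID from a results() response,
# assuming the plain-dict form shown above.
response = {
    'results': [
        {
            'data': {
                'cid': 'QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw',
                'storage_source': 'IPFS',
            },
            'node_id': 'QmYgxZiySj3MRkwLSL4X2MF5F9f2PMhAE3LV49XkfNL1o3',
        }
    ]
}
cid = response['results'][0]['data']['cid']
print(cid)  # -> QmYEqqNDdDrsRhPRShKHzsnZwBq3F59Ti3kQmv9En4i5Sw
```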
Congrats, that was a good start! Please find more code snippets in the examples folder.
You can set the environment variables BACALHAU_API_HOST and BACALHAU_API_PORT to point this SDK at your local Bacalhau devstack.
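For example, setting the variables from Python before importing the SDK; the host and port values here are examples only, so substitute the ones your devstack prints at startup:

```python
import os

# Point the SDK at a local devstack instead of the production network.
# These values are placeholders: use your devstack's actual host/port.
os.environ["BACALHAU_API_HOST"] = "127.0.0.1"
os.environ["BACALHAU_API_PORT"] = "20000"

# Import the SDK only after the environment is configured, so the
# client picks up the custom endpoint, e.g.:
# from bacalhau_sdk.api import submit
```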
We use Poetry to manage this package; take a look at the official Poetry docs to install it. Note that all targets in the Makefile use Poetry as well!
To develop this SDK locally, create a dedicated Poetry virtual environment and install the root package (i.e. bacalhau_sdk) and its dependencies:

```shell
poetry install --no-interaction --with test,dev -vvv
Creating virtualenv bacalhau-sdk-9mIcLX8U-py3.9 in /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs
Using virtualenv: /Users/enricorotundo/Library/Caches/pypoetry/virtualenvs/bacalhau-sdk-9mIcLX8U-py3.9
Installing dependencies from lock file
...
```
Note that the command above installs the root package (i.e. bacalhau_sdk) in editable mode; any change to its source code is reflected immediately, without the need for re-packaging and re-installing it. Easy-peasy!
Then install the pre-commit hooks and test them:

```shell
make install-pre-commit
```