Private IPFS Network Setup

Set up private IPFS network

Note that Bacalhau v1.4.0 supports IPFS v0.27 and below.

Starting from v1.5.0, Bacalhau supports the latest IPFS versions.

Keep this in mind when selecting Bacalhau and IPFS versions for your own private network.

Introduction

Support for the embedded IPFS node was discontinued in v1.4.0 to streamline communication and reduce overhead. Therefore, to use a private IPFS network, you now need to create it yourself and then connect your nodes to it. This manual describes how to:

  1. Install and configure IPFS

  2. Create Private IPFS network

  3. Configure your Bacalhau network to use the private IPFS network

  4. Pin your data to private IPFS network

TL;DR

  1. Install Go on all nodes

  2. Install IPFS (Kubo)

  3. Initialize Private IPFS network

  4. Connect all nodes to the same private network

  5. Connect Bacalhau network to use private IPFS network

Download and Install

In this manual, Kubo (the earliest and most widely used implementation of IPFS) will be used, so first of all, Go should be installed.

  1. Download the Go archive. See the Go Downloads page for the latest Go version:

wget https://go.dev/dl/go1.23.0.linux-amd64.tar.gz
  2. Remove any previous Go installation by deleting the /usr/local/go folder (if it exists), then extract the archive you downloaded into /usr/local, creating a fresh Go tree in /usr/local/go:

rm -rf /usr/local/go && tar -C /usr/local -xzf go1.23.0.linux-amd64.tar.gz
  3. Add /usr/local/go/bin to the PATH environment variable. You can do this by adding the following line to your $HOME/.profile or /etc/profile (for a system-wide installation):

export PATH=$PATH:/usr/local/go/bin

Changes made to a profile file may not apply until the next time you log into the system. To apply the changes immediately, just run the shell commands directly or execute them from the profile using a command such as source $HOME/.profile.

  4. Verify that Go is installed correctly by checking its version:

go version
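Applying the profile change can be sketched as follows (a temporary file stands in for $HOME/.profile here so the example is self-contained):

```shell
# append the Go PATH entry to a profile file and source it into the current shell
# (a temporary file stands in for $HOME/.profile in this sketch)
profile=$(mktemp)
echo 'export PATH=$PATH:/usr/local/go/bin' >> "$profile"
. "$profile"
echo "$PATH" | grep -q '/usr/local/go/bin' && echo "PATH updated"
```

On a real machine, sourcing $HOME/.profile has the same effect without logging out and back in.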
The next step is to download and install Kubo. Select and download the appropriate version for your system; it is recommended to use the latest stable version:

wget https://dist.ipfs.tech/kubo/v0.30.0/kubo_v0.30.0_linux-amd64.tar.gz
tar -xvzf kubo_v0.30.0_linux-amd64.tar.gz
sudo bash kubo/install.sh

Verify that IPFS is installed correctly by checking its version:

ipfs --version

Configure Bootstrap IPFS Node

A bootstrap node is used by client nodes to connect to the private IPFS network. The bootstrap node connects clients to the other nodes available on the network.

Execute the ipfs init command to initialize an IPFS node:

ipfs init
# example output

generating ED25519 keypair...done
peer identity: 12D3KooWQqr8BLHDUaZvYG59KnrfYJ1PbbzCq3pzfpQ6QrKP5yz7
initializing IPFS node at /home/username/.ipfs

The next step is to generate the swarm key - a cryptographic key used to control access to the IPFS network - and export it into a swarm.key file located in the ~/.ipfs folder:

echo -e "/key/swarm/psk/1.0.0/\n/base16/\n$(tr -dc 'a-f0-9' < /dev/urandom | head -c64)" > ~/.ipfs/swarm.key
# example swarm.key content:

/key/swarm/psk/1.0.0/
/base16/
<64 hexadecimal characters>
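A quick sanity check of the generated key file can be sketched as follows (the file path here is a stand-in; on a real node the key lives at ~/.ipfs/swarm.key):

```shell
# verify swarm.key structure: a header line, an encoding line, and 64 hex characters
key=/tmp/swarm.key   # stand-in path for this sketch
printf '/key/swarm/psk/1.0.0/\n/base16/\n%s\n' "$(tr -dc 'a-f0-9' < /dev/urandom | head -c 64)" > "$key"
[ "$(wc -l < "$key")" -eq 3 ] && echo "3 lines"
sed -n '3p' "$key" | grep -qE '^[a-f0-9]{64}$' && echo "key is 64 hex chars"
```

All nodes in the private network must share this exact file; a malformed or mismatched key prevents peers from connecting.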

Next, remove the default bootstrap node entries. Execute this command on all nodes:

ipfs bootstrap rm --all

Check that bootstrap config does not contain default values:

ipfs config show | grep Bootstrap
# expected output:

  "Bootstrap": null,

Configure IPFS to listen for incoming connections on specific network addresses and ports, making the IPFS Gateway and API services accessible. Consider changing addresses and ports depending on the specifics of your network.

ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080
ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001

Start the IPFS daemon:

ipfs daemon

Configure Client Nodes

Copy the swarm.key file from the bootstrap node to client nodes into the ~/.ipfs/ folder and initialize IPFS:

ipfs init
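The copy step itself can be sketched as follows; host and user names are assumptions, and two local directories simulate the bootstrap and client repos so the example is self-contained:

```shell
# on a real network the key would be copied with scp, e.g.:
#   scp ~/.ipfs/swarm.key user@client-node:~/.ipfs/swarm.key
# here two local directories stand in for the two machines
bootstrap=/tmp/demo-bootstrap-ipfs
client=/tmp/demo-client-ipfs
mkdir -p "$bootstrap" "$client"
printf '/key/swarm/psk/1.0.0/\n/base16/\n%s\n' "$(tr -dc 'a-f0-9' < /dev/urandom | head -c 64)" > "$bootstrap/swarm.key"
cp "$bootstrap/swarm.key" "$client/swarm.key"
cmp -s "$bootstrap/swarm.key" "$client/swarm.key" && echo "keys match"
```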

Apply the same configuration as on the bootstrap node and start the daemon:

ipfs bootstrap rm --all

ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080

ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001

ipfs daemon

Done! Now you can check that the private IPFS network is working properly:

  1. List peers on the bootstrap node. It should list all connected nodes:

ipfs swarm peers
# example output for single connected node

/ip4/10.0.2.15/tcp/4001/p2p/12D3KooWQqr8BLHDUaZvYG59KnrfYJ1PbbzCq3pzfpQ6QrKP5yz7
  2. Pin some files and check their availability across the network:

# Create a sample text file
echo "Hello from the private IPFS network!" > sample.txt
# Pin the file:
ipfs add sample.txt
# example output:

added QmWQeYip3JuwhDFmkDkx9mXG3p83a3zMFfiMfhjS2Zvnms sample.txt
 25 B / 25 B [=========================================] 100.00%
# Retrieve and display the content of a pinned file
# Execute this on any node of your private network
ipfs cat QmWQeYip3JuwhDFmkDkx9mXG3p83a3zMFfiMfhjS2Zvnms
# expected output:

Hello from the private IPFS network!

Configure the IPFS Daemon as systemd Service

Finally, make the IPFS daemon run at system startup. To do this:

  1. Create a new service unit file in the /etc/systemd/system/ directory:

sudo nano /etc/systemd/system/ipfs.service
  2. Add the following content to the file, replacing /path/to/your/ipfs/executable with the actual path:

[Unit]
Description=IPFS Daemon
After=network.target

[Service]
User=username
ExecStart=/path/to/your/ipfs/executable daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target

Use the which ipfs command to locate the executable.

Usually the path to the executable is /usr/local/bin/ipfs.

For security purposes, consider creating a separate user to run the service. In this case, specify its name in the User= line. Without a specified user, the ipfs service will be launched as root, which means that you will need to copy the ipfs binary to the /root directory.

  3. Reload systemd and enable the service:

sudo systemctl daemon-reload
sudo systemctl enable ipfs
  4. Done! Reboot the machine to ensure that the daemon starts correctly, then use the systemctl status ipfs command to check that the service is running:

sudo systemctl status ipfs

#example output

● ipfs.service - IPFS Daemon
     Loaded: loaded (/etc/systemd/system/ipfs.service; enabled; preset: enabled)
     Active: active (running) since Wed 2024-09-10 13:24:09 CEST; 16min ago

Configure Bacalhau Nodes

Now, to connect your private Bacalhau network to the private IPFS network, specify the IPFS API address with the --ipfs-connect flag. The address can be found in the ~/.ipfs/api file:

bacalhau serve \
  --ipfs-connect /ip4/0.0.0.0/tcp/5001
# add any other flags your network requires

Done! Now your private Bacalhau network is connected to the private IPFS network!
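The API multiaddress does not have to be typed by hand: the daemon records it in the api file inside the IPFS repo. A sketch, using a stand-in repo path so the example is self-contained:

```shell
# read the API multiaddress from the repo's api file and build the serve command
repo=/tmp/demo-ipfs                           # on a real node this is ~/.ipfs
mkdir -p "$repo"
echo '/ip4/0.0.0.0/tcp/5001' > "$repo/api"    # written by the ipfs daemon on a real node
api_addr=$(cat "$repo/api")
echo "bacalhau serve --ipfs-connect $api_addr"
```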

Test Configured Networks

To verify that everything works correctly:

  1. Pin the file to the private IPFS network

  2. Run a job that takes the pinned file as input and publishes its result to the private IPFS network

  3. View and download job results

Create and Pin Sample File

Create any file and pin it. Use the ipfs add command:

# create file
echo "Hello from private IPFS network!" > file.txt

# pin the file
ipfs add file.txt
# example output:

added QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ file.txt
 33 B / 33 B

Run a Bacalhau Job

Run a simple job that fetches the pinned file via its CID, prints its content, and publishes the result back to the private IPFS network:

bacalhau docker run \
  -i ipfs://QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ \
  --publisher ipfs \
  alpine cat inputs
# example output

Job successfully submitted. Job ID: j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa
Checking job status... (Enter Ctrl+C to exit at any time, your job will continue running):

 TIME          EXEC. ID    TOPIC            EVENT         
 15:54:35.767              Submission       Job submitted 
 15:54:35.780  e-a498daaf  Scheduling       Requested execution on n-0f29f45c 
 15:54:35.859  e-a498daaf  Execution        Running 
 15:54:36.707  e-a498daaf  Execution        Completed successfully 
                                             
To get more details about the run, execute:
	bacalhau job describe j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa

To get more details about the run executions, execute:
	bacalhau job executions j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa

To download the results, execute:
	bacalhau job get j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa

View and Download Job Results

Use the bacalhau job describe command to view job execution results:

bacalhau job describe j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa
# example output (truncated for brevity)

...
Standard Output
Hello from private IPFS network!
Use the bacalhau job get command to download job results. In this particular case, the IPFS publisher was used, so the get command will print the CID of the job results:

bacalhau job get j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa
# example output

Fetching results of job 'j-c6514250-2e97-4fb6-a1e6-6a5a8e8ba6aa'...
No supported downloader found for the published results. You will have to download the results differently.
[
    {
        "Type": "ipfs",
        "Params": {
            "CID": "QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj"
        }
    }
]
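When scripting, the result CID can be pulled out of that JSON with standard text tools; a minimal sketch (the JSON is inlined here as an assumption about the output shape):

```shell
# extract the first result CID from `bacalhau job get` JSON output
json='[{"Type":"ipfs","Params":{"CID":"QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj"}}]'
cid=$(echo "$json" | grep -o '"CID": *"[^"]*"' | cut -d'"' -f4)
echo "$cid"
# → QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj
```

The extracted CID can then be passed directly to ipfs ls or ipfs get as shown below.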

Use the ipfs ls command to view the results:

ipfs ls QmSskRNnbbw8rNtkLdcJrUS2uC2mhiKofVJsahKRPgbGGj
# example output

QmS6mcrMTFsZnT3wAptqEb8NpBPnv1H6WwZBMzEjT8SSDv 1  exitCode
QmbFMke1KXqnYyBBWxB74N4c5SBnJMVAiMNRcGu6x1AwQH 0  stderr
QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ 33 stdout

Use the ipfs cat command to view the file content. In our case, the file of interest is the stdout:

ipfs cat QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ
# example output

Hello from private IPFS network!

Use the ipfs get command to download the file using its CID:

ipfs get --output stdout QmWQK2Rz4Ng1RPFPyiHECvQGrJb5ZbSwjpLeuWpDuCZAbQ
# example output
Saving file(s) to stdout
 33 B / 33 B [===============================================] 100.00% 0s


Need Support?

For questions and feedback, please reach out in our Slack.