Welcome
Welcome to the Bacalhau documentation!
What is Bacalhau?
Bacalhau is a platform designed for fast, cost-efficient, and secure computation by running jobs directly where the data is generated and stored. This approach is also known as Compute Over Data (CoD). Because Bacalhau can run arbitrary Docker containers and WebAssembly (WASM) images as tasks, you can streamline existing workflows without extensive rewriting. The name Bacalhau comes from the Portuguese word for salted cod fish.
Bacalhau aims to make the processing of large-scale datasets more cost-efficient and accessible, opening data processing to a broader audience. Our goal is to build an open, collaborative compute ecosystem. At Expanso.io, we offer a demo network where you can try running jobs without any installation. Give it a try!
Why Bacalhau?
Bacalhau simplifies the management of compute jobs by providing a unified platform for running them across different regions, clouds, and edge devices.
How it works
Bacalhau consists of a network of nodes that enables orchestration across every compute resource, whether it is a cloud VM, an on-premises server, or an edge device. The network consists of two types of nodes:
Requester Node: responsible for handling user requests, discovering and ranking compute nodes, forwarding jobs to compute nodes, and monitoring the job lifecycle.
Compute Node: responsible for executing jobs and producing results. Different compute nodes can be used for different types of jobs, depending on their capabilities and resources.
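The division of work between the two node types can be illustrated with a toy model. The sketch below is not Bacalhau's implementation or API: the class names, the fields (engines, free_cpus), and the ranking rule (most free CPUs first) are all assumptions made for illustration. It only shows the flow described above, in which a requester discovers suitable compute nodes, ranks them, and forwards the job to the best match.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description; not a Bacalhau type."""
    name: str
    engine: str   # execution engine the job needs, e.g. "docker" or "wasm"
    cpus: int     # CPU cores the job requests


@dataclass
class ComputeNode:
    """Hypothetical compute node; the fields are illustrative assumptions."""
    name: str
    engines: set[str]   # engines this node can run
    free_cpus: int      # CPU cores currently available

    def execute(self, job: Job) -> str:
        # A real compute node would run the container or WASM module here.
        return f"{self.name} ran {job.name}"


class RequesterNode:
    """Hypothetical requester: discovers, ranks, and forwards."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def discover(self, job: Job):
        # Keep only nodes that support the engine and have enough capacity.
        return [
            n for n in self.nodes
            if job.engine in n.engines and n.free_cpus >= job.cpus
        ]

    def rank(self, candidates):
        # Assumed ranking rule: prefer the node with the most free CPUs.
        return sorted(candidates, key=lambda n: n.free_cpus, reverse=True)

    def submit(self, job: Job) -> str:
        ranked = self.rank(self.discover(job))
        if not ranked:
            raise RuntimeError(f"no compute node can run {job.name}")
        # Forward the job to the top-ranked compute node.
        return ranked[0].execute(job)


requester = RequesterNode([
    ComputeNode("edge-1", {"wasm"}, free_cpus=2),
    ComputeNode("cloud-1", {"docker", "wasm"}, free_cpus=8),
    ComputeNode("onprem-1", {"docker"}, free_cpus=4),
])
result = requester.submit(Job("resize-images", engine="docker", cpus=2))
print(result)
```

In this toy network, edge-1 is filtered out because it cannot run Docker jobs, and cloud-1 is chosen over onprem-1 because it has more free CPUs. Monitoring of the job lifecycle, which the requester also handles, is left out of the sketch.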
Use Cases
Bacalhau can be used for a variety of data processing workloads, including machine learning, data analytics, and scientific computing. It is well-suited for workloads that require processing large amounts of data in a distributed and parallelized manner.
Once you have more than 10 devices generating or storing around 100 GB of data, you're likely to face challenges with processing that data efficiently. Traditional computing approaches may struggle to handle such large volumes, and that's where distributed computing solutions like Bacalhau can be extremely useful. Bacalhau can be used in various industries and environments, including security, web serving, financial services, IoT, edge, fog, and multi-cloud. Bacalhau shines when it comes to data-intensive applications such as data engineering, model training, model inference, and molecular dynamics.
Community
Bacalhau has a very friendly community, and we are always happy to help you get started:
GitHub Discussions – ask anything about the project, give feedback, or answer questions that will help other users.
Join the Slack Community and go to the #bacalhau channel – it is the easiest way to engage with other members of the community and get help.
Contributing – learn how to contribute to the Bacalhau project.
Next Steps
👉 Continue with Bacalhau's Getting Started guide to learn how to install and run a job with the Bacalhau client.