Running a Simple R Script in Bacalhau

You can use official Docker containers for each language like R or Python. In this example, we will use the official R container and run it on Bacalhau.

TL;DR

A quick guide on how to run a hello world R script on Bacalhau.

Prerequisites

To get started, you need to install the Bacalhau client. The Bacalhau installation guide has more information.

Running an R Script Locally

To install R, follow the instructions in "Installing R and RStudio" (Appendix A of Hands-On Programming with R). After R and RStudio are installed, create and run a script called hello.R.

%%writefile hello.R
print("hello world")

Run the script:

%%bash
Rscript hello.R
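
The %%writefile and %%bash lines above are Jupyter cell magics, so they only work inside a notebook. Outside a notebook, the same two steps can be sketched in plain shell roughly as follows. The guard around Rscript is an addition for illustration, not part of the original steps; it skips the run when R is not installed.

```shell
# Create hello.R with a quoted heredoc so the shell does not expand anything inside it.
cat > hello.R <<'EOF'
print("hello world")
EOF

# Run the script only if Rscript is on PATH (assumed guard, not in the original guide).
if command -v Rscript >/dev/null 2>&1; then
  Rscript hello.R
else
  echo "Rscript not found; install R first" >&2
fi
```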

Next, upload the script to public storage, in our case IPFS. We've already uploaded the script to IPFS, and the CID is QmVHSWhAL7fNkRiHfoEJGeMYjaYZUsKHvix7L54SptR8ie. You can view it by browsing to one of the HTTP IPFS gateways, such as ipfs.io or w3s.link.
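
To browse to the script, the CID has to be turned into a gateway URL. A minimal sketch, assuming the common path-style form https://<gateway>/ipfs/<CID>; the helper name gateway_url is hypothetical:

```shell
# CID quoted in the text above.
CID="QmVHSWhAL7fNkRiHfoEJGeMYjaYZUsKHvix7L54SptR8ie"

# Build a path-style gateway URL from a gateway host ($1) and a CID ($2).
gateway_url() {
  printf 'https://%s/ipfs/%s\n' "$1" "$2"
}

gateway_url ipfs.io "$CID"
gateway_url w3s.link "$CID"
```

Opening either URL in a browser, or passing it to a tool such as curl, should return the script, provided the gateway can still reach the content.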

Running a Job on Bacalhau

Now it's time to run the script on the Bacalhau network. To run a job on Bacalhau, run the following command:

%%bash --out job_id
bacalhau docker run \
--wait \
--id-only \
-i ipfs://QmQRVx3gXVLaRXywgwo8GCTQ63fHqWV88FiwEqCidmUGhk:/hello.R \
r-base \
-- Rscript hello.R

Structure of the command

Let's look closely at the command above:

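  • bacalhau docker run: call to Bacalhau to run a Docker job

  • --wait: wait for the job to finish before returning

  • --id-only: print only the job ID

  • -i ipfs://QmQRVx3gXVLaRXywgwo8GCTQ63fHqWV88FiwEqCidmUGhk: the CID of the input to use in the job. By default, inputs are mounted at '/inputs' in the execution.

  • :/hello.R: the path at which the input is mounted inside the container, overriding the default

  • r-base: the name of the Docker image we are using, here the official R image

  • Rscript hello.R: execute the R script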
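
The -i argument packs two pieces of information into one string: the IPFS source and, after the last colon, the path where that input appears inside the container. A minimal shell sketch of that split, using only parameter expansion; the variable names are hypothetical, and this illustrates the format rather than how Bacalhau itself parses the flag:

```shell
# The value passed to -i in the command above.
INPUT_SPEC="ipfs://QmQRVx3gXVLaRXywgwo8GCTQ63fHqWV88FiwEqCidmUGhk:/hello.R"

# Everything before the last colon is the source URI.
SOURCE="${INPUT_SPEC%:*}"
# Everything after the last colon is the mount path inside the container.
TARGET="${INPUT_SPEC##*:}"

echo "source: ${SOURCE}"
echo "target: ${TARGET}"
```

Splitting on the last colon rather than the first matters here, because the ipfs:// scheme contains a colon of its own.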

When a job is submitted, Bacalhau prints out the related job_id. We store that in an environment variable so that we can reuse it later on.

%env JOB_ID={job_id}

Checking the State of your Jobs

  • Job status: You can check the status of the job using bacalhau list.

%%bash
bacalhau list --id-filter ${JOB_ID}

When it says Published or Completed, that means the job is done, and we can get the results.

  • Job information: You can find out more information about your job by using bacalhau describe.

%%bash
bacalhau describe ${JOB_ID}

  • Job download: You can download your job results directly by using bacalhau get. Alternatively, you can choose to create a directory to store your results. In the command below, we created a directory and downloaded our job output to be stored in that directory.

%%bash
rm -rf results && mkdir results
bacalhau get ${JOB_ID} --output-dir results
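
In a script, the completion check from the job status step can be expressed as a small helper that runs before the download. This is a sketch only: the function name is_done is hypothetical, and extracting the state string from the bacalhau list output is deliberately left out because it depends on the output format of your client version.

```shell
# Return success (0) when the given job state means the job has finished.
is_done() {
  case "$1" in
    Published|Completed) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a hard-coded state; in practice the value would come from bacalhau list.
STATE="Completed"
if is_done "$STATE"; then
  echo "job finished, results can be downloaded"
else
  echo "job not finished yet (or it failed)"
fi
```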

Viewing your Job Output

To list the downloaded files, run the following command:

%%bash
ls results/

Viewing the result

%%bash
cat results/stdout

Futureproofing your R Scripts

You can generate the job request with the following command. This will allow you to re-run that job in the future.

%%bash
bacalhau describe ${JOB_ID} --spec > job.yaml

%%bash
cat job.yaml