Guest demo

This tutorial will guide you through the steps needed for running a bioinformatic image in the guest account.

CLI installation

Once you have received a guest access token from BatchX support, follow the steps described at Installation.

When asked for the user name, press Enter to leave it empty. You can then enter the provided token.

Enter user name (empty to enter token):
Enter token:

The token allows you to connect to the guest user environment.

Verify identity

Before we start with the demo, please verify your identity by running bx whoami. You should see something similar to:

$ bx whoami
User id: guest
Name: Guest user
Email: support@batchx.io
Created: Fri Sep 18 14:39:22 CEST 2020
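
If you want to run the same check from a script, the sketch below extracts the user id from the whoami output. It assumes the "User id:" line has the format shown in the sample above. The helper name bx_user_id is hypothetical and is not part of the bx CLI.

```shell
# Hypothetical helper: read `bx whoami` output on stdin and print the user id.
# Assumes a line of the form "User id: <id>", as in the sample output above.
bx_user_id() {
  sed -n 's/^User id: *//p'
}

# Demonstration on the sample output from this tutorial.
sample_whoami='User id: guest
Name: Guest user
Email: support@batchx.io'

user_id=$(printf '%s\n' "$sample_whoami" | bx_user_id)
echo "$user_id"

# Against the real CLI this would be:
#   bx whoami | bx_user_id
```

A script could compare the result with guest and stop early if the token was not entered correctly.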

Demo

In this tutorial we will be running bwa mem, a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. This tool has already been imported into BatchX and is offered as part of our bioinformatic tools catalogue.

BatchX image

Search for the bioinformatics/bwa/mem image in the BatchX catalogue by using the bx images command as follows:

bx images -e=batchx

This will display the whole list of ready-to-use images that BatchX provides for its users.

You can narrow the output to bwa/mem by piping it through grep:

$ bx images -e=batchx | grep bwa/mem
batchx@bioinformatics/bwa/mem:1.2.1 7 weeks ago 487.2 MB
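
The first column of each listing line holds the image coordinates, including the version. As a minimal sketch, the coordinates can be captured in a shell variable instead of being copied by hand. The helper name image_coords is hypothetical, and the sketch assumes the whitespace-separated layout shown in the sample line above.

```shell
# Hypothetical helper: print the first whitespace-separated column
# (the image coordinates) of each listing line read on stdin.
image_coords() {
  awk '{ print $1 }'
}

# Demonstration on the sample line from this tutorial.
sample_line='batchx@bioinformatics/bwa/mem:1.2.1 7 weeks ago 487.2 MB'
coords=$(printf '%s\n' "$sample_line" | image_coords)
echo "$coords"

# Against the real CLI this would be:
#   bx images -e=batchx | grep bwa/mem | image_coords
```

If grep matches more than one line, the helper prints one set of coordinates per line, so a script would still need to pick the version it wants.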

Now, let's get that image into the guest environment by running the bx clone command:

$ bx clone batchx@bioinformatics/bwa/mem
$ bx images
IMAGE CREATED SIZE
batchx@bioinformatics/bwa/mem:1.2.1 7 weeks ago 487.2 MB

The above command returned the highest version that has been created for the bioinformatics/bwa/mem image. Copy the coordinates for the 1.2.1 version (the latest version at the time of writing) and use the bx image command to see its details:

bx image batchx@bioinformatics/bwa/mem:1.2.1

This should have displayed the contents of the bioinformatics/bwa/mem manifest, including a description of the inputs and outputs associated with this image. Read more about the contents of the manifest here.

Input data

Now that we have inspected the details of the bioinformatics/bwa/mem image, use the bx ls command to see the files we are using for this example, which are going to be passed to this image when launching the job.

bx ls

The previous command displayed the contents of the file system at the root level. You should see a directory named readonly. Use the bx tree command to display the contents of this directory as a depth-indented list of files:

$ bx tree readonly
readonly
├── bwa
│   └── hg38
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.dict
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.amb
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.ann
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.bwt
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.dict
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.fai
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.gzi
│       ├── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.pac
│       └── Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz.sa
└── fastqs
    ├── sample.R1.fq.gz
    └── sample.R2.fq.gz

Note: readonly here is just the name of a folder (one that happens to be read-only for the provided access token).

Submit job

Now that we have examined the image and the inputs we are using for this tutorial, it's time to submit a job to BatchX. For this, we will use the bx submit command as follows:

bx submit -n batchx@bioinformatics/bwa/mem:1.2.1 '{"fastqFileR1":"readonly/fastqs/sample.R1.fq.gz","fastqFileR2":"readonly/fastqs/sample.R2.fq.gz","refFolder":"readonly/bwa/hg38","refBaseName":"Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz","outputBamName":"sample.bam"}'
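
The JSON argument is easier to read and adapt when it is assembled from named variables. The sketch below rebuilds the same payload with printf. The variable names are illustrative only. Plain string formatting does no JSON escaping, so this approach assumes that none of the values contain quotes or backslashes.

```shell
# Input values taken from the submit command above.
fastq_r1='readonly/fastqs/sample.R1.fq.gz'
fastq_r2='readonly/fastqs/sample.R2.fq.gz'
ref_folder='readonly/bwa/hg38'
ref_base='Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz'
out_bam='sample.bam'

# Assemble the JSON by string formatting. No escaping is performed, so the
# values must not contain double quotes or backslashes.
payload=$(printf '{"fastqFileR1":"%s","fastqFileR2":"%s","refFolder":"%s","refBaseName":"%s","outputBamName":"%s"}' \
  "$fastq_r1" "$fastq_r2" "$ref_folder" "$ref_base" "$out_bam")

echo "$payload"

# To submit, pass the payload as the last argument, as in the command above:
#   bx submit -n batchx@bioinformatics/bwa/mem:1.2.1 "$payload"
```

Quoting "$payload" keeps the JSON as a single argument, which matches the single-quoted string used in the original command.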

You have now submitted a job to BatchX to “align query sequences using bwa-mem”, a very common task performed in bioinformatics when dealing with NGS (Next Generation Sequencing) data.

Job details

Use the bx jobs command to see the list of active jobs submitted to BatchX. Feel free to check the extended details for this command.

bx jobs

You can also use the bx attach command to receive live streams from this job. This command requires the id of the job you want to attach to (returned by the submit command):

bx attach <job-id>

Note: If this is not the first job you have submitted, run bx jobs -al 1 to see the id of the last submitted job, which should correspond to the job from this tutorial.

Ultimately, if everything goes well, the job will finish with a Job status: SUCCEEDED message and an additional line displaying the path to the output file.
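
In a script, you may want to test for that success message rather than read it by eye. The sketch below assumes only that the captured job output contains the text Job status: SUCCEEDED, as quoted above. The helper name job_succeeded is hypothetical, and nothing else about the output format is assumed.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the success
# message quoted in this tutorial.
job_succeeded() {
  grep -q 'Job status: SUCCEEDED'
}

# Demonstration on a stand-in line; real input would be the captured job output.
if printf '%s\n' 'Job status: SUCCEEDED' | job_succeeded; then
  result='succeeded'
else
  result='not succeeded'
fi
echo "$result"
```

Matching on a fixed string keeps the check simple, but it would need updating if the message wording ever changes.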

In a real bioinformatics scenario you could use this output file as part of an analysis workflow to identify chromosome alterations, point mutations or other types of variants.

Continue exploring the details of this job with the bx job and bx logs commands.