Glossary

Image

Images define the computation performed by the jobs, packaging up the programs with all of the parts they need, such as libraries and other dependencies.

CAS

The BatchX file system is implemented as content-addressable storage (CAS): a way to store immutable information so that it can be retrieved based on its content rather than its location. As a result, adding duplicated content, or aliasing existing content, does not consume additional storage quota.
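As a rough illustration of content addressing (not BatchX's actual implementation), the sketch below keys each blob by the SHA-256 digest of its bytes, so storing the same content twice, or under a second name, takes no extra space. The `ContentStore` class, its method names, and the file paths are hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes (each distinct content stored once)
        self._names = {}  # path -> digest (names are only pointers to content)

    def put(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same digest, so it is not stored again.
        self._blobs.setdefault(digest, data)
        self._names[path] = digest
        return digest

    def get(self, path):
        return self._blobs[self._names[path]]

    def used_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


store = ContentStore()
first = store.put("a/reads.txt", b"ACGT" * 4)
alias = store.put("b/copy-of-reads.txt", b"ACGT" * 4)  # duplicate content
```

Both paths resolve to the same digest, and `store.used_bytes()` counts the 16 bytes only once.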

Coordinates

Coordinates are used for referring to and identifying BatchX images.

Container

Containers are isolated and restricted groups of resources (processes, files) sharing a common Linux kernel. In a way, a container is a bit like a virtual machine, but rather than creating a whole virtual operating system, containers let applications use the same Linux kernel as the system they are running on. Because no guest operating system has to be started or shipped, this gives a significant performance boost and reduces the size of the packaged application.

Credits

Credits are prepaid computational units belonging to an environment.

Docker

Docker is an open-source tool designed to make it easier to create, deploy, and run applications by using containers.

It lets you package an application with all of the parts it needs, such as libraries and other dependencies, into an image, so that it is ready to be used without any further configuration.

A Docker image is a file that represents a packaged application with all the dependencies needed to run correctly. It is a template to create containers.

Docker images are also defined as hierarchical entities built from stacked layers. This allows efficient transmission over the wire, because only the layers that the recipient lacks need to be moved.
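The transfer saving can be pictured as a set difference over layer digests. This is a simplified sketch, not Docker's actual protocol; the function name and the digests are made up for illustration.

```python
def layers_to_send(image_layers, recipient_layers):
    """Return the layers of an image that the recipient does not already hold,
    preserving the image's layer order."""
    have = set(recipient_layers)
    return [layer for layer in image_layers if layer not in have]


# Hypothetical layer digests, base layer first.
image = ["sha256:base", "sha256:libs", "sha256:app-v2"]
already_on_recipient = ["sha256:base", "sha256:libs", "sha256:app-v1"]

missing = layers_to_send(image, already_on_recipient)
```

Here only the top layer differs, so `missing` contains a single entry and the shared base and library layers are not sent again.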

Environment

BatchX environments are isolated operational scopes comprising an image repository, a file system, and a job registry, constrained by storage quota and computation credit limits.

GPU

Acronym for graphics processing unit.

Immutability

All BatchX assets (images, jobs and files) are immutable, meaning that once created they don't change. This is one of the pillars for achieving reproducibility.

Job

The execution of an image. See the Jobs section for more details.

JSON

JSON (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects.

BatchX uses JSON as the interchange format for the input and output messages of the jobs.

JSON Schema

JSON Schema specifies a JSON-based format to define the structure of JSON data for validation, documentation, and interaction control. It provides a contract for the JSON data required by a given application, and how that data can be modified.

BatchX uses JSON Schema in the manifest of the images to describe the structure and restrictions of their input and output messages.
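To make the idea of a contract concrete, here is a minimal sketch of a schema and a check against it. The property names (`inputFile`, `threads`) are hypothetical, and the `check` helper handles only the `type`, `properties`, and `required` keywords; it is a teaching aid, not a JSON Schema validator and not how BatchX validates messages.

```python
import json

# Hypothetical input schema for an image; the property names are illustrative.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "inputFile": {"type": "string"},
        "threads": {"type": "integer"},
    },
    "required": ["inputFile"],
}

# Maps a few JSON Schema type names to Python types (a deliberately small subset).
_TYPES = {"string": str, "integer": int, "object": dict}


def check(message, schema):
    """Return a list of problems; handles only `type`, `properties`, `required`."""
    if not isinstance(message, dict):
        return ["message must be a JSON object"]
    problems = []
    for name in schema.get("required", []):
        if name not in message:
            problems.append(f"missing required property: {name}")
    for name, rule in schema.get("properties", {}).items():
        if name in message:
            value = message[name]
            expected = _TYPES[rule["type"]]
            # bool is a subclass of int in Python, but it is not a JSON integer.
            if isinstance(value, bool) or not isinstance(value, expected):
                problems.append(f"{name} must be of type {rule['type']}")
    return problems


good = json.loads('{"inputFile": "reads.fastq", "threads": 4}')
bad = json.loads('{"threads": "four"}')
```

`check(good, INPUT_SCHEMA)` returns an empty list, while `check(bad, INPUT_SCHEMA)` reports the missing `inputFile` and the wrongly typed `threads`.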

Manifest

The BatchX manifest is a descriptor that every image must contain in order to be imported into the platform.

Organization

BatchX organizations allow multiple BatchX users to collaborate, sharing a common environment.

Pipeline

We use "pipeline" (also "workflow") to refer to the client-side orchestration of multiple computational steps, each one implemented as a BatchX image.
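A client-side pipeline can be sketched as a loop that feeds each job's JSON output into the next step. Everything named here is an assumption made for illustration: `run_job` stands in for whatever client call submits a job and returns its output, `fake_run_job` replaces it so the sketch runs on its own, and the image names are invented.

```python
def run_pipeline(steps, run_job, initial_input):
    """Run steps in order, feeding each job's JSON output into the next step.

    `steps` is a list of (image_name, build_input) pairs, where `build_input`
    turns the previous output into this step's input message.
    `run_job(image_name, message)` is a placeholder for the real client call.
    """
    output = initial_input
    for image_name, build_input in steps:
        output = run_job(image_name, build_input(output))
    return output


def fake_run_job(image_name, message):
    # Stand-in for a real job submission: echoes what it was asked to run.
    return {"producedBy": image_name, "received": message}


steps = [
    ("example/align", lambda prev: {"reads": prev["reads"]}),
    ("example/call-variants", lambda prev: {"alignment": prev}),
]
result = run_pipeline(steps, fake_run_job, {"reads": "sample.fastq"})
```

Keeping the orchestration on the client means each step stays an ordinary, independently runnable image, and the pipeline logic lives in regular code.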

Registry

A Docker registry is a remote service that hosts Docker images, grouped by repositories.

Repository

A Docker repository is a collection of Docker images with the same name but different tags.

Reproducibility

The ability to run a job again in the future under exactly the same conditions, producing the same output. It is extremely important for debugging and reasoning about algorithms, and it is a requirement for publishing academic or scientific research.

See Reproducibility for more details.

Tag

A Docker tag is a human-readable name given to a Docker image. Docker tags are neither unique nor immutable.
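The consequence for reproducibility can be shown with a toy tag table. The digests below are invented, and the dictionary is only a mental model of a repository, not a Docker API.

```python
# Hypothetical tag table for one repository: tag -> image digest.
tags = {"latest": "sha256:aaa", "1.0": "sha256:aaa"}  # two tags, one image

pinned = tags["latest"]        # what "latest" meant at this moment
tags["latest"] = "sha256:bbb"  # a later push moves the tag to a new image

same_image = tags["latest"] == pinned
```

After the push, `same_image` is `False`: the tag now names a different image, so a tag alone does not identify one image over time.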

Token

A sequence of characters used to encapsulate or represent the security identity and permissions of a user.

BatchX stores external tokens for accessing third-party servers, and provides BatchX access tokens for authenticating CLI sessions.

Virtual CPUs

Job computation resources are specified in terms of "virtual CPUs".

As of today, BatchX uses AWS infrastructure, where virtual CPUs are Intel Xeon hyper-threads (logical cores).

Version

See image coordinates.

Virtual machine

See container.

Workflow

See pipeline.