The BatchX contract
BatchX fully decouples image implementation from infrastructure. For images and the platform to understand each other, however, images must fulfill a series of static and run-time rules that we call the BatchX contract.
Static contract
The static contract comprises the requirements that images must meet in order to be imported into BatchX.
In particular, images must:
- Be Linux-based Docker images.
- Contain the BatchX manifest.
BatchX manifest
In order to be imported into BatchX, all images require a manifest, a JSON descriptor that describes the purpose, documentation, origin, usage and input/output messages of the image.
There are several ways of declaring a manifest for a Docker image, but the easiest one is to create the image from a Dockerfile and specify the manifest as a LABEL instruction.
File manifest (recommended)
Our recommendation is to store the manifest in a file at /batchx/manifest/manifest.json in the image file-system and to add a label of the form io.batchx.manifest={$version} right before the instruction that creates or copies this file.
...
LABEL io.batchx.manifest=07
COPY manifest.json /batchx/manifest/manifest.json
info
The latest manifest version is 10. See the Manifest version 10 section for more details.
See Hello world image for a complete example.
Inline manifest
Alternatively, the manifest can be defined directly in the Dockerfile, as the value of a label with the key io.batchx.manifest-{$version}:
...
LABEL 'io.batchx.manifest-10'='{"name":"tutorial/hello-world","version":"1.0.3","title":"BatchX Hello World","schema":{"input":{"$schema":"http://json-schema.org/draft-07/schema#","type":"object","additionalProperties":false,"properties":{"yourName":{"type":"string","required":true,"description":"Your name."}}},"output":{"$schema":"http://json-schema.org/draft-07/schema#","type":"object","additionalProperties":false,"properties":{"responseFile":{"required":true, "type":"string","format":"file","description":"Hello World response file."}}}},"author":"BatchX","readme":"'$README'","changeLog":"Corrected image documentation.","runtime":{"minMem":1000}}'
caution
This method requires escaping some special characters in the manifest, such as ' and $.
Docker build
Another valid way to attach a BatchX manifest is to pass the label to the build command:
$ docker build --label 'io.batchx.manifest-10'='{"name":"tutorial/hello-world","version":"1.0.3","title":"BatchX Hello World","schema":{"input":{"$schema":"http://json-schema.org/draft-07/schema#","type":"object","additionalProperties":false,"properties":{"yourName":{"type":"string","required":true,"description":"Your name."}}},"output":{"$schema":"http://json-schema.org/draft-07/schema#","type":"object","additionalProperties":false,"properties":{"responseFile":{"required":true, "type":"string","format":"file","description":"Hello World response file."}}}},"author":"BatchX","readme":"'$README'","changeLog":"Corrected image documentation.","runtime":{"minMem":1000}}' .
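The quoting in the command above is easy to get wrong by hand. As a minimal sketch, the same arguments could be assembled in Python and passed to the process as a list, which avoids shell quoting altogether. The function name `build_command` and its parameters are hypothetical; only the `docker build --label` option and the `io.batchx.manifest-{$version}` key come from the text above.

```python
import json


def build_command(manifest, manifest_version="10", context="."):
    """Return the argv list for a docker build that attaches the manifest as a label.

    `manifest` is a plain dict; it is serialized to compact JSON and used as
    the value of the io.batchx.manifest-<version> label.
    """
    manifest_json = json.dumps(manifest, separators=(",", ":"))
    label = "io.batchx.manifest-{}={}".format(manifest_version, manifest_json)
    # No shell is involved when this list is executed directly, so characters
    # such as ' and $ in the manifest need no extra escaping.
    return ["docker", "build", "--label", label, context]
```

The resulting list can be handed to `subprocess.run(build_command(manifest), check=True)` on a machine where Docker is installed.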
Run-time contract
The previous (static) requirements determine whether an image can be imported into BatchX.
Now, we will describe the rules that images must follow in order to seamlessly integrate with the platform at run-time. In particular, these rules cover how the platform passes input and environment information to the container, and how the container passes output results back to the platform.
Environment
BatchX passes the following information to the containers as environment entries:
Environment entry | Description |
---|---|
BX_VCPUS | Number of virtual CPUs |
BX_GPUS | Number of GPUs |
BX_MEMORY | Memory allocated for the container (MB) |
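An entry-point can use these entries to size thread pools or memory buffers. The following is a minimal sketch, assuming the values are integers encoded as strings. The function name `read_resources` and the fallback values used when an entry is missing are assumptions for running outside BatchX, not part of the contract.

```python
import os


def read_resources(env=None):
    """Return the resources advertised through the BX_* environment entries."""
    env = os.environ if env is None else env

    def as_int(key, default):
        # Fall back to the default when the entry is absent or empty,
        # for example when the image is run locally outside BatchX.
        value = env.get(key)
        return int(value) if value not in (None, "") else default

    return {
        "vcpus": as_int("BX_VCPUS", 1),
        "gpus": as_int("BX_GPUS", 0),
        "memory_mb": as_int("BX_MEMORY", None),
    }
```

Calling `read_resources()` with no arguments reads from the process environment; passing a dict makes the function easy to exercise in tests.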
Inputs
When you run an image in BatchX, you specify an input JSON message. The platform then:
- Provisions a host machine in our cloud.
- Creates a container within the host machine using the job image.
- Mounts a read-only folder /batchx/input/ in the container file-system, where the input message and referenced input files are stored.
- Runs the container entry-point.
Containers should expect the job's input message to be available at /batchx/input/input.json.
This input message is equivalent to the one originally passed by the user, with its file references remapped to files inside the /batchx/input/ folder of the container.
info
Image entry-points typically start by reading the input message from the /batchx/input/input.json file, in order to get input values and references to other input files.
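Reading the input message could look like the sketch below. The `load_input` name and its `input_dir` parameter are hypothetical; the parameter exists only so the function can be pointed at a local folder during development.

```python
import json
from pathlib import Path

# Location of the input message according to the run-time contract.
INPUT_DIR = Path("/batchx/input")


def load_input(input_dir=INPUT_DIR):
    """Parse and return the job's input message as a Python object."""
    with open(Path(input_dir) / "input.json", encoding="utf-8") as fh:
        return json.load(fh)
```

For the hello-world schema shown earlier, an entry-point would then read the user's name with `load_input()["yourName"]`.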
Outputs
BatchX requires images to store their output message at /batchx/output/output.json. This file is read by BatchX after the execution completes.
All the output files referenced from it must be located under /batchx/output/ in order to be uploaded to the BatchX filesystem of the environment.
caution
If any of the previous requirements is not met, the job will end in an INVALID_OUTPUT status.
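A minimal sketch of an entry-point's final step is shown below. It assumes that file references in the output message are written as paths to files under /batchx/output/, and it reuses the `responseFile` property from the hello-world schema. The `write_output` name, the `response.txt` file name, and the `output_dir` parameter are hypothetical.

```python
import json
from pathlib import Path

# Location of the output message according to the run-time contract.
OUTPUT_DIR = Path("/batchx/output")


def write_output(greeting, output_dir=OUTPUT_DIR):
    """Write a result file and the output message that references it."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # The referenced file lives under the output folder so it can be uploaded.
    response_file = output_dir / "response.txt"
    response_file.write_text(greeting, encoding="utf-8")

    # Assumption: the file reference is the path of the file in the container.
    message = {"responseFile": str(response_file)}
    (output_dir / "output.json").write_text(json.dumps(message), encoding="utf-8")
    return message
```

Writing the referenced files first and output.json last means the message never points at a file that does not exist yet.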
Sandboxing
BatchX containers run inside a sandbox with all networking disabled. Hence, BatchX images have to be self-sufficient.
Any external state needed for the computation must be passed in the input through the file-system.