Configuration - Slurm mode
The configuration file of a Ponos agent in Slurm mode extends the configuration file of the generic Ponos agent: only the slurm
section is added.
The following slurm keys expand environment variables:
- slurm.*.container
- slurm.*.script
- slurm.*.standard_in
- slurm.*.standard_error
- slurm.*.standard_output
- slurm.*.working_directory
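As an illustration of what expanding environment variables means for these keys, here is a minimal Python sketch. It assumes the expansion behaves like Python's os.path.expandvars ($VAR and ${VAR} forms, unknown variables left untouched); the source does not say how the agent actually implements it, and the expand_section helper is hypothetical.

```python
import os

# Keys listed above as supporting environment variable expansion.
EXPANDED_KEYS = {
    "container",
    "script",
    "standard_in",
    "standard_error",
    "standard_output",
    "working_directory",
}


def expand_section(section: dict) -> dict:
    """Return a copy of one slurm sub-section with env vars expanded.

    Assumption: expansion follows os.path.expandvars semantics.
    Only string values of the listed keys are touched.
    """
    return {
        key: os.path.expandvars(value)
        if key in EXPANDED_KEYS and isinstance(value, str)
        else value
        for key, value in section.items()
    }
```

With a hypothetical variable PONOS_DEMO_DIR set to /scratch/demo, a working_directory of $PONOS_DEMO_DIR/run would become /scratch/demo/run, while keys outside the list (account, for instance) would be left unchanged.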
slurm.cpu
Keys and values used when submitting a Slurm job for Arkindex tasks that run on CPU. They are added to those already present in the base
parameter.
slurm.gpu
Keys and values used when submitting a Slurm job for Arkindex tasks that run on GPU. They are added to those already present in the base
parameter.
slurm.daemon
Keys and values used when submitting a Slurm job for the Ponos agent in Slurm mode. They are added to those already present in the base
parameter.
If this parameter is defined and the agent is not already running inside a Slurm job, the agent submits a Slurm job that runs a Ponos agent in Slurm mode with the same configuration, and the current agent then stops. The submitted agent automatically requeues itself at the end of its execution time.
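The way the cpu, gpu and daemon sections are added to base can be sketched as a dictionary merge. This is only an illustration: job_parameters is a hypothetical helper, and the rule that the more specific section wins when a key appears in both places is an assumption the source does not confirm.

```python
def job_parameters(slurm: dict, kind: str) -> dict:
    """Combine slurm.base with slurm.<kind> ("cpu", "gpu" or "daemon").

    Missing sections are treated as empty mappings.
    """
    base = slurm.get("base") or {}
    specific = slurm.get(kind) or {}
    # Assumption: on a key collision, the specific section overrides base.
    return {**base, **specific}


# Sample values, shaped like the slurm section of the configuration file.
slurm = {
    "base": {"account": "account_ID"},
    "cpu": {"partitions": "prepost"},
    "gpu": {"constraints": "v100-32g", "qos": "qos_gpu-dev"},
}

cpu_job = job_parameters(slurm, "cpu")
gpu_job = job_parameters(slurm, "gpu")
```

Here cpu_job holds both account and partitions, and gpu_job holds account, constraints and qos.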
slurm.fingerprint
Optional parameter that overrides the MachineId, which uniquely identifies the Ponos agent. In a Slurm context, the agent may run on various nodes with different fingerprints, which breaks the unique identification expected from the agent.
Keys format
For all these parameters, the available keys are the attributes used by PySlurm (JobSubmitDescription).
The standard_output key doesn’t support all filename patterns. Only the %j pattern is supported.
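To make the %j restriction concrete, the sketch below substitutes only %j (the Slurm job ID) and leaves every other filename pattern as literal text. It illustrates the documented limitation and is not the agent's actual code; render_standard_output and the job ID 12345 are hypothetical.

```python
def render_standard_output(pattern: str, job_id: int) -> str:
    """Replace %j with the job ID; no other pattern is interpreted."""
    return pattern.replace("%j", str(job_id))


# %j is substituted, so each job gets its own output file.
supported = render_standard_output("ponos-%j.out", 12345)

# %x (job name in sbatch) is not substituted here and stays as-is.
unsupported = render_standard_output("%x-%j.out", 12345)
```

The practical consequence is to stick to names such as ponos-%j.out for standard_output and avoid patterns other than %j.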
Example
Below is an example of a YAML configuration file to start a Ponos agent in Slurm mode.
# Generic Ponos agent
...
# Ponos agent in Slurm mode
slurm:
  base: # Defaults to `{}`
    account: account_ID
  cpu: # Defaults to `{}`
    partitions: prepost
  gpu: # Defaults to `{}`
    constraints: v100-32g
    qos: qos_gpu-dev
    cpus_per_task: 6
    ntasks_per_node: 1
  daemon: # Defaults to `{}`
    standard_output: ponos-%j.out