Tasks
When a process is started, its worker runs are converted into tasks.
A task is the execution of a single worker run. There may be multiple tasks per worker run when the process has been configured with chunks, has been retried, or when a single task has been restarted.
Task runs
The status page of each process provides an overview of the tasks of the process, grouped by run. Starting a process creates the first run, and each retry of the process creates a subsequent run.
A task run is only a number that helps with this grouping, and has no other effect on the execution of a process.
When there is only one run in a process, the runs are not displayed.
| Task runs are not to be confused with worker runs. |
Task states
As a task gets executed, it will go through multiple states. These provide a general indication of its health.
- Unscheduled
-
This task is not ready to be executed. This is the default state of a task once it gets created. If a task appears to be stuck in this state, there may be multiple causes:
  - This task depends on other tasks that have not yet entered a Completed state.
  - In Community Edition, an error occurred after creating the process and the tasks could not be scheduled.
  - In Enterprise Edition, for tasks that do not depend on any other task, there are no agents available to process this task yet.
- Pending
-
All the conditions are met for this task to be started, but the task is waiting to be executed by an asynchronous worker.
If a task appears to be stuck in this state, there may be no asynchronous workers available to execute it. In particular, tasks for worker runs that require a GPU are restricted to hosts that have a GPU available, which could reduce the number of hosts that this task could run on.
- Running
-
The task is currently being executed. Detailed updates may be visible through the logs of the task.
- Completed
-
The task has finished executing and was successful. It cannot be restarted, but retrying the process will still re-execute it.
- Task error
-
The task ran, but failed. The task’s logs likely contain further details on the error.
In the Arkindex API, this is also known as failed.
- System error
-
An error occurred in the asynchronous worker or agent at any step of a task’s execution: preparing it, starting it, reporting on its status, uploading its artifacts, etc. The task’s logs likely contain further details on the error. You may need to contact a system administrator for further assistance.
In the Arkindex API, this is also known as error.
- Cancelled
-
This task has reached its time limit and its execution has been automatically stopped.
- Stopping
-
A user has requested that this task be stopped. The asynchronous worker or agent that executes this task either has not yet received this instruction, or is busy stopping the task.
- Stopped
-
The task has been stopped after a manual request.
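The states above can be summarized as a small lookup table. The following is a minimal sketch and not part of Arkindex: `TaskState`, `API_NAMES`, `FINAL_STATES` and `is_final` are hypothetical names. Only the two API names given above (failed and error) come from this page, and the choice of which states count as final is an assumption drawn from the descriptions.

```python
from enum import Enum


class TaskState(Enum):
    """Display labels of the task states listed above (hypothetical helper)."""

    UNSCHEDULED = "Unscheduled"
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    TASK_ERROR = "Task error"
    SYSTEM_ERROR = "System error"
    CANCELLED = "Cancelled"
    STOPPING = "Stopping"
    STOPPED = "Stopped"


# The only API names given on this page; no other API names are assumed here.
API_NAMES = {
    "failed": TaskState.TASK_ERROR,
    "error": TaskState.SYSTEM_ERROR,
}

# Assumption: a task in one of these states will not change state on its own.
FINAL_STATES = frozenset({
    TaskState.COMPLETED,
    TaskState.TASK_ERROR,
    TaskState.SYSTEM_ERROR,
    TaskState.CANCELLED,
    TaskState.STOPPED,
})


def is_final(state: TaskState) -> bool:
    """Return True when the task has finished, successfully or not."""
    return state in FINAL_STATES
```

A monitoring script could, for instance, keep polling a task until `is_final` returns True, then inspect the logs when the state is a task error or a system error.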
Environment variables
Some environment variables are automatically defined by Arkindex and provided to tasks depending on the context in which they execute.
PONOS_DATA-
Directory where a task can find artifacts from the tasks that it depends on. One subdirectory is created for each parent task, named after the UUID of that task. Symbolic links also give access to these directories by the slug of each task.
Another subdirectory exists with the current task’s UUID, along with a symbolic link named current. Any file created within this directory will be saved as an artifact of this task after it finishes.
PONOS_TASK-
The UUID of the current task.
ARKINDEX_PROCESS_ID-
UUID of the process that this task is executing for.
ARKINDEX_WORKER_RUN_ID-
UUID of the worker run of this task. In some cases, tasks may not have a defined worker run, so this variable may be missing.
ARKINDEX_API_URL-
URL to the root of the Arkindex API. This variable can be detected and used automatically by the Python API client.
ARKINDEX_API_CSRF_COOKIE-
Name of the cookie used for cross-site request forgery (CSRF) protection in the Arkindex API. This variable can be detected and used automatically by the Python API client.
ARKINDEX_TASK_TOKEN-
A token to be used for authentication with the API. This variable can be detected and used automatically by the Python API client.
ARKINDEX_TASK_CHUNK-
When the process has been started with chunks, set to the chunk number of the current task. For example, in a process with 2 chunks, two duplicate tasks are created, with ARKINDEX_TASK_CHUNK set to 1 in the first one and 2 in the second one.
TASK_ELEMENTS-
Path to a JSON file that contains a list of elements to process. This is only available in inference processes, when it has not been disabled in the advanced settings, or for the thumbnail generation system worker after a file import.
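As an illustration of how a task might consume these variables, here is a minimal sketch using only the Python standard library. The function names and the returned dictionary layout are hypothetical. The sketch treats the variables documented as optional as possibly absent, and assumes nothing about the content of the elements file beyond it being a JSON list.

```python
import json
import os
from pathlib import Path


def read_task_context(env=os.environ):
    """Collect the variables described above into a plain dictionary."""
    data_dir = Path(env["PONOS_DATA"])
    chunk = env.get("ARKINDEX_TASK_CHUNK")
    elements_path = env.get("TASK_ELEMENTS")
    return {
        "task_id": env["PONOS_TASK"],
        "process_id": env["ARKINDEX_PROCESS_ID"],
        # May be missing: some tasks have no defined worker run.
        "worker_run_id": env.get("ARKINDEX_WORKER_RUN_ID"),
        # Only set when the process was started with chunks.
        "chunk": int(chunk) if chunk is not None else None,
        # Files written here are saved as artifacts once the task finishes.
        "artifacts_dir": data_dir / "current",
        # Only provided in the contexts listed for TASK_ELEMENTS.
        "elements_path": Path(elements_path) if elements_path else None,
    }


def load_elements(context):
    """Return the parsed elements file, or an empty list when none is provided."""
    if context["elements_path"] is None:
        return []
    return json.loads(context["elements_path"].read_text())
```

Passing the environment as an argument keeps the helper easy to test with a plain dictionary. A task could then write its output with something like `(context["artifacts_dir"] / "result.json").write_text(...)`, where `result.json` is only an example file name.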
S3 ingestion processes
The following environment variables are only set in the context of an S3 ingestion process.
INGEST_S3_ACCESS_KEY-
Access key ID for authentication to the S3 bucket to import from. Set from the ingest.access_key_id setting.
INGEST_S3_SECRET_KEY-
Secret access key for authentication to the S3 bucket to import from. Set from the ingest.secret_access_key setting.
INGEST_S3_ENDPOINT-
A custom endpoint to use to connect to the S3 bucket, instead of Amazon Web Services. Set from the ingest.endpoint setting. When this setting is missing, the variable will not be set.
INGEST_S3_REGION-
Region to use to connect to the S3 bucket through Amazon Web Services. Set from the ingest.region setting. When this setting is missing, the variable will not be set.
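Because the endpoint and region variables may be absent, code that reads them should not assume they exist. The sketch below is a hypothetical helper that gathers the four variables into a dictionary. The key names are chosen to match the keyword arguments of `boto3.client`, but using boto3 is an assumption of this example and is not stated by this page.

```python
import os


def s3_client_kwargs(env=os.environ):
    """Build S3 connection settings from the ingestion variables."""
    kwargs = {
        "aws_access_key_id": env["INGEST_S3_ACCESS_KEY"],
        "aws_secret_access_key": env["INGEST_S3_SECRET_KEY"],
    }
    # The endpoint and region are only set when the matching setting exists.
    if "INGEST_S3_ENDPOINT" in env:
        kwargs["endpoint_url"] = env["INGEST_S3_ENDPOINT"]
    if "INGEST_S3_REGION" in env:
        kwargs["region_name"] = env["INGEST_S3_REGION"]
    return kwargs
```

If boto3 is installed, the result could be passed along as `boto3.client("s3", **s3_client_kwargs())`.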
Time limit
A maximum execution time limit can be configured to control the use of infrastructure resources. There are three levels of granularity on the time limit:
- The ponos.maximum_task_ttl setting allows system administrators to define a default system-wide time limit.
- A time limit can be configured per project using the administration interface. When it is set, it overrides the system-wide limit for all tasks created in all processes within this project.
- The time limit can be updated for individual tasks, after they have been created, using the administration interface. This limit will be reused for any restarts of this task.
When the time limit of a task is set to zero, the execution time becomes unlimited.
| It is recommended to set the system-wide time limit to the strictest limit given to the least trusted users of an instance, then to intervene per project or task to increase those limits as needed. This is particularly important to prevent abuse on instances that allow registrations. |
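The precedence between the three levels can be sketched as follows. This is an illustrative helper, not Arkindex code: the function name and parameters are hypothetical, the unit of the limit is left unspecified, and it assumes that a zero inherited from the project or system level also results in an unlimited task.

```python
from typing import Optional


def effective_time_limit(
    system_limit: int,
    project_limit: Optional[int] = None,
    task_limit: Optional[int] = None,
) -> Optional[int]:
    """Return the limit that applies to a task, or None when it is unlimited."""
    if task_limit is not None:
        # A limit set on the task itself takes precedence over everything else.
        limit = task_limit
    elif project_limit is not None:
        # A project limit overrides the system-wide default.
        limit = project_limit
    else:
        limit = system_limit
    # A limit of zero means the execution time is unlimited.
    return None if limit == 0 else limit
```

For example, `effective_time_limit(3600, project_limit=7200)` returns 7200, and setting `task_limit=0` on top of that returns None.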
Agents
| This feature is only available in Enterprise Edition. In Community Edition, tasks have no agents. |
In Enterprise Edition, tasks are executed by Ponos agents. The agent to which a task is assigned is shown next to the task’s name.
In the process list, the View agents button in the top-right corner opens a separate page. This page provides an overview of the hardware resources available on each agent, as well as the tasks assigned to it.
Farms
| This feature is only available in Enterprise Edition. In Community Edition, there are no agents and no farms. |
Every agent is part of exactly one farm. A farm provides an authentication token called a seed that allows each agent to self-register on an Arkindex instance, so that agents can be deployed on the fly.
When configuring a process, a farm can be set on the process through the advanced settings. The tasks in this process will run on any agent within this farm.
When no farm has been explicitly configured, a default farm is used. The ponos.default_farm option in the backend configuration sets this default farm.
Farms have access rights. Users will need guest access to a farm in order to use it in a process.
| On an instance that allows registrations, system administrators can grant access to the default farm to a default group to allow users to execute processes immediately after registration. |
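The farm rules above can be read as a simple fallback with an access check. The sketch below is only an illustration under stated assumptions: `select_farm` and its parameters are hypothetical, farms are represented by plain names, and `accessible_farms` stands for the farms on which the user has at least guest access.

```python
from typing import Optional, Set


def select_farm(
    process_farm: Optional[str],
    default_farm: str,
    accessible_farms: Set[str],
) -> str:
    """Pick the farm for a process, falling back to the default farm."""
    farm = process_farm if process_farm is not None else default_farm
    # Users need guest access to a farm in order to use it in a process.
    if farm not in accessible_farms:
        raise PermissionError(f"No guest access to farm {farm!r}")
    return farm
```

With this reading, a newly registered user can only start processes once access to the default farm has been granted, which matches the note above about default groups.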