.. _apftask:

apftask - task coordination service
===================================

The apftask service is used by discrete APF processes to declare
themselves as "tasks". The objective is to provide two-way feedback:
status reports from the task process itself, and task control directives
from a higher-level control mechanism. Note that *starting* a task is
not presently supported by the apftask service, though that could be
implemented with a modest extension of the task architecture.

Each defined task has a templated set of keywords established on its
behalf; custom keywords can be added to support the needs of individual
tasks. Establishing new tasks and adding keywords to existing tasks are
both straightforward, and require minimal time to push through.

.. _apftask_tasks:

What is an APF task?
--------------------

A task is any discrete operation whose status is of interest to other
software components. At the highest level, a single piece of software
may be stepping through a sequence of observations and preparatory
steps; if each subsystem used by that high-level software is a defined
task, it can use the same fundamental approach to monitor and interact
with each of these subsystems. That high-level operations sequencer
should itself be established as a task, even if only to simplify
monitoring its status.

Example tasks could include taking calibrations, running a focus cube,
or performing an observation of a designated star. In each of these
cases, the operation takes an extended time to complete, and is by no
means atomic. Each of these examples also works through phases of
operation, and in some cases, iterative steps within a discrete phase.

Simple operations that are reasonably atomic are not well suited to
representation as a task; if an operation can be described by a single
KTL keyword, or even a small set of related KTL keywords, it will be
more straightforward to use those keywords directly.
For example, it would make little sense to establish a task that reports
the status of setting (or clearing) the emergency stop at the APF.

The full set of tasks known to the apftask service is published via the
``TASKS`` keyword.

.. _apftask_keywords:

Task keywords
-------------

There are two classes of keywords established for every task: keywords
of both internal and *external* interest, and *internal* keywords used
by the apftask dispatcher and its support applications to monitor tasks.
Unless stated otherwise, all task-related keyword values are configured
to be cached, and will persist not only across task restarts, but also
across restarts of the apftask dispatcher itself.

In the descriptions below, the task prefix will be ``TASK``, where
normally it would be the prefix appropriate for that specific task; for
example, the documentation below lists ``TASK_CONTROL``, but the keyword
for a specific task may be ``CALIBRATE_CONTROL``, or
``SCRIPTOBS_CONTROL``.

.. _apftask_external_keywords:

External keywords
-----------------

.. _apftask_CONTROL:

* ``TASK_CONTROL``: the ``CONTROL`` keyword is set to one of a few
  discrete values; it is incumbent upon the task implementation to honor
  a given ``CONTROL`` request.

  ======= ==================================================
  Value   Action
  ======= ==================================================
  Proceed The task should proceed normally.
  Pause   The task should pause operations, but not exit.
  Abort   The task should cease operations and exit cleanly.
  ======= ==================================================

  If a task is paused, the ``CONTROL`` keyword should be set to
  *Proceed* to signal the task to resume activities. If the task exits
  for any reason, the ``CONTROL`` keyword will automatically reset to
  *Proceed*. The ``CONTROL`` keyword cannot be set if a task is not
  running.

.. _apftask_STATUS:

* ``TASK_STATUS``: the ``STATUS`` keyword reflects the current overall
  state of the task process.
  Typically, the only status values set by a task implementation are the
  exit statuses; the remainder are generally set by the apftask
  dispatcher as a result of other status changes, in particular changes
  to the internal ``PS_STATE`` keyword.

  ============== ================================================
  Value          Meaning
  ============== ================================================
  Running        The task is currently running. *Running* is set
                 when a task establishes itself.
  Pausing        *Pausing* is set by the dispatcher after the
                 ``CONTROL`` keyword is set to *Pause*. If the
                 ``CONTROL`` modify request was blocking, it
                 will block until the status changes to *Paused*.
  Paused         The task has successfully paused. *Paused* is
                 either set directly by the task implementation,
                 or implicitly by the apftask interface toolkit.
  Exited/Success The task has successfully completed.
                 *Exited/Success* is only set by the task
                 implementation, and should be the last operation
                 performed before the task exits.
  Exited/Failure The task failed to successfully complete.
                 *Exited/Failure* is only set by the task
                 implementation, and should be the last operation
                 performed as part of an error-handling routine
                 before the task exits.
  Exited/Unknown The task did not set a status before exiting.
                 *Exited/Unknown* is only set by the dispatcher
                 if a task was in a non-exited state and it
                 receives notification that the task is no
                 longer running.
  ============== ================================================

.. _apftask_MESSAGE:

* ``TASK_MESSAGE``: the ``MESSAGE`` keyword provides descriptive
  feedback about the activities of the task. It can be set by the
  apftask dispatcher, by the apftask interface toolkit, or by the task
  implementation.

.. _apftask_PHASE:

* ``TASK_PHASE``: the ``PHASE`` keyword is only set by the task
  implementation. It should be set to a descriptive string for each
  discrete phase of operations that the task enters.
  This information is useful not only for reporting, but could also be
  used by the task implementation to resume operations if its previous
  run was interrupted.

.. _apftask_STEP:

* ``TASK_STEP``: the ``STEP`` keyword will reset itself to zero every
  time the ``PHASE`` keyword changes. The ``STEP`` keyword can be used
  to count off repetitive steps within a discrete phase.

.. _apftask_LAST_START:

* ``TASK_LAST_START``: the ``LAST_START`` keyword is a UNIX timestamp,
  and it will automatically set itself to the current time whenever a
  task successfully establishes itself. If this value will be queried by
  a task, perhaps as part of an assessment of whether to start anew or
  resume from the currently set phase+step, it should be queried before
  the task establishes itself.

.. _apftask_LAST_SUCCESS:

* ``TASK_LAST_SUCCESS``: the ``LAST_SUCCESS`` keyword is a UNIX
  timestamp, and it will automatically set itself to the current time
  whenever a task exits with ``STATUS`` *Exited/Success*.

In addition to the above, there can be arbitrary per-task keywords used
to communicate specific information. As of the writing of this document
(December 2013), there is a set of three arbitrary string keywords
(``TASK_VAR_1``, ``TASK_VAR_2``, and ``TASK_VAR_3``) that are being
phased out in favor of more specific keywords.

.. _apftask_internal_keywords:

Internal keywords
-----------------

.. _apftask_PID:

* ``TASK_PID``: the process ID of the running task. A task *establishes*
  itself by setting its ``PID`` and ``RUNHOST`` keywords. The taskmon
  helper daemon on each host uses these keywords to identify all tasks
  on that host. When taskmon asserts that the task is no longer running
  (by clearing the ``TASK_PS_STATE`` value), the apftask dispatcher will
  reset the ``PID`` keyword to -1 and the ``RUNHOST`` keyword to the
  empty string.

.. _apftask_RUNHOST:

* ``TASK_RUNHOST``: the hostname on which a given task is running. A
  task *establishes* itself by setting its ``PID`` and ``RUNHOST``
  keywords.
  The taskmon helper daemon on each host uses these keywords to identify
  all tasks on that host. When taskmon asserts that the task is no
  longer running (by clearing the ``TASK_PS_STATE`` value), the apftask
  dispatcher will reset the ``PID`` keyword to -1 and the ``RUNHOST``
  keyword to the empty string.

.. _apftask_PS_STATE:

* ``TASK_PS_STATE``: the ``PS_STATE`` keyword is used by the taskmon
  helper application to communicate the process state of a running task
  back to the apftask dispatcher. In particular, when ``PS_STATE`` is
  set to the empty string, the apftask dispatcher interprets this to
  mean that the process is no longer running. ``PS_STATE`` should only
  ever be set by taskmon.

.. _apftask_SIGNAL:

* ``TASK_SIGNAL``: if set to a non-None value, the ``SIGNAL`` keyword
  defines the signal that taskmon will send to the running task in the
  event that ``INTERRUPT`` transitions to True.

.. _apftask_TRIPWIRE:

* ``TASK_TRIPWIRE``: if set to a non-None value, the ``TRIPWIRE``
  keyword lists the conditions that will be used to set the
  ``INTERRUPT`` keyword.

  ============== ====================================================
  Condition      Abort when...
  ============== ====================================================
  OPEN_OK        ...the ``checkapf.OPEN_OK`` keyword is False. This
                 indicates that the dome shutter and/or vents should
                 not be opened.
  MOVE_PERM      ...the ``checkapf.MOVE_PERM`` keyword is False. This
                 indicates that the telescope and any movable
                 components on the telescope should not be moved,
                 largely for personnel safety reasons.
  INSTR_PERM     ...the ``checkapf.INSTR_PERM`` keyword is False.
                 This indicates that no components in the Levy
                 spectrometer should be commanded to move, nor
                 should lamps be asked to turn on. Stopping stages
                 and turning off lamps are both permitted.
  TASK_ABORT     ...the task's ``CONTROL`` keyword is set to *Abort*.
                 This indicates that the task should exit as soon
                 as reasonably possible.
  TASK_PAUSE     ...the task's ``CONTROL`` keyword is set to *Pause*.
                 This indicates that the task should pause as soon
                 as reasonably possible.
  ============== ====================================================

.. _apftask_INTERRUPT:

* ``TASK_INTERRUPT``: will be set to True if any of the conditions
  described by ``TRIPWIRE`` are met. If ``SIGNAL`` is set to a non-None
  value, the taskmon helper application will send the requested signal
  to the running task when ``INTERRUPT`` transitions to True.

.. _apftask_taskmon:

The taskmon helper application
------------------------------

taskmon runs on each of the APF Linux hosts, and monitors all tasks that
are running on that host. It does this by monitoring all ``TASK_PID``
and ``TASK_RUNHOST`` keywords; if the ``RUNHOST`` matches the hostname
where taskmon is running, it will poll the contents of
``/proc/<pid>/status``, and write out the current status to the
``TASK_PS_STATE`` keyword. If taskmon sees that the process is no longer
running, it will set ``PS_STATE`` to the empty string. This is a
critical piece of feedback for the apftask dispatcher, and will trigger
a cascade of automatic updates, including clearing the ``PID`` and
``RUNHOST`` keywords for that task.

taskmon also monitors ``TASK_CONTROL`` keywords and a small set of
keywords from the checkapf service as part of its handling for
``TASK_TRIPWIRE`` keywords. taskmon runs as root, and will signal the
task if any of its ``TRIPWIRE`` conditions are met.

.. _apftask_interface:

The interface toolkit
---------------------

An interface toolkit was created to simplify task implementations'
interactions with the apftask service. Tasks implemented in Python can
make direct use of the :mod:`APFTask` module; tasks written in other
languages can use the apftask command-line tool, which is a
script-friendly wrapper to the core functions of the :mod:`APFTask`
module.
An example task implementation was written to demonstrate the intended
use of the apftask command-line interface; this example can be found
here::

  cvs/lroot/apf/apftask/interface/example.sin

The command-line interface has the following options available:

.. include:: ../build/apftask.txt
   :literal:
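
The keyword semantics described in the sections above can be illustrated
with a minimal sketch in plain Python. It does not use the
:mod:`APFTask` module or KTL: ``FakeKeywords`` is a hypothetical
in-memory stand-in for the keyword store, and ``run_task`` and its
``poll`` hook are names invented for this illustration. The sketch shows
a task checking ``CONTROL`` before each step, ``STEP`` resetting when
``PHASE`` changes, and ``CONTROL`` returning to *Proceed* on exit. The
choice of *Exited/Failure* for an aborted run is an assumption.

```python
"""Sketch of a task honoring CONTROL and reporting STATUS, PHASE, STEP.

Everything here is hypothetical: FakeKeywords is an in-memory stand-in
for the keyword store, not the APFTask module or a KTL service.
"""


class FakeKeywords(dict):
    """Hypothetical keyword store mimicking two documented behaviors."""

    def __setitem__(self, name, value):
        # Documented behavior: STEP resets to zero when PHASE changes.
        if name == 'PHASE' and self.get('PHASE') != value:
            super().__setitem__('STEP', 0)
        super().__setitem__(name, value)
        # Documented behavior: CONTROL resets to Proceed on task exit.
        if name == 'STATUS' and value.startswith('Exited/'):
            super().__setitem__('CONTROL', 'Proceed')


def run_task(keywords, phases, poll=lambda kw: None):
    """Step through phases, checking CONTROL before each step.

    phases is a list of (phase_name, step_count) pairs. poll is a
    hypothetical hook standing in for outside CONTROL requests; a real
    task would wait on the keyword instead of calling a hook.
    """
    keywords['CONTROL'] = 'Proceed'
    keywords['STATUS'] = 'Running'
    for phase, step_count in phases:
        keywords['PHASE'] = phase
        for _ in range(step_count):
            poll(keywords)
            # Stay paused until CONTROL leaves the Pause value.
            while keywords['CONTROL'] == 'Pause':
                keywords['STATUS'] = 'Paused'
                poll(keywords)
            if keywords['CONTROL'] == 'Abort':
                keywords['MESSAGE'] = 'Aborted during ' + phase
                # Assumption: an aborted run is reported as a failure.
                keywords['STATUS'] = 'Exited/Failure'
                return False
            keywords['STATUS'] = 'Running'
            keywords['STEP'] += 1
    keywords['STATUS'] = 'Exited/Success'
    return True


if __name__ == '__main__':
    requests = iter(['Proceed', 'Pause', 'Proceed', 'Proceed', 'Abort'])

    def scripted(kw):
        # Replay a fixed sequence of CONTROL requests, one per poll.
        kw['CONTROL'] = next(requests)

    store = FakeKeywords()
    finished = run_task(store, [('Focus', 2), ('Expose', 3)], poll=scripted)
    print(finished, store['STATUS'], store['PHASE'], store['STEP'])
```

In the scripted run, the task pauses and resumes during the first phase,
then receives an abort request on the second step of the second phase.
It exits with ``STEP`` at 1, because the counter restarted when
``PHASE`` changed.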