Command Line Interface
The Merlin library defines a number of commands to help configure your server and manage and monitor your workflow.
This module will detail every command available with Merlin.
Merlin
The entrypoint to everything related to executing Merlin commands.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--version |
boolean | Show program's version number and exit | False |
-lvl , --level |
choice(ERROR | WARNING | INFO | DEBUG ) |
Level of logging messages to be output. The choices are listed here from least to most output produced. |
INFO |
See the Configuration Commands, Workflow Management Commands, and Monitoring Commands below for more information on every command available with the Merlin library.
Configuration Commands
Since running Merlin in a distributed manner requires the configuration of a centralized server, Merlin comes equipped with three commands to help users get this set up:
- config: Create the skeleton app.yaml file needed for configuration
- info: Ensure stable connections to the server(s)
- server: Spin up containerized servers
Config (merlin config)
Create a default config (app.yaml) file in the ${HOME}/.merlin directory using the config command. This file can then be edited for your system configuration.
See more information on how to set this file up at the Configuration page.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--task_server |
string | Select the appropriate configuration for the given task server. Currently only "celery" is implemented. | "celery" |
-o , --output_dir |
path | Output the configuration in the given directory. This file can then be edited and copied into ${HOME}/.merlin . |
None |
--broker |
string | Write the initial app.yaml config file for either a rabbitmq or redis broker. The default is rabbitmq. The backend will be redis in both cases. The redis backend in the rabbitmq config shows the use of encryption for the backend. |
"rabbitmq" |
Examples:
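A minimal sketch combining two of the options from the table above. The output directory name is a hypothetical placeholder, and the call is guarded so the snippet does nothing on a machine where the merlin CLI is not installed.

```shell
# Hypothetical output directory, used only for illustration.
OUTPUT_DIR="./merlin_config"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Request an app.yaml for a redis broker, written to OUTPUT_DIR
    # rather than the default ${HOME}/.merlin location.
    merlin config --broker redis -o "$OUTPUT_DIR"
fi
```

The generated file can then be edited and copied into ${HOME}/.merlin, as described for the --output_dir option.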
Info (merlin info)
Information about your Merlin and Python configuration can be printed out by using the info command. This is helpful for debugging. Included in this command is a server check, which tests the connections to your server(s). The connection check will time out after 60 seconds.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server (merlin server)
Create a local containerized server for Merlin to connect to. Merlin server creates and configures a server in the current directory. This allows multiple instances of Merlin server to exist for different studies or uses.
Merlin server has a list of commands for interacting with the broker and results server. These commands allow the user to manage and monitor the existing server and create instances of servers if needed.
More information on configuring with Merlin server can be found at the Merlin Server Configuration page.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Commands:
Name | Description |
---|---|
init | Initialize the files needed for Merlin server |
status | Check the status of your Merlin server |
start | Start the containerized Merlin server |
stop | Stop the Merlin server |
restart | Restart an instance of the Merlin server |
config | Configure the Merlin server |
Server Init (merlin server init)
Note
If there is an existing subdirectory containing a Merlin server configuration, then only missing files will be replaced. However, it is recommended that users back up their local configurations prior to running this command.
The init subcommand initializes a new instance of Merlin server by creating configurations for other subcommands.
A main Merlin server configuration subdirectory is created at ~/.merlin/server/, which contains the local Merlin configuration and configurations for the different containerized services that Merlin server supports. Currently this includes Singularity; Docker and Podman are to be implemented in the future.
A local Merlin server configuration subdirectory called merlin_server/ will also be created in your current working directory when this command is run. This will include a container for Merlin server and associated configuration files that might be used to start the server. For example, for a redis server, a "redis.conf" file will contain settings which will be dynamically loaded when the redis server is run. This local configuration will also contain information about currently running containers.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server Status (merlin server status)
The status subcommand checks the status of the Merlin server.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server Start (merlin server start)
Warning
Newer versions of Redis require the environment variable LC_ALL to be set in order for this to work. If it is not set, the merlin server start command may seem to hang until you manually terminate it.
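One way to set the variable in a POSIX-style shell is sketched below. The value "C" is an assumption, since this page does not state which locale to use; substitute the locale that suits your system.

```shell
# Assumption: the "C" locale is acceptable in your environment.
export LC_ALL="C"
```

Adding the line to your shell startup file keeps the setting across sessions.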
The start subcommand starts the Merlin server using the container located in the local Merlin server configuration.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server Stop (merlin server stop)
The stop subcommand stops any existing container being managed and monitored by Merlin server.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server Restart (merlin server restart)
The restart subcommand performs a stop command followed by a start command on the Merlin server.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
Server Config (merlin server config)
The config subcommand edits configurations for the Merlin server. There are multiple options to allow for different configurations.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
-ip , --ipadress |
string | Set the bound IP address for Merlin server | None |
-p , --port |
integer | Set the bound port for Merlin server | None |
-pwd , --password |
filename | Set the password file for Merlin server | None |
--add-user |
string string | Add a new user for Merlin server. This requires a space-delimited username and password as input. | None |
--remove-user |
string | Remove an existing user from Merlin server | None |
-d , --directory |
path | Set the working directory for Merlin server | None |
-ss , --snapshot-seconds |
integer | Set the number of seconds before each snapshot | None |
-sc , --snapshot-changes |
integer | Set the number of database changes before each snapshot | None |
-sf , --snapshot-file |
filename | Set the name of the snapshot file | None |
-am , --append-mode |
choice(always | everysec | no ) |
Set the appendonly mode | None |
-af , --append-file |
filename | Set the name of the file for the server append/change file | None |
Examples:
Add A User and Set Snapshot File
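A sketch of such a call using the --add-user and -sf options from the table above. The username, password, and snapshot file name are hypothetical placeholders.

```shell
# Hypothetical credentials and snapshot file name.
NEW_USER="custom_username"
NEW_PASSWORD="custom_password"
SNAPSHOT_FILE="new_snapshot.rdb"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # --add-user takes a space-delimited username and password.
    merlin server config --add-user "$NEW_USER" "$NEW_PASSWORD" -sf "$SNAPSHOT_FILE"
fi
```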
Workflow Management Commands
The Merlin library provides several commands for setting up and managing your Merlin workflow:
- example: Download pre-made workflow specifications that can be modified for your own workflow needs
- purge: Clear any tasks that are currently living in the central server
- restart: Restart a workflow
- run: Send tasks to the central server
- run workers: Start up workers that will execute the tasks that exist on the central server
- stop workers: Stop existing workers
Example (merlin example)
If you want to obtain an example workflow, use the merlin example command. You can first list all of the available example workflows along with a description of each one, and then select one by name.
Selecting an example copies the example workflow to the current working directory unless you specify another path with the -p option. If the specified directory does not exist, Merlin will automatically create it.
The example workflow is generated at the chosen location, ready to be run.
For more information on these examples, visit the Examples page.
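The sequence described above might look like the following sketch. It assumes that the listing is requested by passing list in place of an example name, which this page does not confirm; the example name and destination directory are also hypothetical.

```shell
# Hypothetical example name and destination directory.
EXAMPLE_NAME="hello"
DEST_DIR="./my_examples"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin example list                            # assumed form of the listing call
    merlin example "$EXAMPLE_NAME"                 # copy into the current directory
    merlin example "$EXAMPLE_NAME" -p "$DEST_DIR"  # copy into DEST_DIR instead
fi
```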
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
-p , --path |
path | A directory path to download the example to | Current Working Directory |
Purge (merlin purge)
Warning
Any tasks reserved by workers will not be purged from the queues. All workers must first be stopped so that their tasks are returned to the task server; only then can those tasks be purged.
In short, you probably want to use merlin stop-workers before running merlin purge.
If you've executed the merlin run command and sent tasks to the server, this command can be used to remove those tasks from the server. If there are no tasks currently on the server, then this command will not do anything.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
-f |
boolean | Purge tasks without confirmation | False |
--steps |
List[string] | A space-delimited list of steps from the specification file to purge | ['all'] |
--vars |
List[string] | A space-delimited list of variables to override in the spec file. Ex: --vars MY_QUEUE=hello |
None |
Examples:
Purge All Queues From Spec File
The following command will purge all queues that exist in my_specification.yaml:
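A sketch of that call, assuming the spec file is passed as the positional argument:

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin purge "$SPEC"
fi
```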
Purge Specific Steps From Spec File
The following command will purge any queues associated with step_1 and step_3 in my_specification.yaml:
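A sketch using the --steps option with its space-delimited list, again assuming the spec file is the positional argument:

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Restrict the purge to the queues of these two steps.
    merlin purge "$SPEC" --steps step_1 step_3
fi
```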
Purge Queues Without Confirmation
The following command will ignore the confirmation prompt that's provided and purge the queues:
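A sketch using the -f option from the table above; the spec file name is a placeholder.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # -f skips the confirmation prompt, so double-check the spec first.
    merlin purge "$SPEC" -f
fi
```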
Restart (merlin restart)
To restart a previously started Merlin workflow, use the restart command and the path to the root of the Merlin workspace that was generated during the previously run workflow. This will define the tasks and queue them on the task server (also called the broker).
Merlin currently writes a file called MERLIN_FINISHED to the directory of each step that finished successfully. It uses this file to determine which steps to skip during execution of a restarted workflow.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--local |
string | Run tasks sequentially in your current shell | "distributed" |
Examples:
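A minimal sketch of a restart. The workspace directory name is a hypothetical placeholder for the root of a workspace produced by an earlier run.

```shell
# Hypothetical workspace root generated by a previous run.
WORKSPACE="./my_study_20240101-120000"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Steps that contain a MERLIN_FINISHED file are skipped.
    merlin restart "$WORKSPACE"
fi
```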
Run (merlin run)
To run a Merlin workflow, use the run command and the path to the input yaml file <input.yaml>. This will define the tasks and queue them on the task server (also called the broker).
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--local |
string | Run tasks sequentially in your current shell | "distributed" |
--vars |
List[string] | A space-delimited list of variables to override in the spec file. This list should be given after the spec file is provided. Ex: --vars LEARN=/path/to/new_learn.py EPOCHS=3 |
None |
--samplesfile |
choice(<filename>.npy | <filename>.csv | <filename>.tab ) |
Specify a file containing samples. This file should be given after the spec file is provided. | None |
--dry |
boolean | Do a Dry Run of your workflow | False |
--no-errors |
boolean | Silence the errors thrown when flux is not present | False |
--pgen |
filename | Specify a parameter generator filename to override the global.parameters block of your spec file |
None |
--pargs |
string | A string that represents a single argument to pass to a custom parameter generation function. Reuse --pargs to pass multiple arguments. [Use with --pgen ] |
None |
Examples:
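Two independent invocations are sketched below: a dry run, and a run that overrides variables with the values shown in the --vars description. The spec file name is a placeholder, and you would normally pick one of the two calls.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Dry run of the workflow.
    merlin run "$SPEC" --dry

    # Override spec variables; --vars is given after the spec file.
    merlin run "$SPEC" --vars LEARN=/path/to/new_learn.py EPOCHS=3
fi
```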
Run Workers (merlin run-workers)
The tasks queued on the broker by the merlin run command are run by a collection of workers. These workers can be run locally in the current shell or in parallel on a batch allocation. The workers are launched using the run-workers command, which reads the configuration for the worker launch from the <input.yaml> file.
Within the <input.yaml> file, the batch and merlin.resources.workers sections are both used to configure the worker launch. The top-level batch section can be overridden in the merlin.resources.workers section. Parallel workers should be scheduled using the system's batch scheduler (see the section describing Distributed Runs for more info).
Once the workers are running, tasks from the broker will be processed.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--echo |
boolean | Echo the Celery workers run command to stdout and don't start any workers | False |
--worker-args |
string | Pass arguments (all wrapped in quotes) to the Celery workers. Should be given after the input spec. | None |
--steps |
List[string] | The specific steps in the input spec that you want to run the corresponding workers for. Should be given after the input spec. | ['all'] |
--vars |
List[string] | A space-delimited list of variables to override in the spec file. This list should be given after the spec file is provided. Ex: --vars SIMWORKER=new_sim_worker |
None |
--disable-logs |
boolean | Disable logs for Celery workers. Note: Having the -l flag in your workers' args section will overwrite this flag for that worker. |
False |
Examples:
Worker Launch with Worker Args Passed
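A sketch of such a launch. The spec file name is a placeholder, and the worker arguments shown are standard Celery worker options chosen only for illustration.

```shell
SPEC="my_specification.yaml"
# Celery worker options, all wrapped in a single quoted string.
WORKER_ARGS="-l INFO --concurrency 4"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # --worker-args is given after the input spec.
    merlin run-workers "$SPEC" --worker-args "$WORKER_ARGS"
fi
```

Adding --echo to the same command prints the Celery worker command without starting any workers, which is a low-risk way to check the arguments.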
Stop Workers (merlin stop-workers)
Warning
If you've named workers identically across workflows (you shouldn't), only one might get the signal. In this case, you can send it again.
Send out a stop signal to some or all connected workers. By default, a stop will be sent to all connected workers across all workflows, having them shut down softly. This behavior can be modified with certain options.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--spec |
filename | Target only the workers named in the merlin block of the spec file given here |
None |
--queues |
List[string] | Takes a space-delimited list of specific queues as input and will stop all workers watching these queues | None |
--workers |
List[regex] | A space-delimited list of regular expressions representing workers to stop | None |
--task_server |
string | Task server type for which to stop the workers. Currently only "celery" is implemented. | "celery" |
Examples:
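The sketch below shows three targeted variants built from the options in the table above. The spec file, queue names, and worker name are hypothetical, and each call is independent of the others.

```shell
# Hypothetical names used only for illustration.
SPEC="my_specification.yaml"
WORKER_NAME="step_1_worker"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Stop only the workers named in the merlin block of a spec file.
    merlin stop-workers --spec "$SPEC"

    # Stop all workers watching specific queues.
    merlin stop-workers --queues queue_1 queue_2

    # Stop workers whose names match a regular expression.
    merlin stop-workers --workers "$WORKER_NAME"
fi
```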
Monitoring Commands
The Merlin library comes equipped with several commands to help monitor your workflow:
- detailed-status: Display task-by-task status information for a study
- monitor: Keep your allocation alive while tasks are being processed
- query-workers: Communicate with Celery to view information on active workers
- queue-info: Communicate with Celery to view the status of queues in your workflow(s)
- status: Display a summary of the status of a study
More information on all of these commands can be found below and in the Monitoring documentation.
Detailed Status (merlin detailed-status)
Warning
For the pager opened by this command to work properly, the MANPAGER or PAGER environment variable must be set to less -r.
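One way to set the variable in a POSIX-style shell is sketched below. The choice of MANPAGER over PAGER is arbitrary here, since the warning accepts either.

```shell
# Have the pager pass raw control characters through.
export MANPAGER="less -r"
```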
Display the task-by-task status of a workflow.
This command will open a pager window with task statuses. Inside this pager window, you can search and scroll through task statuses for every step of your workflow.
For more information, see the Detailed Status documentation.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--dump |
filename | The name of a csv or json file to dump the status to | None |
--task_server |
string | Task server type. Currently only "celery" is implemented. | "celery" |
-o , --output-path |
dirname | Specify a location to look for output workspaces. Only used when a spec file is passed as the argument to status . |
None |
Filter Options:
The detailed-status command comes equipped with several options to help filter the output of your status query.
Name | Type | Description | Default |
---|---|---|---|
--max-tasks |
integer | Sets a limit on how many tasks can be displayed. | None |
--return-code |
List[string] | Filter which tasks to display based on their return code. Multiple return codes can be provided using a space-delimited list. Options: SUCCESS , SOFT_FAIL , HARD_FAIL , STOP_WORKERS , RETRY , DRY_SUCCESS , UNRECOGNIZED . |
None |
--steps |
List[string] | Filter which tasks to display based on the steps that they're associated with. Multiple steps can be provided using a space-delimited list. | ['all'] |
--task-queues |
List[string] | Filter which tasks to display based on the task queues that they were/are in. Multiple task queues can be provided using a space-delimited list. | None |
--task-status |
List[string] | Filter which tasks to display based on their status. Multiple statuses can be provided using a space-delimited list. Options: INITIALIZED , RUNNING , FINISHED , FAILED , CANCELLED , DRY_RUN , UNKNOWN . |
None |
--workers |
List[string] | Filter which tasks to display based on which workers are processing them. Multiple workers can be provided using a space-delimited list. | None |
Display Options:
There are multiple options to modify the way task statuses are displayed.
Name | Type | Description | Default |
---|---|---|---|
--disable-pager |
boolean | Turn off the pager functionality when viewing the task-by-task status. Caution: This option is not recommended for large workflows as you could freeze your terminal with thousands of task statuses. | False |
--disable-theme |
boolean | Turn off styling for the status layout. | False |
--layout |
string | Alternate task-by-task status display layouts. Options: table , default . |
default |
--no-prompts |
boolean | Ignore any prompts provided. This causes the detailed-status command to default to the latest study if you provide a spec file as input. |
False |
Examples:
Check the Detailed Status Using Workspace as Input
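A sketch, assuming the workspace directory is passed as the positional argument; the directory name is hypothetical.

```shell
# Hypothetical workspace root generated by a previous run.
WORKSPACE="./my_study_20240101-120000"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin detailed-status "$WORKSPACE"
fi
```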
Check the Detailed Status Using a Specification as Input
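A sketch, assuming a spec file can be passed in the same position as a workspace; the file name is a placeholder.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin detailed-status "$SPEC"
fi
```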
This will look in the OUTPUT_PATH Reserved Variable defined within the spec file to try to find existing workspace directories associated with this spec file. If more than one is found, a prompt will be displayed for you to select a workspace directory.
Dump the Status Report to a JSON File
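A sketch using the --dump option; the workspace and output file names are hypothetical.

```shell
# Hypothetical workspace and dump file names.
WORKSPACE="./my_study_20240101-120000"
DUMP_FILE="status_report.json"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # --dump accepts the name of a csv or json file.
    merlin detailed-status "$WORKSPACE" --dump "$DUMP_FILE"
fi
```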
Display the First 8 Successful Tasks
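A sketch combining two of the filter options above; the workspace name is hypothetical.

```shell
# Hypothetical workspace root generated by a previous run.
WORKSPACE="./my_study_20240101-120000"
MAX_TASKS=8

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Keep only tasks with a SUCCESS return code and cap the display at 8.
    merlin detailed-status "$WORKSPACE" --return-code SUCCESS --max-tasks "$MAX_TASKS"
fi
```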
Monitor (merlin monitor)
Batch submission scripts may not keep the batch allocation alive if there is not a blocking process in the submission script. The merlin monitor command addresses this by providing a blocking process that checks for tasks in the queues every (sleep) seconds, where "sleep" can be defined with the --sleep option. When the queues are empty, the monitor will query Celery to see if any workers are still processing tasks from the queues. If no workers are processing any tasks from the queues and the queues are empty, the blocking process will exit and allow the allocation to end.
The monitor functionality will check for Celery workers for up to 10*(sleep) seconds before monitoring begins. This waiting loop happens when the queue(s) in the spec contain tasks but no running workers are detected, and it protects against a failed worker launch.
For more information, see the Monitoring Studies for Persistent Allocations documentation.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--steps |
List[string] | A space-delimited list of steps in the input spec that you want to query. Should be given after the input spec. | ['all'] |
--vars |
List[string] | A space-delimited list of variables to override in the spec file. This list should be given after the spec file is provided. Ex: --vars SIMWORKER=new_sim_worker |
None |
--sleep |
integer | The duration in seconds between checks for workers/tasks | 60 |
--task_server |
string | Task server type for which to monitor the workers. Currently only "celery" is implemented. | "celery" |
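As a sketch of how the --sleep value relates to the timing described earlier in this section, the snippet below picks a hypothetical 30-second interval, computes the longest initial wait for workers (10 times the interval), and then starts the monitor. The spec file name is a placeholder.

```shell
SPEC="my_specification.yaml"
# Hypothetical interval between checks, in seconds.
SLEEP_SECONDS=30

# The initial check for workers lasts at most 10 * sleep seconds.
MAX_WORKER_WAIT=$((10 * SLEEP_SECONDS))
echo "Initial worker check may last up to ${MAX_WORKER_WAIT} seconds"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Blocks until the queues are empty and no workers are busy.
    merlin monitor "$SPEC" --sleep "$SLEEP_SECONDS"
fi
```

In a batch submission script, a call like this would typically come last so that it keeps the allocation alive while tasks are processed.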
Query Workers (merlin query-workers)
Check which workers are currently connected to the task server.
This will broadcast a command to all connected workers and print the names of any that respond and the queues they're attached to. This is useful for interacting with workers, such as via merlin stop-workers --workers.
For more information, see the Query Workers documentation.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--task_server |
string | Task server type for which to query workers. Currently only "celery" is implemented. | "celery" |
--spec |
filename | Query for the workers named in the merlin block of the spec file given here |
None |
--queues |
List[string] | Takes a space-delimited list of queues as input. This will query for workers associated with the names of the queues you provide here. | None |
--workers |
List[regex] | A space-delimited list of regular expressions representing workers to query | None |
Examples:
Query Workers Based on Their Name
This will query a worker named step_1_worker:
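A sketch using the --workers option with a plain worker name:

```shell
WORKER_NAME="step_1_worker"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin query-workers --workers "$WORKER_NAME"
fi
```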
Query Workers Using Regex
This will query only workers whose names start with step:
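A sketch with an anchored regular expression. The pattern is quoted so that the shell passes it through unchanged.

```shell
# The ^ anchor restricts matches to names that begin with "step".
PATTERN='^step'

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin query-workers --workers "$PATTERN"
fi
```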
Queue Info (merlin queue-info)
Note
Prior to Merlin v1.12.0, the merlin status command would produce the same output as merlin queue-info --spec <spec_file>.
Check the status of queues to see if there are any tasks in them and/or any workers watching them.
If used without the --spec option, this will query any active queues. Active queues are queues that have a worker watching them.
For more information, see the Queue Information documentation.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--dump |
filename | The name of a csv or json file to dump the queue information to | None |
--specific-queues |
List[string] | A space-delimited list of queues to get information on | None |
--task_server |
string | Task server type. Currently only "celery" is implemented. | "celery" |
Specification Options:
Each of these options, if used, must be combined with the --spec option.
Name | Type | Description | Default |
---|---|---|---|
--spec |
filename | Query for the queues named in each step of the spec file given here | None |
--steps |
List[string] | A space-delimited list of steps in the input spec that you want to query. Should be given after the input spec. | ['all'] |
--vars |
List[string] | A space-delimited list of variables to override in the spec file. This list should be given after the spec file is provided. Ex: --vars QUEUE_NAME=new_queue_name |
None |
Examples:
Check the Status of Queues in a Spec File
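A sketch using the --spec option; the spec file name is a placeholder.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin queue-info --spec "$SPEC"
fi
```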
This is the same as running merlin status <spec_file> prior to Merlin v1.12.0.
Check the Status of Queues for Specific Steps
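A sketch that narrows the query with --steps. The step names are hypothetical and must exist in the spec file you pass.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # --steps only applies together with --spec.
    merlin queue-info --spec "$SPEC" --steps step_1 step_3
fi
```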
Status (merlin status)
Note
To obtain the same functionality as the merlin status command prior to Merlin v1.12.0, use merlin queue-info with the --spec option: merlin queue-info --spec <spec_file>
Display a high-level status summary of a workflow.
This will display the progress of each step in your workflow using progress bars and brief summaries. In each summary you can find how many tasks there are in total for a step, how many tasks are in each state, the average run time and standard deviation of run times of the tasks in the step, the task queue, and the worker that is watching the step.
For more information, see the Status documentation.
Usage:
Options:
Name | Type | Description | Default |
---|---|---|---|
-h , --help |
boolean | Show this help message and exit | False |
--cb-help |
boolean | Colorblind help option. This will utilize different symbols for each state of a task. | False |
--dump |
filename | The name of a csv or json file to dump the status to | None |
--no-prompts |
boolean | Ignore any prompts provided to the command line. This will default to the latest study if you provide a spec file rather than a study workspace. | False |
--task_server |
string | Task server type. Currently only "celery" is implemented. | "celery" |
-o , --output-path |
dirname | Specify a location to look for output workspaces. Only used when a spec file is passed as the argument to status . |
None |
Examples:
Check the Status Using a Specification as Input
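A sketch, assuming the spec file is passed as the positional argument; the file name is a placeholder.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin status "$SPEC"
fi
```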
This will look in the OUTPUT_PATH Reserved Variable defined within the spec file to try to find existing workspace directories associated with this spec file. If more than one is found, a prompt will be displayed for you to select a workspace directory.
Check the Status Using a Specification as Input & Ignore Any Prompts
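A sketch adding the --no-prompts option to the same kind of call; the spec file name is a placeholder.

```shell
SPEC="my_specification.yaml"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    # Skip the workspace selection prompt.
    merlin status "$SPEC" --no-prompts
fi
```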
If multiple workspace directories associated with the spec file provided are found, the --no-prompts option will ignore the prompt and select the most recent study that was run, based on the timestamps.
Dump the Status Report to a CSV File
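A sketch using the --dump option with a csv file; the workspace and output file names are hypothetical.

```shell
# Hypothetical workspace and dump file names.
WORKSPACE="./my_study_20240101-120000"
DUMP_FILE="status_report.csv"

# Only call merlin when it is available on PATH.
if command -v merlin >/dev/null 2>&1; then
    merlin status "$WORKSPACE" --dump "$DUMP_FILE"
fi
```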