Variables

A merlin input .yaml file can contain several kinds of variables that control workflow execution, for example through string expansion and control flow.

Note

Only user variables and OUTPUT_PATH may be reassigned or overridden from the command line.

Directory structure context

The directory structure of merlin output looks like this:

SPECROOT
    <spec.yaml>

...

OUTPUT_PATH
    MERLIN_WORKSPACE
        MERLIN_INFO
            <name>.orig.yaml
            <name>.partial.yaml
            <name>.expanded.yaml
        <step_name>.workspace
        WORKSPACE

Reserved variables

Study variables that Merlin defines. They may be referenced within a specification file but, unless noted otherwise, may not be reassigned or overridden.

Variable

Description

Example Expansion

$(SPECROOT)

Directory path of the specification file.

/globalfs/user/merlin_workflows

$(OUTPUT_PATH)

Directory path the study output will be written to. If not defined, it defaults to the current working directory. May be reassigned or overridden.

./studies

$(MERLIN_TIMESTAMP)

The time a study began. May be used as a unique identifier.

"YYYYMMDD-HHMMSS"

$(MERLIN_WORKSPACE)

Output directory generated by a study at OUTPUT_PATH. Ends with MERLIN_TIMESTAMP.

$(OUTPUT_PATH)/ensemble_name_$(MERLIN_TIMESTAMP)

$(WORKSPACE)

The workspace directory for a single step.

$(OUTPUT_PATH)/ensemble_name_$(MERLIN_TIMESTAMP)/step_name/

$(MERLIN_INFO)

Directory within MERLIN_WORKSPACE that holds the provenance specs and sample generation results. Commonly used to hold samples.npy.

$(MERLIN_WORKSPACE)/merlin_info/

$(MERLIN_SAMPLE_ID)

Sample index in an ensemble

0 1 2 3

$(MERLIN_SAMPLE_PATH)

Path in the sample directory tree to a sample’s directory, i.e. where the task is actually run.

/0/0/0/ /0/0/1/ /0/0/2/ /0/0/3/

$(MERLIN_GLOB_PATH)

All of the directories in a simulation tree as a glob (*) string

/*/*/*/*

$(MERLIN_PATHS_ALL)

A space-delimited string of all of the paths. It can be used as is in a bash for loop, for instance:

for path in $(MERLIN_PATHS_ALL)
  do
    ls $path
  done

0/0/0
0/0/1
0/0/2
0/0/3

$(MERLIN_SAMPLE_VECTOR)

Vector of merlin sample values

$(SAMPLE_COLUMN_1) $(SAMPLE_COLUMN_2) ...

$(MERLIN_SAMPLE_NAMES)

Names of merlin sample values

SAMPLE_COLUMN_1 SAMPLE_COLUMN_2 ...

$(MERLIN_SPEC_ORIGINAL_TEMPLATE)

Copy of original yaml file passed to merlin run.

$(MERLIN_INFO)/*.orig.yaml

$(MERLIN_SPEC_EXECUTED_RUN)

Parsed and processed yaml file with command-line variable substitutions included.

$(MERLIN_INFO)/*.partial.yaml

$(MERLIN_SPEC_ARCHIVED_COPY)

Archive version of MERLIN_SPEC_EXECUTED_RUN with all variables and paths fully resolved.

$(MERLIN_INFO)/*.expanded.yaml
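
The way the directory variables in the table above nest can be sketched in a few lines of Python. This is only an illustration built from the example expansions listed in the table; the function name study_layout and its arguments are hypothetical and are not part of Merlin.

```python
from datetime import datetime
from pathlib import PurePosixPath


def study_layout(output_path, study_name, step_name, started):
    """Compose the example expansions from the table (illustrative only)."""
    # MERLIN_TIMESTAMP follows the "YYYYMMDD-HHMMSS" pattern.
    timestamp = started.strftime("%Y%m%d-%H%M%S")
    # MERLIN_WORKSPACE lives under OUTPUT_PATH and ends with the timestamp.
    workspace = PurePosixPath(output_path) / f"{study_name}_{timestamp}"
    return {
        "MERLIN_TIMESTAMP": timestamp,
        "MERLIN_WORKSPACE": str(workspace),
        "MERLIN_INFO": str(workspace / "merlin_info"),
        "WORKSPACE": str(workspace / step_name),
    }


# Hypothetical study started on 2024-01-31 at 09:05:00.
layout = study_layout("studies", "ensemble_name", "step_name",
                      datetime(2024, 1, 31, 9, 5, 0))
for name, value in layout.items():
    print(f"$({name}) -> {value}")
```

Every path is derived from OUTPUT_PATH, so overriding that single variable relocates the whole study tree.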

The LAUNCHER and VLAUNCHER Variables

$(LAUNCHER) is a special case of a reserved variable since its value can be changed. It serves as an abstraction for launching a job with parallel schedulers like slurm, lsf, and flux, and it can be used within a step command. For example, say we start with this run cmd inside our step:

run:
    cmd: srun -N 1 -n 3 python script.py

We can modify this to use the $(LAUNCHER) variable like so:

batch:
    type: slurm

run:
    cmd: $(LAUNCHER) python script.py
    nodes: 1
    procs: 3

In other words, the $(LAUNCHER) variable would become srun -N 1 -n 3.
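
The substitution described above can be mimicked with a small Python sketch. It is not Merlin's implementation: render_launcher and expand_cmd are hypothetical helpers, and only the slurm form shown in this section is covered.

```python
def render_launcher(batch_type, nodes, procs):
    """Return the launcher prefix for a step (hypothetical helper)."""
    if batch_type != "slurm":
        # Other schedulers are deliberately left out of this sketch.
        raise ValueError(f"unsupported batch type in this sketch: {batch_type}")
    return f"srun -N {nodes} -n {procs}"


def expand_cmd(cmd, batch_type, nodes, procs):
    """Replace the $(LAUNCHER) token in a step command."""
    return cmd.replace("$(LAUNCHER)", render_launcher(batch_type, nodes, procs))


expanded = expand_cmd("$(LAUNCHER) python script.py", "slurm", nodes=1, procs=3)
print(expanded)
```

Keeping nodes and procs in the run section rather than in the command string means the same step can be moved to another scheduler by changing only batch.type.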

The $(VLAUNCHER) variable behaves like the $(LAUNCHER) variable. The key distinction lies in its source of information. Instead of drawing configuration options from the run section of a step, it reads specific shell variables. Merlin generates these shell variables automatically when you include the $(VLAUNCHER) variable in a step command, but you can also set them yourself. Currently, the following shell variables are supported:

VLAUNCHER Variables

Variable

Description

Default

${MERLIN_NODES}

The number of nodes

1

${MERLIN_PROCS}

The number of tasks/procs

1

${MERLIN_CORES}

The number of cores per task/proc

1

${MERLIN_GPUS}

The number of gpus per task/proc

0

Let’s say we have the following defined in our yaml file:

batch:
    type: flux

run:
    cmd: |
      MERLIN_NODES=4
      MERLIN_PROCS=2
      MERLIN_CORES=8
      MERLIN_GPUS=2
      $(VLAUNCHER) python script.py

The $(VLAUNCHER) variable would expand to flux run -N 4 -n 2 -c 8 -g 2.
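
How the defaults from the VLAUNCHER table combine with user-set shell variables can be sketched as follows. The helper names are hypothetical, and the flux flag layout simply mirrors the example above. How Merlin renders the command when only the defaults apply is not specified here, so the sketch makes no claim about that.

```python
# Defaults taken from the VLAUNCHER Variables table.
VLAUNCHER_DEFAULTS = {
    "MERLIN_NODES": 1,
    "MERLIN_PROCS": 1,
    "MERLIN_CORES": 1,
    "MERLIN_GPUS": 0,
}


def resolve_vlauncher(shell_vars):
    """Overlay user-set shell variables on the defaults (hypothetical helper)."""
    resolved = dict(VLAUNCHER_DEFAULTS)
    for name in resolved:
        if name in shell_vars:
            resolved[name] = int(shell_vars[name])
    return resolved


def flux_vlauncher(shell_vars):
    """Format the resolved values the way the flux example above is written."""
    v = resolve_vlauncher(shell_vars)
    return (f"flux run -N {v['MERLIN_NODES']} -n {v['MERLIN_PROCS']} "
            f"-c {v['MERLIN_CORES']} -g {v['MERLIN_GPUS']}")


step_vars = {"MERLIN_NODES": "4", "MERLIN_PROCS": "2",
             "MERLIN_CORES": "8", "MERLIN_GPUS": "2"}
print(flux_vlauncher(step_vars))
print(resolve_vlauncher({"MERLIN_NODES": "4"}))  # unset variables keep their defaults
```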

User variables

User variables are defined in the env section of a specification file, as in this example:

env:
    variables:
        ID: 42
        EXAMPLE_VAR:    hello

As long as they’re defined in order, you can nest user variables like this:

env:
    variables:
        EXAMPLE_VAR:    hello
        WORKER_NAME: $(EXAMPLE_VAR)_worker

Like all other Merlin variables, user variables may be used anywhere (as a yaml key or value) within a specification as below:

cmd: echo "$(EXAMPLE_VAR), world!"
...
$(WORKER_NAME):
    args: ...

If you want to programmatically define the study name, you can include variables in the description.name field as long as it makes a valid filename:

description:
    name: my_$(EXAMPLE_VAR)_study_$(ID)
    description: example of programmatic study name

The above would produce a study called my_hello_study_42.
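
The ordering rule for nested user variables and the study-name expansion can be illustrated with a short Python sketch. It assumes a single top-to-bottom pass over the env variables, which is one plausible reading of "defined in order"; expand_user_variables and expand_text are hypothetical names, not Merlin functions.

```python
import re

# Matches tokens of the form $(NAME).
TOKEN = re.compile(r"\$\((\w+)\)")


def expand_text(text, resolved):
    """Replace known $(NAME) tokens; unknown tokens are left untouched."""
    return TOKEN.sub(lambda m: resolved.get(m.group(1), m.group(0)), str(text))


def expand_user_variables(variables):
    """Resolve variables top to bottom, so a variable only sees earlier ones."""
    resolved = {}
    for name, raw in variables.items():
        resolved[name] = expand_text(raw, resolved)
    return resolved


env = expand_user_variables({
    "ID": 42,
    "EXAMPLE_VAR": "hello",
    "WORKER_NAME": "$(EXAMPLE_VAR)_worker",
})
study_name = expand_text("my_$(EXAMPLE_VAR)_study_$(ID)", env)
print(env["WORKER_NAME"], study_name)

# Defined out of order: in this sketch the reference stays unexpanded.
out_of_order = expand_user_variables({
    "WORKER_NAME": "$(EXAMPLE_VAR)_worker",
    "EXAMPLE_VAR": "hello",
})
print(out_of_order["WORKER_NAME"])
```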

Environment variables

Merlin expands Unix environment variables for you. The values of the user variables below would be expanded:

env:
    variables:
        MY_HOME: ~/
        MY_PATH: $PATH
        USERNAME: ${USER}

However, Merlin leaves environment variables found in shell scripts (think cmd and restart) alone. So this step:

- name: step1
  description: an example
  run:
    cmd: echo $PATH ; echo $(MY_PATH)

…would be expanded as:

- name: step1
  description: an example
  run:
    cmd: echo $PATH ; echo /an/example/:/path/string/
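
The split between the two behaviors can be reproduced with Python's standard library. This sketch assumes a POSIX system and uses os.path.expanduser and os.path.expandvars as stand-ins for Merlin's own expansion; the environment values (alice, /home/alice) are made up so the result is predictable.

```python
import os
from unittest import mock


def expand_env(value):
    """Expand ~ and $VAR / ${VAR} references in a user variable value."""
    return os.path.expandvars(os.path.expanduser(value))


variables = {"MY_HOME": "~/", "MY_PATH": "$PATH", "USERNAME": "${USER}"}

# Temporarily patch the environment with stand-in values for the demo.
fake_env = {"HOME": "/home/alice", "PATH": "/an/example/:/path/string/", "USER": "alice"}
with mock.patch.dict(os.environ, fake_env):
    expanded = {name: expand_env(value) for name, value in variables.items()}

# In a step command only the $(MY_PATH) token is replaced; $PATH is left for the shell.
cmd = "echo $PATH ; echo $(MY_PATH)"
expanded_cmd = cmd.replace("$(MY_PATH)", expanded["MY_PATH"])
print(expanded)
print(expanded_cmd)
```

Leaving $PATH untouched in cmd means it is resolved later by the shell that actually runs the step, which may see a different environment from the one in which the specification was parsed.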

Step return variables

Special return code variables for task steps.

Variable

Description

Example Usage

$(MERLIN_SUCCESS)

This step was successful. Keep going to the next task. Default step behavior if no exit code given.

echo "hello, world!"
exit $(MERLIN_SUCCESS)

$(MERLIN_RESTART)

Run this step’s restart command, or re-run cmd if restart is absent. The default maximum number of retries+restarts for any given step is 30. You can override this by adding a max_retries field under the run field in the specification. Issues a warning. By default, the step is retried after 1 second. To override the delay, specify retry_delay.

run:
  cmd: |
     touch my_file.txt
     echo "hi mom!" >> my_file.txt
     exit $(MERLIN_RESTART)
  restart: |
     echo "bye, mom!" >> my_file.txt
  max_retries: 23
  retry_delay: 10

$(MERLIN_RETRY)

Retry this step’s cmd command. The default maximum number of retries for any given step is 30. You can override this by adding a max_retries field under the run field in the specification. Issues a warning. By default, the step is retried after 1 second. To override the delay, specify retry_delay.

run:
  cmd: |
     touch my_file.txt
     echo "hi mom!" >> my_file.txt
     exit $(MERLIN_RETRY)
  max_retries: 23
  retry_delay: 10

$(MERLIN_SOFT_FAIL)

Mark this step as a failure, note in the warning log but keep going. Unknown return codes get translated to soft fails, so that they can be logged.

echo "Uh-oh, this sample didn't work"
exit $(MERLIN_SOFT_FAIL)

$(MERLIN_HARD_FAIL)

Something went terribly wrong and I need to stop the whole workflow. Raises a HardFailException and stops all workers connected to that step. Workers will stop after a 60 second delay to allow the step to be acknowledged by the server.

Note

Workers in isolated parts of the workflow not consuming from the bad step will continue. You can stop all workers with $(MERLIN_STOP_WORKERS).

echo "Oh no, we've created skynet! Abort!"
exit $(MERLIN_HARD_FAIL)

$(MERLIN_STOP_WORKERS)

Launch a task to stop all active workers. The stop happens after 60 seconds so that the current task can finish and acknowledge its results to the server.

# send a signal to all workers to stop
exit $(MERLIN_STOP_WORKERS)
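
The interaction between $(MERLIN_RETRY), max_retries, and retry_delay can be pictured with a small Python loop. This is a sketch of the documented behavior only, not Merlin's code: run_step is a hypothetical function, the statuses are plain strings rather than real exit codes, and the "MAX_RETRIES_EXCEEDED" result is a placeholder because what Merlin does once the limit is reached is not described here.

```python
import time


def run_step(attempt_cmd, max_retries=30, retry_delay=1):
    """Re-run attempt_cmd while it asks for a retry, up to max_retries times.

    Returns the final status and the number of retries that were used.
    """
    retries = 0
    while True:
        status = attempt_cmd()
        if status != "MERLIN_RETRY":
            return status, retries
        if retries >= max_retries:
            # Placeholder outcome for this sketch only.
            return "MAX_RETRIES_EXCEEDED", retries
        retries += 1
        time.sleep(retry_delay)  # wait before the next attempt


# A command that asks for two retries and then succeeds.
outcomes = iter(["MERLIN_RETRY", "MERLIN_RETRY", "MERLIN_SUCCESS"])
result = run_step(lambda: next(outcomes), max_retries=23, retry_delay=0)
print(result)

# A command that never succeeds stops once the retry budget is spent.
exhausted = run_step(lambda: "MERLIN_RETRY", max_retries=3, retry_delay=0)
print(exhausted)
```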