expansion
This module handles the expansion of variables in a Merlin spec file.
It provides functionality to expand user-defined variables, environment variables, and reserved variables within a spec file. The module also supports variable substitution for specific use cases, such as parameter substitutions for samples and commands, and allows for the processing of override variables provided via the command-line interface.
determine_user_variables(*user_var_dicts)
Determine user-defined variables from multiple dictionaries.
This function takes an arbitrary number of dictionaries containing user-defined variables and resolves them in order, handling variable references and expansions (e.g., environment variables, user home directory shortcuts). Variable names are converted to uppercase, and reserved words cannot be reassigned.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `user_var_dicts` | `List[Dict]` | One or more dictionaries of user variables. Each dictionary contains key-value pairs where the key is the variable name and the value is the variable's definition. | `()` |
Returns:
| Type | Description |
|---|---|
| `Dict` | A dictionary of resolved user variables, with variable names in uppercase and all references expanded. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If an attempt is made to reassign a reserved word. |
Example
```python
>>> user_vars_1 = {'OUTPUT_PATH': './studies', 'N_SAMPLES': 10}
>>> user_vars_2 = {'TARGET': 'target_dir', 'PATH': '$(SPECROOT)/$(TARGET)'}
>>> determine_user_variables(user_vars_1, user_vars_2)
{'OUTPUT_PATH': './studies', 'N_SAMPLES': '10',
 'TARGET': 'target_dir', 'PATH': '$(SPECROOT)/target_dir'}
```
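The in-order resolution described above can be approximated with a short sketch. This is not Merlin's implementation: `resolve_user_variables` and the `RESERVED` set are hypothetical names, the reserved words listed are only an illustrative subset, and letting later dictionaries win on duplicate keys is an assumption made for simplicity.

```python
import os

# Hypothetical, illustrative subset of reserved names (assumption).
RESERVED = {"SPECROOT", "WORKSPACE", "MERLIN_INFO"}


def resolve_user_variables(*user_var_dicts):
    """Resolve $(NAME) references between user variables, in definition order."""
    # Assumption: on duplicate keys, later dictionaries override earlier ones.
    merged = {}
    for var_dict in user_var_dicts:
        merged.update(var_dict)

    resolved = {}
    for key, value in merged.items():
        name = key.upper()
        if name in RESERVED:
            raise ValueError(f"Cannot reassign reserved word '{name}'.")
        text = str(value)
        # Substitute every variable that has already been resolved.
        for known_name, known_value in resolved.items():
            text = text.replace(f"$({known_name})", known_value)
        # Expand environment variables and a leading '~'.
        text = os.path.expandvars(os.path.expanduser(text))
        resolved[name] = text
    return resolved
```

Because variables are resolved in definition order, a variable can only reference names defined before it; a reference to a reserved name such as `$(SPECROOT)` is left in place for a later expansion step.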
Source code in merlin/spec/expansion.py
expand_by_line(text, var_dict)
Expand variables in a text line by line.
This function processes a multi-line text (e.g., a YAML specification) and replaces variable references in each line using a provided dictionary of variable names and their corresponding values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | The input multi-line text to process. | required |
| `var_dict` | `Dict[str, str]` | A dictionary of variable names and their corresponding values to substitute in the text. | required |
Returns:
| Type | Description |
|---|---|
| `str` | The text with all applicable variable substitutions applied, processed line by line. |
Example
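A minimal sketch of line-by-line expansion, assuming variable references use the `$(NAME)` form. `expand_text_by_line` and `VAR_PATTERN` are hypothetical stand-ins rather than Merlin's code, and leaving unknown references untouched is an assumption of this sketch.

```python
import re

# Matches references of the form $(NAME); the exact token syntax is an assumption.
VAR_PATTERN = re.compile(r"\$\((\w+)\)")


def expand_text_by_line(text, var_dict):
    """Substitute $(NAME) references in each line of a multi-line string."""

    def substitute(match):
        name = match.group(1)
        # Unknown names are left as-is so a later pass can still expand them.
        return str(var_dict.get(name, match.group(0)))

    return "\n".join(VAR_PATTERN.sub(substitute, line) for line in text.splitlines())


spec_text = "name: $(NAME)\npath: $(OUT)/$(UNKNOWN)"
print(expand_text_by_line(spec_text, {"NAME": "demo", "OUT": "/tmp"}))
```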
expand_env_vars(spec)
Expand environment variables in all sections of a spec.
This function processes all sections of a given spec object and expands
environment variables (e.g., $HOME or ~) in string values. It skips
expansion for values associated with the keys 'cmd' or 'restart', as these
are typically shell scripts where environment variable expansion would
already occur during execution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spec` | `MerlinSpec` | The spec object containing sections to process. | required |
Returns:
| Type | Description |
|---|---|
| `MerlinSpec` | The updated spec object with environment variables expanded in all applicable sections. |
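The skip rule for `cmd` and `restart` can be illustrated on plain nested dictionaries. This is a sketch under assumptions: `expand_env_in_section` and `SKIP_KEYS` are hypothetical names, and a real `MerlinSpec` object is not modeled here.

```python
import os

# Keys whose values are shell scripts and are therefore left untouched.
SKIP_KEYS = {"cmd", "restart"}


def expand_env_in_section(section, key=None):
    """Recursively expand environment variables and '~' in string values."""
    if isinstance(section, str):
        if key in SKIP_KEYS:
            return section  # the shell expands these at execution time
        return os.path.expandvars(os.path.expanduser(section))
    if isinstance(section, dict):
        return {k: expand_env_in_section(v, k) for k, v in section.items()}
    if isinstance(section, list):
        return [expand_env_in_section(item, key) for item in section]
    return section  # numbers, booleans, None are returned unchanged
```

Applied to a step such as `{"run": {"cmd": "echo $HOME", "shell": "$HOME/bin/sh"}}`, only the `shell` value would be expanded; the `cmd` string is passed through verbatim.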
expand_line(line, var_dict, env_vars=False)
Expand a single line of text by substituting variables.
This function replaces variable references in a given line of text with
their corresponding values from a provided dictionary. Optionally, it can
also expand environment variables and user home directory shortcuts (e.g., ~).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `line` | `str` | The input line of text to expand. | required |
| `var_dict` | `Dict[str, str]` | A dictionary of variable names and their corresponding values to substitute in the line. | required |
| `env_vars` | `bool` | If True, environment variables and home directory shortcuts will also be expanded. | `False` |
Returns:
| Type | Description |
|---|---|
| `str` | The expanded line of text with all applicable substitutions applied. |
Example
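A minimal sketch of how the `env_vars` flag changes the result, assuming `$(NAME)` references and standard-library expansion via `os.path.expandvars` and `os.path.expanduser`. `expand_one_line` is a hypothetical stand-in, not the function documented here, and `DEMO_ROOT` is an environment variable set only for the illustration.

```python
import os


def expand_one_line(line, var_dict, env_vars=False):
    """Substitute $(NAME) references, then optionally expand $VARS and '~'."""
    for name, value in var_dict.items():
        line = line.replace(f"$({name.upper()})", str(value))
    if env_vars:
        line = os.path.expandvars(os.path.expanduser(line))
    return line


os.environ["DEMO_ROOT"] = "/data"  # illustrative environment variable
print(expand_one_line("cd $DEMO_ROOT/$(NAME)", {"NAME": "run1"}))
print(expand_one_line("cd $DEMO_ROOT/$(NAME)", {"NAME": "run1"}, env_vars=True))
```

With the default `env_vars=False`, only the `$(NAME)` reference is replaced and `$DEMO_ROOT` is left for the shell.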
expand_spec_no_study(filepath, override_vars=None)
Expand a spec without creating a MerlinStudy.
This function processes a spec file to expand user-defined variables (those defined
in the YAML spec or supplied at the command line via override_vars) without creating
a MerlinStudy object. Expansion is limited to these user variables. It returns the
expanded text of the specification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | The path to the YAML specification file. | required |
| `override_vars` | `Dict[str, str]` | A dictionary of variable overrides to apply during the expansion. These overrides replace or supplement the variables defined in the spec. | `None` |
Returns:
| Type | Description |
|---|---|
| `str` | The expanded YAML specification as a string, with user-defined variables resolved. |
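How overrides "replace or supplement" spec variables can be sketched with a dictionary merge. The sketch is an assumption-laden illustration: `expand_spec_text` is a hypothetical helper that works on text already in memory, whereas the documented function reads the spec from `filepath`.

```python
def expand_spec_text(spec_text, spec_vars, override_vars=None):
    """Expand $(NAME) references, letting overrides win over spec variables."""
    variables = {name.upper(): str(value) for name, value in spec_vars.items()}
    if override_vars:
        # Overrides replace existing entries and add new ones.
        variables.update(
            {name.upper(): str(value) for name, value in override_vars.items()}
        )
    for name, value in variables.items():
        spec_text = spec_text.replace(f"$({name})", value)
    return spec_text


text = "run:\n  cmd: echo $(N_SAMPLES) $(OUTPUT)"
spec_vars = {"N_SAMPLES": 10, "OUTPUT": "./out"}
print(expand_spec_text(text, spec_vars, {"N_SAMPLES": 50}))
```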
get_spec_with_expansion(filepath, override_vars=None)
Load and expand a Merlin YAML specification with overrides, without creating a
MerlinStudy object.
This function processes the YAML specification file, resolves user-defined
variables, applies any overrides, and returns the result as a MerlinSpec object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | The path to the YAML specification file. | required |
| `override_vars` | `Dict[str, str]` | A dictionary of variable overrides to apply during the expansion. These overrides replace or supplement the variables defined in the YAML spec. | `None` |
Returns:
| Type | Description |
|---|---|
| `MerlinSpec` | A `MerlinSpec` object with variables expanded and overrides applied. |
parameter_substitutions_for_cmd(glob_path, sample_paths)
Generate parameter substitutions for a Merlin command.
This function creates a list of substitution pairs for a Merlin command, mapping variable references to their corresponding values. It also includes predefined return codes for various Merlin states.
Substitutions
- `$(MERLIN_GLOB_PATH)`: The provided `glob_path`.
- `$(MERLIN_PATHS_ALL)`: The provided `sample_paths`.
- `$(MERLIN_SUCCESS)`: The return code for a successful operation.
- `$(MERLIN_RESTART)`: The return code for a restart operation.
- `$(MERLIN_SOFT_FAIL)`: The return code for a soft failure.
- `$(MERLIN_HARD_FAIL)`: The return code for a hard failure.
- `$(MERLIN_RETRY)`: The return code for a retry operation.
- `$(MERLIN_STOP_WORKERS)`: The return code for stopping workers.
- `$(MERLIN_RAISE_ERROR)`: The return code for raising an error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `glob_path` | `str` | A glob pattern that yields the paths to all Merlin samples. | required |
| `sample_paths` | `str` | A delimited string containing the paths to all samples. | required |
Returns:
| Type | Description |
|---|---|
| `List[Tuple[str, str]]` | A list of tuples, where each tuple contains a variable reference and its corresponding value. |
Example
```python
>>> glob_path = "/path/to/samples/*"
>>> sample_paths = "/path/to/sample1:/path/to/sample2"
>>> parameter_substitutions_for_cmd(glob_path, sample_paths)
[
    ("$(MERLIN_GLOB_PATH)", "/path/to/samples/*"),
    ("$(MERLIN_PATHS_ALL)", "/path/to/sample1:/path/to/sample2"),
    ("$(MERLIN_SUCCESS)", "0"),
    ("$(MERLIN_RESTART)", "100"),
    ("$(MERLIN_SOFT_FAIL)", "101"),
    ("$(MERLIN_HARD_FAIL)", "102"),
    ("$(MERLIN_RETRY)", "104"),
    ("$(MERLIN_STOP_WORKERS)", "105"),
    ("$(MERLIN_RAISE_ERROR)", "106")
]
```
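One way such a list of pairs might be consumed is a plain replace loop over a command string. The sketch below is illustrative only: `apply_substitutions` is a hypothetical helper, and the pairs are copied from the example output above rather than produced by Merlin.

```python
def apply_substitutions(command, substitutions):
    """Replace each variable reference in a command with its value."""
    for reference, value in substitutions:
        command = command.replace(reference, value)
    return command


# A few pairs copied from the example output above.
substitutions = [
    ("$(MERLIN_GLOB_PATH)", "/path/to/samples/*"),
    ("$(MERLIN_SUCCESS)", "0"),
    ("$(MERLIN_SOFT_FAIL)", "101"),
]
script = "ls $(MERLIN_GLOB_PATH) || exit $(MERLIN_SOFT_FAIL)\nexit $(MERLIN_SUCCESS)"
print(apply_substitutions(script, substitutions))
```

Returning the codes as strings keeps this kind of text replacement simple, since no conversion is needed before substitution.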
parameter_substitutions_for_sample(sample, labels, sample_id, relative_path_to_sample)
Generate parameter substitutions for a specific sample.
This function creates a list of substitution pairs for a given sample,
mapping variable references (e.g., $(LABEL)) to their corresponding
values in the sample. It also includes metadata substitutions such as
the sample ID and the relative path to the sample.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sample` | `List[str]` | A list of values representing the sample. | required |
| `labels` | `List[str]` | A list of column labels corresponding to the sample values. | required |
| `sample_id` | `int` | The unique integer ID of the sample. | required |
| `relative_path_to_sample` | `str` | The relative path to the sample. | required |
Returns:
| Type | Description |
|---|---|
| `List[Tuple[str, str]]` | A list of tuples, where each tuple contains a variable reference (e.g., `$(LABEL)`) and its corresponding value. |
Example
```python
>>> sample = [10, 20]
>>> labels = ["X", "Y"]
>>> sample_id = 0
>>> relative_path_to_sample = "/0/3/4/8/9/"
>>> parameter_substitutions_for_sample(sample, labels, sample_id, relative_path_to_sample)
[
    ("$(X)", "10"),
    ("$(Y)", "20"),
    ("$(MERLIN_SAMPLE_ID)", "0"),
    ("$(MERLIN_SAMPLE_PATH)", "/0/3/4/8/9/")
]
```
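The pairing of labels with sample values can be sketched with `zip`. `build_sample_substitutions` is a hypothetical stand-in written to reproduce the example output above; uppercasing the labels is an assumption, and the sketch is not Merlin's source.

```python
def build_sample_substitutions(sample, labels, sample_id, relative_path_to_sample):
    """Pair each label with its sample value, then append sample metadata."""
    pairs = [
        (f"$({label.upper()})", str(value))  # assumption: labels are uppercased
        for label, value in zip(labels, sample)
    ]
    pairs.append(("$(MERLIN_SAMPLE_ID)", str(sample_id)))
    pairs.append(("$(MERLIN_SAMPLE_PATH)", relative_path_to_sample))
    return pairs


print(build_sample_substitutions([10, 20], ["X", "Y"], 0, "/0/3/4/8/9/"))
```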
var_ref(string)
Format a string as a variable reference.
This function takes a string, converts it to uppercase, and returns it
wrapped in the format $(<string>). If the string already contains
a token (e.g., it is already formatted as a variable reference), a warning
is logged and the original string is returned unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `string` | `str` | The input string to format as a variable reference. | required |
Returns:
| Type | Description |
|---|---|
| `str` | The formatted variable reference, or the original string if it already contains a token. |
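A minimal sketch of the behavior described above, assuming that "contains a token" means the string already includes `$(`. `make_var_ref` and the logger setup are hypothetical and only mirror the documented description.

```python
import logging

LOG = logging.getLogger(__name__)


def make_var_ref(string):
    """Wrap an uppercased name as $(NAME) unless it already holds a token."""
    # Assumption: a "token" is detected by the presence of "$(".
    if "$(" in string:
        LOG.warning("'%s' already contains a token; returning it unchanged.", string)
        return string
    return f"$({string.upper()})"


print(make_var_ref("output_path"))
print(make_var_ref("$(OUTPUT_PATH)"))
```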