Support for a distinct set of variables in the config / param-file / launch dir / work-dir paths
planned
Mighty Bird
Our primary use case for the Seqera Platform involves one user setting up a workspace with preconfigured pipelines that are accessible to others in the organization. We aim to preconfigure pipelines to a large extent while still allowing for minor user-specific or pipeline-specific customizations.
If paths could include variables that are evaluated at pipeline launch, a tidier directory structure accommodating many different users and pipelines would be possible.
Therefore, I propose that Seqera Platform/Tower support a distinct set of variables: username, pipelinename, pipelinerevision, workspacename and date come to mind. All are "known" to the Platform when launching a pipeline, but the paths they resolve to may not exist yet. The respective paths must therefore be created automatically by the agent if they are not already available on the HPC / cloud provider. For instance, a work directory could be preconfigured as:
/path/to/a/work/directory/{{username}}/{{pipelinename}}/{{runname}}
Parameter files could be specific to a pipeline revision:
/path/to/a/params/{{pipelinename}}/params-{{pipelinerevision}}.json
and account for changing parameters.
This would streamline the management of configurations, parameters, work directories, launch directories, and outputs across numerous users and pipeline runs.
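To make the proposal concrete, here is a minimal sketch of how such placeholders might be expanded at launch time. It is not how Seqera Platform works today: the function name `expand_path`, the `ALLOWED` set, and the sample values (including the revision `1.12.0`) are hypothetical and only illustrate the idea. `runname` is included because it appears in the work directory example above.

```python
import re
from datetime import date

# Hypothetical set of supported variables, taken from the names in this post.
ALLOWED = {"username", "pipelinename", "pipelinerevision",
           "workspacename", "date", "runname"}

# Matches placeholders of the form {{name}}.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand_path(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its launch-time value."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in ALLOWED:
            raise KeyError(f"unsupported variable: {name}")
        if name not in values:
            raise KeyError(f"no value for variable: {name}")
        return values[name]

    return _PLACEHOLDER.sub(substitute, template)


# Illustrative launch-time values (all assumed for this example).
launch_values = {
    "username": "user1",
    "pipelinename": "fetchngs",
    "pipelinerevision": "1.12.0",
    "runname": "kickass_stonebraker",
    "date": date.today().isoformat(),
}

work_dir = expand_path(
    "/path/to/a/work/directory/{{username}}/{{pipelinename}}/{{runname}}",
    launch_values,
)
params_file = expand_path(
    "/path/to/a/params/{{pipelinename}}/params-{{pipelinerevision}}.json",
    launch_values,
)
print(work_dir)
print(params_file)
```

Rejecting unknown variables, rather than leaving them in the path untouched, would avoid silently creating directories with literal placeholder names.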
Rob Newman
planned
Update: this is currently planned for Q4 2024/Q1 2025
Xenia silver Puma
That kind of feature would be useful for me too. Additionally, it would be nice if things like $USER (from a local VM) could default to {{username}} in Seqera (or vice versa). That way, a single configuration could be used both from the Seqera Platform and locally.
Currently, having the $USER global variable in workDir works fine outside of Seqera, but causes errors on the Seqera Platform.
Rob Newman
Continuous Squid - I merged your new feature request in with this existing one.
Rob Newman
Merged in a post:
Allow to use variables (e.g. ${USER}) as a work directory in compute environments.
Continuous Squid
I would like to be able to use variables (for instance ${USER}) to define the work directory in a compute environment (and maybe also for pipelines).
I did some tests. I tried to set up my compute environment like this:
- Work directory: /scratch/seqera-work-${USER}
- Launch directory: /scratch/seqera-work-${USER}
It creates two folders with the correct username (/scratch/seqera-work-user1 and /scratch/seqera-work-user1), which remain empty, and two folders "/scratch/seqera-work-${USER}" and "/scratch/seqera-work-${USER}" (${USER} is not interpreted), which are the ones actually used.
This feature would be useful for managed identities. The problem with managed identities is that all users have to use the same work folder and must all be allowed to access its subfolders. This is a security problem, as it means that everyone can see all the files generated.
Drew DiPalma
Merged in a post:
Ability to pass data from the OIDC user/token info into the compute/pipeline configuration
Functional Lemur
We have a lot of useful user-data configuration stored in Okta, which is "automatically" available when the user logs in. Some user-data is actually critical and needed if the user wants to access data in the DataHub. We would therefore like to be able to pass the OIDC user-info and decoded token info into the configuration of a compute environment or the pipeline launch.
An acceptable solution/first step would be to have the userinfo values available as environment variables, i.e. available to the pre- or post-run scripts. As an example, let's imagine that this is the userinfo for an authenticated user richard:
roles:
  basic:
    - A
    - B
    - C
  advanced:
    - admin
name: Richard N.
then this could be translated, for example, into:
```
OIDC_USERINFO_ROLES_BASIC_0=A
OIDC_USERINFO_ROLES_BASIC_1=B
OIDC_USERINFO_ROLES_BASIC_2=C
OIDC_USERINFO_ROLES_ADVANCED_0=admin
OIDC_USERINFO_NAME='Richard N.'
```
There are of course other solutions for this problem, but this would be sufficient for our use case.
This issue is pretty important for rolling out the Seqera Platform to our entire organization, as the majority of our data access patterns depend on the OIDC info.
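The translation described above could be sketched roughly as follows. This is only an illustration of the naming scheme from the example, not an existing Seqera Platform feature: the function name `flatten_userinfo` and the `OIDC_USERINFO` default prefix are assumptions, and the sketch does not sanitize keys that contain characters that are invalid in environment variable names.

```python
import shlex


def flatten_userinfo(value, prefix="OIDC_USERINFO"):
    """Flatten nested dicts and lists into a flat {NAME: value} mapping.

    Dict keys are upper-cased and list items are numbered from 0,
    with all parts joined by underscores.
    """
    if isinstance(value, dict):
        items = ((str(key).upper(), child) for key, child in value.items())
    elif isinstance(value, list):
        items = ((str(index), child) for index, child in enumerate(value))
    else:
        # Leaf value: emit a single variable under the accumulated name.
        return {prefix: str(value)}

    flat = {}
    for suffix, child in items:
        flat.update(flatten_userinfo(child, f"{prefix}_{suffix}"))
    return flat


# The userinfo structure from the example in this post.
userinfo = {
    "roles": {"basic": ["A", "B", "C"], "advanced": ["admin"]},
    "name": "Richard N.",
}

env = flatten_userinfo(userinfo)
for name, value in env.items():
    # shlex.quote makes values with spaces safe to paste into a shell script.
    print(f"{name}={shlex.quote(value)}")
```

Numbering list items keeps every value addressable from a pre- or post-run script without requiring the script to parse JSON.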
Rob Newman
Merged in a post:
Environment variables for launching in compute environments
Mattia
The current compute environment creation process uses a fixed form, limiting the ability to customize elements like the work directory. Since the fields are set only once, at creation, certain patterns, such as including variables in the work directory path based on user identity, aren't possible.
Introducing support for parametric variables in property fields (such as userId, userName, workflowId, and runName) would allow users to specify values dynamically, for example setting the working directory based on the user launching the workflow. This would give Seqera Platform users more adaptable and personalized compute environments.
Rob Newman
acknowledged
Rob Newman
Merged in a post:
Allow use of tokens in workflow submissions
Jon Manning
It would be useful if there were a set of tokens available in the Seqera Platform user interface. The motivating use case for this is to use the workflow run name (e.g. kickass_stonebraker) in the output and workdir paths, like:
s3://nf-tower-outputs/fetchngs/${workflow_run_name}/
Or the submission time:
s3://nf-tower-outputs/fetchngs/${workflow_submission_time}/
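As a rough illustration of what such token substitution could look like, the sketch below uses Python's standard `string.Template`, which happens to use the same `${name}` syntax. The function name `render_path` and the timestamp format are assumptions made for this example; the request itself does not specify how the submission time would be formatted.

```python
from datetime import datetime, timezone
from string import Template


def render_path(template: str, run_name: str, submitted_at: datetime) -> str:
    """Fill the ${...} tokens in a path with values known at submission."""
    tokens = {
        "workflow_run_name": run_name,
        # Assumed format: compact UTC timestamp that is safe in paths.
        "workflow_submission_time": submitted_at.astimezone(timezone.utc)
        .strftime("%Y%m%dT%H%M%SZ"),
    }
    # substitute() raises KeyError for tokens that are not defined above.
    return Template(template).substitute(tokens)


submitted = datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc)

by_name = render_path(
    "s3://nf-tower-outputs/fetchngs/${workflow_run_name}/",
    "kickass_stonebraker",
    submitted,
)
by_time = render_path(
    "s3://nf-tower-outputs/fetchngs/${workflow_submission_time}/",
    "kickass_stonebraker",
    submitted,
)
print(by_name)
print(by_time)
```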