Seqera Platform Feature Requests

Anonymous

Feature requests for the Seqera Platform (https://cloud.seqera.io)
Datasets improvements (M1): User experience improvements
Datasets within Seqera Platform facilitate structured handling of the input sample sheets required by genomics pipelines such as RNA-seq. Currently, researchers often face friction in assembling these datasets because they need to prepare CSV files externally. This could be improved by the following:

1. Remove/increase Dataset limit: Users are currently limited to 1000 datasets per workspace. Action: Raise or remove the limit.
2. Improved Dataset listing: The dataset listing is currently basic and low density; around 6 datasets can be viewed in a typical desktop browser window. This low density, plus a lack of tooling, hinders users from navigating or identifying datasets quickly. Action: Provide a metadata-rich table view, such as those used on the Runs page and elsewhere. Such a table could include the dataset name, number of rows, author, creation date, last used date, and potentially the start of the description. The table should be sortable.
3. Dataset creation from Pipeline Launch UI: Currently, creating datasets requires users to navigate away from the pipeline launch page, disrupting their workflow and causing friction. Action: Integrate dataset creation directly within the pipeline launch interface, so users can upload or enter sample data without leaving their primary task.
4. Enhanced Dataset details user interface: Currently, the dataset details view emphasizes metadata over the actual dataset content, forcing unnecessary scrolling. Action: Show the dataset's actual content prominently at the top so users can quickly verify its accuracy and completeness, reducing errors and improving productivity.
5. "Archive" or "deactivate" a Dataset: Datasets are often used once or twice and are then no longer actively needed. For GxP/clinical environments, the dataset should not be deleted or removed, but marked as inactive (disabled, "archived", or "deactivated"). This allows inspection of the dataset while excluding it from any pipeline launches. Action: Allow datasets to be marked as inactive/archived, and allow the table of datasets to be filtered to show or hide archived entries.
6. Import or link a Dataset from a URL: Datasets are becoming increasingly available via URL links. Currently, users are forced to download them locally and then add them, which adds friction, wastes storage, and loses the reference to the original source. Action: Support direct URL import/linking.
7. Keep a record of Dataset usage in Runs: Currently, it is difficult or impossible to know whether a dataset has ever been used, which limits its utility after use. Action: Record Dataset usage within Run history. With this record, Datasets become a powerful tool for the user: a rich history of run inputs, agnostic to the specifics of pipeline design and file usage.
8. Improve how Dataset versioning works: A user should be able to choose any dataset and version as the source of a pipeline run, and that dataset and version should be displayed correctly in the "Datasets" tab of the pipeline Run details page.

Additional potential milestones
Integration with new Nextflow data lineage
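
To make the archiving, URL-linking, usage-tracking, and versioning requests above more concrete, here is a minimal sketch of how such a dataset record might be modelled. It is purely illustrative: the class names, fields (`archived`, `source_url`, `usage`), and helper functions are assumptions made for this example and do not reflect the actual Seqera Platform data model or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class DatasetUsage:
    """Hypothetical record of one dataset version being used as input to a run."""
    run_id: str
    version: int
    used_at: datetime


@dataclass
class Dataset:
    """Hypothetical dataset record; all field names are illustrative assumptions."""
    name: str
    latest_version: int = 1
    archived: bool = False             # inactive, but still inspectable
    source_url: Optional[str] = None   # original URL if imported or linked
    usage: List[DatasetUsage] = field(default_factory=list)

    def last_used(self) -> Optional[datetime]:
        """Return the most recent usage time, or None if never used."""
        return max((u.used_at for u in self.usage), default=None)


def launchable(datasets: List[Dataset]) -> List[Dataset]:
    """Return only the datasets that may be offered at pipeline launch."""
    return [d for d in datasets if not d.archived]


def record_usage(dataset: Dataset, run_id: str, version: int) -> None:
    """Attach a run to a specific dataset version, rejecting archived datasets."""
    if dataset.archived:
        raise ValueError(f"dataset {dataset.name!r} is archived")
    if not 1 <= version <= dataset.latest_version:
        raise ValueError(f"unknown version {version} of {dataset.name!r}")
    dataset.usage.append(
        DatasetUsage(run_id, version, datetime.now(timezone.utc))
    )
```

In this sketch, an archived dataset keeps its full usage history and remains readable, but `launchable` hides it from the launch form and `record_usage` refuses new runs against it. The `last_used` helper would supply the "last used date" column suggested for the table view.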
10 · planned