Pipeline "Description" should have an option to render the repo README.md

in progress

Charcoal Mandrill

Currently in the Launchpad, when you register a pipeline, there is a Description box. We end up having to write a custom description for every pipeline for the users to read. However, we already have a README.md in every pipeline repo with similar details, and we even have external pipeline docs too. So being forced to re-write a static description here ends up with duplicated docs that can also get outdated easily.

I think it would be easier if there was some way that the Launchpad could (optionally?) use the contents of the pipeline repo's README.md file and render it in the Description field instead. So we could just update the repo's files and the changes would be reflected in the Launchpad, somehow. Or maybe the pipeline README.md could be displayed in a separate box on the pipeline's Launchpad page?

September 19, 2024

Rob Newman marked this post as in progress

Rob Newman

Hi Charcoal Mandrill - thanks for submitting this feature request. We have several approaches that we can take with this, which are dependent on the size of the README.md (for example, sarek has a 14KB file and rnaseq has a 12KB file):

1. If the file is small enough (<1KB or <100 lines), render the whole content while stripping out embedded or linked content (eg. images).
2. If the file is above the defined threshold:
   - Define custom start and end comments (eg. ; ) and only extract that content and render it in the Description box.
   - Truncate the README.md content to the supported threshold and then replace any additional text with a link to the repo itself.
3. Other options?

If you could describe the typical size and content of the README.md files you want us to support, it would help us refine this further. Thanks.
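The options above could be combined roughly as in the following Python sketch. It is only an illustration, not how the Launchpad works: the marker strings, the function names, the default thresholds, and the choice to try the markers before truncating are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical marker comments; the thread does not settle on their spelling.
START_MARKER = "<!-- description-start -->"
END_MARKER = "<!-- description-end -->"

# Linked badges such as [![alt](img)](link) are removed before plain images.
LINKED_IMAGE = re.compile(r"\[!\[[^\]]*\]\([^)]*\)\]\([^)]*\)")
IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
HTML_IMG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def strip_images(markdown: str) -> str:
    """Remove Markdown images, linked badges and <img> tags."""
    for pattern in (LINKED_IMAGE, IMAGE, HTML_IMG):
        markdown = pattern.sub("", markdown)
    return markdown


def extract_marked(markdown: str) -> Optional[str]:
    """Return the text between the start and end markers, or None."""
    start = markdown.find(START_MARKER)
    if start == -1:
        return None
    start += len(START_MARKER)
    end = markdown.find(END_MARKER, start)
    if end == -1:
        return None
    return markdown[start:end].strip()


def render_description(readme: str, repo_url: str,
                       max_bytes: int = 1024, max_lines: int = 100) -> str:
    """Pick the text to show in the Description box."""
    n_bytes = len(readme.encode("utf-8"))
    n_lines = len(readme.splitlines())
    text = strip_images(readme)

    # Option 1: a small file (<1KB or <100 lines) is rendered whole, minus images.
    if n_bytes < max_bytes or n_lines < max_lines:
        return text.strip()

    # Option 2a: a large file with markers renders only the marked section.
    marked = extract_marked(text)
    if marked is not None:
        return marked

    # Option 2b: otherwise keep the first max_lines lines and link to the repo.
    head = "\n".join(text.splitlines()[:max_lines]).rstrip()
    return f"{head}\n\n[Read the full README]({repo_url})"
```

For example, `render_description(readme_text, "https://github.com/nf-core/sarek")` would return either the whole image-free README, the marked section, or the first 100 lines followed by a link, depending on the file's size and whether it contains the markers.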

Adam Talbot

User provided sections are probably the best way, but it gives quite a bit of overhead to the developer which is just extra "stuff" they have to do.

I would try and read the first section of a README.md, ignoring images, badges and headers. This is likely to be the "about" section of a pipeline which should populate most of the information. If you wanted to be really clever, you could do a hierarchy:

1. user-provided description on the platform
2. custom start and end comments
3. top paragraph of the repo README
4. empty
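A minimal Python sketch of that hierarchy, under some assumptions: the function names are made up for illustration, the marked section is assumed to have been extracted already, and the "first paragraph" heuristic simply skips ATX headers (lines starting with #), raw HTML lines, images and badges. Real READMEs may need more careful Markdown parsing.

```python
import re
from typing import Optional

# Linked badges such as [![alt](img)](link) are removed before plain images.
LINKED_IMAGE = re.compile(r"\[!\[[^\]]*\]\([^)]*\)\]\([^)]*\)")
IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def first_paragraph(readme: str) -> str:
    """Return the first block of prose, skipping headers, HTML, images and badges."""
    for block in re.split(r"\n\s*\n", readme):
        kept = []
        for line in block.splitlines():
            stripped = line.strip()
            # Skip ATX headers and raw HTML lines (a heuristic, not a full parser).
            if stripped.startswith("#") or stripped.startswith("<"):
                continue
            stripped = LINKED_IMAGE.sub("", stripped)
            stripped = IMAGE.sub("", stripped).strip()
            if stripped:
                kept.append(stripped)
        if kept:
            return " ".join(kept)
    return ""


def resolve_description(user_description: Optional[str],
                        marked_section: Optional[str],
                        readme: Optional[str]) -> str:
    """Walk the hierarchy and return the first non-empty candidate."""
    candidates = (
        user_description,                            # 1. set on the platform
        marked_section,                              # 2. between custom comments
        first_paragraph(readme) if readme else None  # 3. top paragraph of README
    )
    for candidate in candidates:
        if candidate and candidate.strip():
            return candidate.strip()
    return ""                                        # 4. empty
```

The ordering means a description typed into the platform always wins, so existing pipelines would keep their current text and the README is only consulted when nothing else is set.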

There is 1 more option. Use Seqera AI to summarise the description. Here is an example for Sarek after a prompt to write a 512 character description:

nf-core/sarek is a comprehensive bioinformatics workflow for detecting variants in whole genome or targeted sequencing data. It supports various species and data types, including tumor/normal pairs. The pipeline covers functionalities from quality control and read mapping to variant calling, filtering, and annotation. It integrates multiple tools, allowing users to choose methods for their needs. Key features include UMI processing, QC, mapping, BAM processing, variant calling (using tools like ASCAT, CNVkit, DeepVariant, GATK), filtering, annotation, and QC summarization. Sarek provides a robust, flexible solution for genomic variant analysis in research and clinical settings.

Ken Brewer

Rob Newman I like the combination of 1 and 2 a lot. That seems like it would solve 95% of use cases.

Regarding the custom start and end comments, I would put this as a nice-to-have addition vs truncating the README.md, which is a must-have for this feature. Although I can't make promises on behalf of nf-core, it seems to me they would probably be more willing to include those tags as a standard part of the nf-core template if they have a generic naming schema (e.g., DESCRIPTION-RENDER) rather than a Seqera-specific naming schema (PLATFORM-RENDER).

Another option that might address the same concerns is the ability to add a link to pipeline documentation. For example, for nf-core/rnaseq 3.17.0 the best option to read the docs isn't https://github.com/nf-core/rnaseq/blob/3.17.0/README.md. It's https://nf-co.re/rnaseq/3.17.0/.

I like the zero-config simplicity of linking out to the README.md. But maybe a docs link could be a separate thing worth including at the same time?

Ken Brewer

Adam Talbot This is a pretty clever use of Seqera AI. Maybe it could be turned into a "Magic" button on the "Edit Pipeline" page, similar to how Shopify helps fill out product descriptions for online stores.

CC Sasha Dagayev

Charcoal Mandrill

Rob Newman I think many of our internal custom READMEs are in the ballpark of ~200 lines and ~5KB, but I am not sure if that is good for this purpose or not. Usually when there is some line count or byte size limit imposed on these front-end UI elements, we end up hitting them eventually.

Rob Newman

Charcoal Mandrill: Thanks for the additional info. That may be small enough for us to retrieve and parse. Do they typically have other elements than text, such as images, links, etc? Are they just markdown format?

Rob Newman marked this post as planned

Rob Newman marked this post as evaluating

Rob Newman marked this post as acknowledged