Display task hash calculation inputs in Platform | Voters

acknowledged

Rob Syme

If, during a resumed run, a task is submitted for execution rather than picked up from the previous run's cache, it is hard to tell why Nextflow's task hash calculation changed between runs.
Nextflow's cache mechanism works by creating a unique fingerprint for each task - a hash calculated by hashing the inputs to the task, the process name, the script block, etc., and then hashing those hashes.
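To illustrate the hash-of-hashes idea, here is a minimal Python sketch. It is not Nextflow's implementation: the use of SHA-256, the component names, and the example values are all assumptions made for illustration.

```python
import hashlib


def component_hash(value: str) -> str:
    """Hash one task component (illustrative; not Nextflow's real hasher)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def task_fingerprint(components: dict[str, str]) -> str:
    """Hash each component, then hash those hashes in a stable order."""
    outer = hashlib.sha256()
    for name in sorted(components):
        outer.update(component_hash(components[name]).encode("ascii"))
    return outer.hexdigest()


# Hypothetical task components for two runs; only the script differs.
run1 = {
    "process": "ALIGN",
    "script": "bwa mem ref.fa reads.fq",
    "input:reads": "reads.fq",
}
run2 = dict(run1, script="bwa mem -t 4 ref.fa reads.fq")

# A change in any single component changes the final fingerprint,
# which is what turns a would-be cache hit into a cache miss.
print(task_fingerprint(run1) == task_fingerprint(run2))
```

The final fingerprint alone only says that something changed; the per-component hashes are what show which input was responsible.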
Nextflow has the dumpHashes = true configuration option, which prints each final task hash along with the hash of each component input. By comparing these hashes between runs, users can identify which task input has changed.
Unfortunately, this option is turned off by default, so if a cache miss occurs, the user has to re-run the workflow two more times with dumpHashes = true enabled (so at least four runs in total) to get two runs whose hashes can be compared.
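Once the hashes from two runs are available, the comparison itself is a simple diff. Below is a minimal Python sketch, assuming the per-component hashes for one task have already been collected into a dictionary for each run. The component names and hash values are hypothetical and do not reflect Nextflow's actual log format.

```python
def changed_components(prev: dict[str, str], curr: dict[str, str]) -> list[str]:
    """Return the names of components whose hash differs between two runs.

    Components present in only one run are also reported as changed.
    """
    names = sorted(set(prev) | set(curr))
    return [name for name in names if prev.get(name) != curr.get(name)]


# Hypothetical component hashes for the same task in two runs.
previous_run = {"process": "a1b2", "script": "c3d4", "input:reads": "e5f6"}
resumed_run = {"process": "a1b2", "script": "9f8e", "input:reads": "e5f6"}

print(changed_components(previous_run, resumed_run))  # ['script']
```

This is the comparison that would become possible after a single cache miss if the hashes were reported on every run.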
It would be a huge time saver if Nextflow could report the hashes to Platform on every run, and Platform could include those hash details in the task details modal. This would make cache misses much more actionable and less opaque.

April 2, 2026

Rob Newman marked this post as acknowledged