Add instance ID to Nextflow & Seqera Platform reports
acknowledged
Tangerine yellow Ladybug
We would like to have the Instance ID in the report details. I would assume that the Seqera Platform already gets the Instance ID when looking up the ECS Instances to get pricing.
Having Instance ID in this view would allow us to troubleshoot issues. Currently when we get an "Out Of Space" notification, we cannot correlate that with the logs of a terminated instance to further look into the issue. We have to relaunch and hope to catch the issues in the act to find the instance to get logs from.
Charcoal Mandrill
I did some investigation into this, since we have needed this feature for a long time, and it seems that a basic implementation is more trivial than I expected.
This is a simple bash snippet that you can run inside a Nextflow job running in AWS Batch to retrieve almost all of the information asked for in this feature request. It takes advantage of some default AWS Batch env vars and the AWS IMDS (Instance Metadata Service) available in standard EC2-based jobs.
# check if we are running in AWS Batch
# (the \$ escapes are for embedding in a Nextflow script block; drop them when running as plain bash)
if [ -n "\${AWS_BATCH_JOB_ID+1}" ]; then
    echo "Job ID: \${AWS_BATCH_JOB_ID:-none}"
    echo "Queue: \${AWS_BATCH_JOB_QUEUE:-\${AWS_BATCH_JQ_NAME:-none}}"
    echo "Compute Environment: \${AWS_BATCH_CE_NAME:-none}"
    echo "Definition: \${AWS_BATCH_JOB_DEFINITION:-none}"
    echo "Attempt: \${AWS_BATCH_JOB_ATTEMPT:-none}"
    aws batch describe-jobs --jobs "\$AWS_BATCH_JOB_ID" --output json > "batch.job.json"
    # wrap the JSON in HTML so Seqera Platform can load it as a Report
    json2html.sh "batch.job.json" "batch.job.html"
    # try to access the AWS EC2 IMDS; does not work on Fargate instances
    EC2_ID="\$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"
    echo "EC2 Instance: \$EC2_ID"
    # ECS task metadata endpoint, pretty-printed with jq
    curl -s "\${AWS_CONTAINER_METADATA_URI_V4:-\${ECS_CONTAINER_METADATA_URI_V4}}" | jq . > ecs.meta.json
    json2html.sh ecs.meta.json ecs.meta.html
    aws sts get-caller-identity --query Account --output text > account_id.txt
else
    echo "AWS_BATCH_JOB_ID is not set; not running in AWS Batch"
fi
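One caveat on the IMDS call above: instances configured to require IMDSv2 reject the plain GET, so the instance ID lookup comes back empty. A minimal sketch of a token-based lookup with a fallback to the plain request is below. It is written as plain bash (no \$ escapes), and IMDS_BASE is a hypothetical override I added only so the function can be pointed somewhere else for testing. Depending on the instance's metadata hop limit, the token request may still fail from inside a container.

```shell
#!/usr/bin/env bash
# Sketch only: look up the EC2 instance ID via IMDSv2, falling back to IMDSv1.
# IMDS_BASE is a hypothetical override; the default is the standard IMDS address.

get_instance_id() {
    local base="${IMDS_BASE:-http://169.254.169.254}"
    local token id

    # IMDSv2: request a short-lived session token first
    token="$(curl -sf --max-time 2 -X PUT \
        -H 'X-aws-ec2-metadata-token-ttl-seconds: 60' \
        "${base}/latest/api/token" 2>/dev/null || true)"

    if [ -n "$token" ]; then
        # pass the token on the metadata request
        id="$(curl -sf --max-time 2 \
            -H "X-aws-ec2-metadata-token: ${token}" \
            "${base}/latest/meta-data/instance-id" 2>/dev/null || true)"
    else
        # IMDSv1 fallback (no token)
        id="$(curl -sf --max-time 2 \
            "${base}/latest/meta-data/instance-id" 2>/dev/null || true)"
    fi

    # print "unknown" instead of an empty string (e.g. on Fargate)
    echo "${id:-unknown}"
}

# usage: EC2_ID="$(get_instance_id)"; echo "EC2 Instance: $EC2_ID"
```

The short timeouts keep a task from stalling when no metadata service is reachable.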
It seems like this could just be embedded inside the Nextflow process.beforeScript section if you want this to run at the start of every task. Of course, if you are not using Fusion, you won't be able to see this information until the task completes. Not sure what the best way to handle that might be; it could be possible to hard-code an aws s3 cp upload of the data files into the work dir, so the information would be available while the task is running.
Considering this, it seems like it should have been relatively easy for Nextflow to include this itself in the .nextflow.run for collection by the parent pipeline process, and thus inclusion in Seqera Platform. Or maybe it's better to bundle it directly into the Nextflow launch template.
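For what it's worth, here is a rough sketch of the "upload while the task is running" idea. It assumes the instance ID has already been put in EC2_ID, and TASK_S3_WORKDIR is a hypothetical variable that you would have to set yourself to the task's S3 work dir; as far as I know Nextflow does not export anything like it by default.

```shell
#!/usr/bin/env bash
# Sketch only: write the task's host details to a file and, if an S3
# destination is configured, upload it right away so it is visible
# before the task finishes.
# TASK_S3_WORKDIR is a hypothetical variable, not something Nextflow sets.

record_task_host() {
    local out="${1:-task.host.txt}"

    # AWS_BATCH_* are set by AWS Batch; EC2_ID is assumed to be set earlier
    {
        echo "job_id=${AWS_BATCH_JOB_ID:-none}"
        echo "attempt=${AWS_BATCH_JOB_ATTEMPT:-none}"
        echo "instance_id=${EC2_ID:-unknown}"
    } > "$out"

    # upload is best-effort: a failure here should not fail the task
    if [ -n "${TASK_S3_WORKDIR:-}" ] && command -v aws >/dev/null 2>&1; then
        aws s3 cp --only-show-errors "$out" \
            "${TASK_S3_WORKDIR%/}/$(basename "$out")" \
            || echo "upload of $out failed; continuing" >&2
    fi
}

# usage: record_task_host task.host.txt
```

Without TASK_S3_WORKDIR the function only writes the local file, so it is harmless on executors other than AWS Batch.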
Phil Ewels
I believe that Nextflow already reports the instance ID in trace reports - it's shown as native_id (docs: https://nextflow.io/docs/latest/tracing.html#trace-report). You can see this in the tasks table in Seqera Platform (see screenshot).
Is this enough, Tangerine yellow Ladybug, or would you need it in the modal window as well? (Presumably not that difficult to add.)
Charcoal Mandrill
Phil Ewels, just to clarify: the ID shown here as the "native_id", in the format "174357da-4761-44e2-bfc8-0e43c5b3979c", is not the EC2 instance ID. It appears to be the AWS Batch Job ID. Unfortunately, while that ID is helpful to an extent, it's not enough to solve the problem described here, which is needing to access or investigate the EC2 instance itself for debugging etc. It's clear that Seqera Platform must have access to the EC2 instance in order to display the other metrics, such as the jobs' resource usage, but it's simply not showing the EC2 instance ID.
Considering when this Feature Request was filed, I am not clear whether this has been implemented in Platform yet. It's certainly been a headache when using Platform for quite some time now.
Rob Newman
acknowledged
Advisory Spoonbill
It's not that simple. Pricing is for a class of instances, not a specific instance. The instance ID would need to be queried against the ECS service for every task run. Doing this programmatically for every task would hit API rate limits on the account and could cause job submission failures.
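To make the per-task lookup concrete, this is roughly what resolving a Batch Job ID to an EC2 instance ID looks like from the AWS CLI. It is a sketch, not how Platform or Nextflow does it: the function names are made up, it assumes the newer container instance ARN format that includes the cluster name, and it costs two API calls per job, which is where the rate limit concern comes from. The lookup can also fail once the instance has been deregistered from the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: resolve an AWS Batch job ID to the EC2 instance it ran on.
# Function names are hypothetical.

# Extract the ECS cluster name from a container instance ARN of the form
#   arn:aws:ecs:REGION:ACCOUNT:container-instance/CLUSTER/ID
cluster_from_arn() {
    case "$1" in
        *:container-instance/*/*)
            local rest="${1#*:container-instance/}"
            echo "${rest%%/*}"
            ;;
        *)
            # older ARN format without a cluster name, or not an ARN at all
            return 1
            ;;
    esac
}

instance_id_for_job() {
    local job_id="$1" arn cluster

    # call 1: Batch job -> ECS container instance ARN
    arn="$(aws batch describe-jobs --jobs "$job_id" \
        --query 'jobs[0].container.containerInstanceArn' \
        --output text)" || return 1

    cluster="$(cluster_from_arn "$arn")" || return 1

    # call 2: ECS container instance -> EC2 instance ID
    aws ecs describe-container-instances \
        --cluster "$cluster" \
        --container-instances "$arn" \
        --query 'containerInstances[0].ec2InstanceId' \
        --output text
}

# usage: instance_id_for_job 174357da-4761-44e2-bfc8-0e43c5b3979c
```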
Fellow Bee
Advisory Spoonbill: Nextflow already does this
Rob Newman
Tangerine yellow Ladybug - This isn't currently available in Nextflow, but as a workaround have you tried running your pipeline with Fusion v2 to retrieve this information? Fusion allows using an S3 bucket as the pipeline scratch directory, and the .fusionfs.log file in each task folder contains extensive details, including the EC2 instance ID, so you can identify which tasks ran on each instance.
Yellow sunshine Firefly
Rob Newman: Thanks, Rob. We do not run Fusion as we do not use Wave. We use our own entry point for AWS Batch and now log the instance ID as a workaround until it’s in Tower.