Selectively enable auto-populating Data Explorer for specific Organizations and Workspaces
evaluating
Yellow sunshine Firefly
Currently, Data Explorer can either be disabled completely, or the auto-population of buckets can be selectively disabled for specified workspaces.
It would be beneficial to have the option to reverse that: globally turn off auto-population of buckets except for specific workspaces or organizations.
Rob Newman
Merged in a post:
Better management of automatic cloud vs. custom data links
Coral reef Lamprey
We have a relatively complex Seqera Platform/Tower installation and manage our system via the API. While there are places we'd like to provide people the ability to browse specific buckets within SP/Tower (Task workspace buckets, input & output buckets, and some other niceties), we definitely don't want to provide open access (including uploads) to every bucket available within our workspaces.
We're exploring what's in the UI, what's in the configuration, and what's in the API & DB, and we're finding some pretty weird things re: our ability to automatically / programmatically manage which buckets show up in the data explorer.
We'd really like to encourage maybe flattening the overall product surface here so there's not a bifurcation between what we add and what SP/Tower automatically adds based on its own logic, and also maybe some additional tunables around which buckets can be uploaded to, etc.
The data explorer is great, the magic parts are not.
Charcoal Mandrill
+1, "me too"
> We'd prefer to manage which cloud storage buckets show up in Data Explorer manually.
This. It is causing serious problems for us. Not only does Data Explorer pull in access to all S3 buckets in the AWS account on the "Data Explorer" page in the Workspace, it also gives the same global access to all these buckets from the Run page: when you click into a specific Task and open the tab showing the files in that task's S3 workdir, you can freely navigate up from the Run's workdir location, browse the entire S3 bucket, and jump over to other S3 buckets in the AWS account.
Trying to wrangle this with IAM on the AWS side has been difficult and confusing.
The entire situation could be avoided if we could configure Data Explorer at the per-workspace level with only the buckets it should show. Bonus: also let us configure the allowed bucket subdirectories. For example, allow s3://my-bucket/seqera/* and s3://my-bucket/users/* but block s3://my-bucket/secret-projects/*.
Rob Newman
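To make the requested allow/block behaviour concrete, here is a minimal sketch of how such a rule could be evaluated. It is not how Data Explorer works today. The names `ALLOW`, `BLOCK`, and `is_browsable` are hypothetical, and the sketch assumes that block rules take precedence over allow rules and that anything not explicitly allowed is hidden.

```python
from fnmatch import fnmatchcase

# Hypothetical per-workspace rules, taken from the example above.
ALLOW = ["s3://my-bucket/seqera/*", "s3://my-bucket/users/*"]
BLOCK = ["s3://my-bucket/secret-projects/*"]


def is_browsable(uri, allow=ALLOW, block=BLOCK):
    """Return True if `uri` may be shown under the given rules.

    Assumption: block patterns win over allow patterns, and any
    location that matches no allow pattern is denied by default.
    """
    if any(fnmatchcase(uri, pattern) for pattern in block):
        return False
    return any(fnmatchcase(uri, pattern) for pattern in allow)


if __name__ == "__main__":
    for candidate in (
        "s3://my-bucket/seqera/run1/work/ab/1234",
        "s3://my-bucket/secret-projects/plan.txt",
        "s3://my-bucket/",
        "s3://other-bucket/data.csv",
    ):
        print(candidate, is_browsable(candidate))
```

With default-deny semantics, navigating up from a task workdir to the bucket root or across to another bucket would fail the check, which is the behaviour described as missing above.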
Rob Newman
Merged in a post:
Enable `TOWER_DATA_EXPLORER_CLOUD_DISABLED_WORKSPACES` for all workspaces
Coral reef Lamprey
We'd prefer to manage which cloud storage buckets show up in Data Explorer manually. We also have a large and dynamic number of Workspaces, and would prefer to just disable the automatic cloud fetch globally.
Rob Newman
Adding functionality to disable or enable automatic cloud data-links per workspace via Workspace settings (or via an API endpoint) is in the Engineering backlog.
Rob Newman
Thanks for the feedback, Eric. Data Explorer is still in public preview, and we're actively improving and standardizing the UI and API to be consistent, so your input is very valuable.
Currently, only workspace users with the Maintain role and above can upload, download, and preview files in Data Explorer, so you can use role assignments to determine which of your users can access cloud storage buckets attached to your workspaces.
Data Explorer currently lists all buckets accessible to your workspace cloud credentials. To limit this across a workspace, you can make your workspace credentials more restrictive, and then have a workspace user with the Maintain role or above use the Add cloud bucket feature to manually manage cloud storage buckets and custom data-links. (Note that you can also do this programmatically via a series of GET/POST/PUT/DELETE API requests to the data-links endpoint, but I'm aware that we need to update our API docs to include this endpoint.)
Configuration-wise, you can disable automatic cloud bucket retrieval per workspace by listing comma-separated workspace IDs in the `TOWER_DATA_EXPLORER_CLOUD_DISABLED_WORKSPACES` environment variable or in the `tower.yml` file. See the docs.
I think it would be helpful for me to better understand the "weird things" your team is encountering by jumping on a call. I'll reach out via email today.
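For installations with many workspaces, the comma-separated value for `TOWER_DATA_EXPLORER_CLOUD_DISABLED_WORKSPACES` could be generated rather than maintained by hand. The sketch below only builds the string. The helper name `disabled_workspaces_value` and the sample IDs are hypothetical, it assumes workspace IDs are numeric, and how the IDs are obtained (for example from the API) is left out.

```python
def disabled_workspaces_value(workspace_ids):
    """Build a comma-separated value from an iterable of workspace IDs.

    Assumption: workspace IDs are numeric. Duplicates are dropped and
    the result is sorted so the output is stable between runs.
    """
    unique_ids = sorted({int(workspace_id) for workspace_id in workspace_ids})
    return ",".join(str(workspace_id) for workspace_id in unique_ids)


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    value = disabled_workspaces_value([222, 111, "222"])
    print(f"TOWER_DATA_EXPLORER_CLOUD_DISABLED_WORKSPACES={value}")
```

Because the setting is a deny-list, a newly created workspace still gets automatic bucket retrieval until the value is regenerated and redeployed, which is the gap the original request is about.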
Rob Newman
evaluating
Rob Newman
under review
Rob Newman
Thank you for your feedback. We are currently reviewing the mechanism for how Data Explorer displays cloud storage buckets at both the organization and workspace levels.