Per-job conda environments are built regardless of auto_install whenever one or more deps exist but a mulled env does not exist for multi-requirement tools
#13711 · Open · natefoo opened this issue Apr 12, 2022 · 3 comments
Describe the bug
If no requirements for a tool are resolved through conda, that is considered a resolution miss and the resolution process continues with the next resolver. However, if any requirement is found via conda for a tool with multiple requirements and there is not a mulled environment present, Galaxy will attempt to create a per-job environment for that tool, regardless of whether auto_install is false. As far as I can tell, this cannot be disabled.
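For context, the resolver configuration in play looks roughly like this (a minimal dependency_resolvers_conf.xml sketch; the resolver order and the presence of the other resolvers are assumptions, only the conda resolver and auto_install come from the issue):

```xml
<dependency_resolvers>
    <!-- other resolvers may be tried before conda -->
    <tool_shed_packages />
    <galaxy_packages />
    <!-- auto_install="False" is expected to stop Galaxy from building envs itself,
         yet per-job environments are still created for multi-requirement tools
         whose requirements only partially resolve through conda -->
    <conda auto_install="False" />
</dependency_resolvers>
```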
In the past I'd make sure I had mulled envs in CVMFS for every tool, but now that we've got all new tool versions using Singularity, we're running into cases where a tool has a matching biocontainers image, but some of its requirements (but not the correct mulled env) might also be installed individually through conda. On the Galaxy side this is OK as long as you explicitly send those tools to destinations where normal dependency resolution is disabled, but it breaks on the Pulsar side unless you set dependency_resolution on that destination (in the Galaxy job conf) to local.
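That Pulsar-side workaround looks roughly like this in an XML job conf (a sketch; the destination id and the singularity_enabled param are assumptions for illustration, only dependency_resolution set to local comes from the issue):

```xml
<destinations>
    <!-- hypothetical Pulsar destination: force dependency resolution to happen
         on the Galaxy side ("local") so Pulsar does not try to build a
         per-job conda env on the remote end -->
    <destination id="pulsar_singularity" runner="pulsar">
        <param id="dependency_resolution">local</param>
        <param id="singularity_enabled">true</param>
        <!-- other Pulsar/destination params omitted -->
    </destination>
</destinations>
```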
So there are some workarounds if you're aware of what's going on, but... I am not a big fan of per-job environments and would be happy to see support for them removed. I don't know if people have had success with them, but basically my issues with them are:
- Env creation on a per-job basis is incredibly slow; there is essentially no caching or reuse.
- The per-job environment is built from an exported list of packages and pinned versions taken from each installed dep, and those packages at those versions may no longer be installable.
- If you're lucky, things "appear to work" for the user/admin until one day the requirements can no longer be solved or downloaded, at which point the tool is broken and very difficult to fix.
IMO we are much better off just dropping them entirely. If you have auto_install true, it should attempt to install a mulled env. If not, it should fail resolution and (potentially) fail the entire tool.
Galaxy Version and/or server at which you observed the bug
all
To Reproduce
Steps to reproduce the behavior:
1. Install a tool with multiple requirements with a conda resolver enabled, but do not install the requirements (an example requirements block is sketched after these steps).
2. Run the tool; notice that nothing resolves but the tool executes (and fails if things it needs aren't on $PATH).
3. Install one, but not all, of the dependencies (i.e. no mulled env).
4. Run the tool; notice that a per-job environment is created.
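To make step 1 concrete, this is what a multi-requirement tool's requirements block typically looks like (package names and versions here are illustrative, not from the issue):

```xml
<requirements>
    <!-- two or more requirements means Galaxy looks for a mulled env;
         if only one of these is installed individually via conda, a
         per-job environment gets built at run time -->
    <requirement type="package" version="1.15">samtools</requirement>
    <requirement type="package" version="2.30.0">bedtools</requirement>
</requirements>
```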
Expected behavior
If nothing else, auto_install should be respected, or a new option could be added to control this behavior.
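Purely as a hypothetical sketch of what such an option might look like (the per_job_environments attribute does not exist; only auto_install is a real conda resolver option here):

```xml
<dependency_resolvers>
    <!-- hypothetical: an explicit switch so admins can turn off per-job env
         creation even when some requirements resolve through conda -->
    <conda auto_install="False" per_job_environments="False" />
</dependency_resolvers>
```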
> IMO we are much better off just dropping them entirely. If you have auto_install true, it should attempt to install a mulled env. If not, it should fail resolution and (potentially) fail the entire tool.
This has shown up on Galaxy Australia a few times, most recently with dbbuilder galaxyproteomics/tools-galaxyp#739. I can't see what the use case would be for per-job conda environments. I'd prefer the jobs to fail so that we can sort out an appropriate environment/container for the tool.