Per-job conda environments are built regardless of auto_install whenever one or more deps exist but a mulled env does not exist for multi-requirement tools #13711

Open
natefoo opened this issue Apr 12, 2022 · 3 comments

Comments

@natefoo
Copy link
Member

natefoo commented Apr 12, 2022

Describe the bug
If none of a tool's requirements are resolved through conda, that is considered a resolution miss and the resolution process continues with the next resolver. However, if any requirement of a multi-requirement tool is found via conda and a mulled environment is not present, Galaxy will attempt to create a per-job environment for that tool, even when auto_install is false. As far as I can tell, this cannot be disabled.

In the past I'd make sure I had mulled envs in CVMFS for every tool, but now that all new tool versions run via Singularity, we're running into cases where a tool has a matching BioContainers image, but some of its requirements (though not the correct mulled env) might also be installed individually through conda. On the Galaxy side this is OK as long as you explicitly send those tools to destinations where normal dependency resolution is disabled, but it breaks on the Pulsar side unless you set dependency_resolution on that destination (in the Galaxy job conf) to local, as sketched below.
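For reference, a rough job conf sketch of that Pulsar-side workaround, assuming an XML job conf and a Pulsar runner plugin defined elsewhere; the destination and runner ids are illustrative:

```xml
<!-- job_conf.xml (excerpt): illustrative ids, adjust for your deployment -->
<destinations>
    <destination id="pulsar_cluster" runner="pulsar">
        <!-- Per the workaround above: keep dependency resolution on the
             Galaxy side rather than letting Pulsar attempt it remotely. -->
        <param id="dependency_resolution">local</param>
    </destination>
</destinations>
```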

So there are workarounds if you're aware of what's going on, but... I am not a big fan of per-job environments and would be happy to see support for them removed. I don't know whether people have had success with them, but basically my issues with them are:

  1. Env creation is incredibly slow on a per-job basis; there is essentially no caching or reuse.
  2. The list of packages and pinned versions from each installed dep is exported and re-installed into the per-job environment, but those deps at those exact versions may no longer be installable.
  3. If you're lucky, things "appear to work" for the user/admin until one day the requirements can no longer be solved or downloaded, at which point the tool is broken and very difficult to fix.

IMO we are much better off just dropping them entirely. If you have auto_install true, it should attempt to install a mulled env. If not, it should fail resolution and (potentially) fail the entire tool.
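To make the proposal concrete, here is a hypothetical sketch of the resolution policy described above for multi-requirement tools (names and return values are illustrative, not Galaxy's actual resolver internals):

```python
# Hypothetical sketch of the proposed policy; not Galaxy's actual code.
def resolve_multi_requirement_tool(mulled_env_exists: bool, auto_install: bool) -> str:
    if mulled_env_exists:
        # A prebuilt mulled env covering all requirements exists: use it.
        return "mulled_env"
    if auto_install:
        # auto_install is true: try to install the mulled env, not a per-job env.
        return "install_mulled_env"
    # auto_install is false and no mulled env: report a resolution miss so the
    # next resolver runs, or the tool fails outright. No per-job fallback.
    return "unresolved"
```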

Galaxy Version and/or server at which you observed the bug
all

To Reproduce
Steps to reproduce the behavior:

  1. Install a tool with multiple requirements with a conda resolver enabled, but do not install the requirements.
  2. Run the tool; notice that nothing resolves, but the tool executes (and fails if the things it needs aren't on $PATH).
  3. Install one of the dependencies individually, but not all of them and not the mulled env (see the sketch after these steps).
  4. Run the tool; notice that a per-job environment is created.
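For step 3, a minimal sketch of installing a single requirement where Galaxy's conda resolver will find it, assuming the default __<package>@<version> env naming convention; the path and the samtools requirement are purely illustrative:

```sh
# Illustrative only: run Galaxy's own conda so the env lands under its conda
# prefix, named with the __<package>@<version> convention the resolver checks.
/path/to/galaxy/tool_dependencies/_conda/bin/conda create -y \
    -n __samtools@1.15 -c conda-forge -c bioconda samtools=1.15
```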

Expected behavior
If nothing else, auto_install should be respected, or a new option should be added to control this behavior.
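For context, a minimal sketch of the conda resolver entry whose auto_install setting should be respected, assuming an XML dependency_resolvers_conf (attribute values are illustrative):

```xml
<!-- dependency_resolvers_conf.xml (excerpt): values illustrative -->
<dependency_resolvers>
    <!-- With auto_install="false", the expectation here is a resolution miss
         (and possibly a failed tool) rather than a per-job conda env. -->
    <conda auto_install="false" auto_init="true" />
</dependency_resolvers>
```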


mvdbeek commented Apr 12, 2022

> IMO we are much better off just dropping them entirely. If you have auto_install true, it should attempt to install a mulled env. If not, it should fail resolution and (potentially) fail the entire tool.

👍 I think that's reasonable

hexylena commented

Oh, thanks for the ping @natefoo, yeah this just bit me. It was very confusing.

natefoo added a commit to galaxyproject/usegalaxy-playbook that referenced this issue Apr 12, 2023:

    …tainers instead, due to galaxyproject/galaxy#13711. Also, tag destinations that can mount CVMFS.

cat-bro commented Feb 6, 2024

This has shown up on Galaxy Australia a few times, most recently with dbbuilder (galaxyproteomics/tools-galaxyp#739). I can't see what the use case would be for per-job conda environments. I'd prefer the jobs to fail so that we can sort out an appropriate environment/container for the tool.
