Alternatives to updating conda environment on Job Execution #1752

@umesh-timalsina

Currently, on job execution (if dependencies are specified), we clone the base environment, which is just a series of copy and move operations, and then update the cloned environment with the new dependencies. A rough measurement (via console.time) shows that the update step takes the longest.

One alternative would be to check the environment file as follows:

  1. Check that the Python version is 3.7
  2. Check that only pip-installed dependencies are listed

If both checks pass, we could just install the dependencies using pip.
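
The two checks could be sketched roughly as follows. This is only an illustration under stated assumptions, not existing DeepForge code: `isPipOnly` is a hypothetical helper, `env` is assumed to be the environment file already parsed from YAML into a plain object, and treating a bare `pip` entry or a matching `python=3.7` pin as harmless is an assumption about what "only pip dependencies" should mean.

```javascript
// Hypothetical helper (not part of DeepForge): decides whether a parsed
// environment file can be handled by pip alone.
// `env` is assumed to be the already-parsed YAML, for example
// { dependencies: ['python=3.7', { pip: ['numpy'] }] }.
function isPipOnly(env, requiredPython = '3.7') {
    const deps = (env && Array.isArray(env.dependencies)) ? env.dependencies : [];
    let pipPackages = [];

    for (const dep of deps) {
        if (typeof dep === 'string') {
            // Conda specs look like "name", "name=version" or "name=version=build".
            const [name, version] = dep.trim().split(/\s*=+\s*/);

            // Assumption: a bare `pip` entry does not require the conda solver.
            if (name === 'pip') continue;

            // Check 1: a python pin is acceptable only if it matches the base version.
            if (name === 'python' && version !== undefined &&
                (version === requiredPython || version.startsWith(requiredPython + '.'))) {
                continue;
            }

            // Any other conda dependency means we must fall back to conda.
            return {pipOnly: false, pipPackages: []};
        }

        // Check 2: the only non-string entry allowed is the `pip:` list.
        if (dep && Array.isArray(dep.pip)) {
            pipPackages = pipPackages.concat(dep.pip);
        } else {
            return {pipOnly: false, pipPackages: []};
        }
    }

    return {pipOnly: pipPackages.length > 0, pipPackages};
}
```

The sketch is deliberately conservative: anything it does not recognize (an unpinned `python`, a range such as `python>=3.6`, or any other conda package) yields `pipOnly: false`, so the existing conda update path would still be used.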

The benchmarks below show that installing numpy and pandas with pip (about 4 s) is significantly faster than waiting for conda to resolve the environment (about 20 s). An environment file that qualifies would look like this:

dependencies:
  - pip:
    - numpy

Example:

(base) umesh@isisdell:~$ time conda run -n deepforge-copy pip install numpy pandas
real	0m4.185s
user	0m2.503s
sys	0m0.348s
(base) umesh@isisdell:~$ time conda run -n deepforge-copy pip uninstall numpy --yes
real	0m0.802s
user	0m0.677s
sys	0m0.125s
(base) umesh@isisdell:~$ time conda env update -n deepforge-copy  --file update-file.yml
real	0m19.691s
user	0m17.325s
sys	0m1.232s
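
As a rough sketch of how the choice between the two commands timed above might be made, the following builds the argument list for either path. `buildUpdateCommand` is a hypothetical name, and the idea of passing in the list of pip packages (empty when the environment file does not qualify) is an assumption for illustration; only the two command lines themselves come from the benchmark.

```javascript
// Hypothetical helper (not part of DeepForge): returns the argv for the
// faster pip path when pip packages are given, otherwise for the regular
// conda update.
function buildUpdateCommand(envName, envFile, pipPackages) {
    if (pipPackages.length > 0) {
        // Mirrors: conda run -n <envName> pip install <packages...>
        return ['conda', 'run', '-n', envName, 'pip', 'install', ...pipPackages];
    }
    // Mirrors: conda env update -n <envName> --file <envFile>
    return ['conda', 'env', 'update', '-n', envName, '--file', envFile];
}
```

Returning an array rather than a single string means the result can be handed to a process-spawning call without shell quoting, and the call can be wrapped in `console.time` / `console.timeEnd` to repeat the measurement.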
