Is it safe to declare pip dependencies in a conda environment yml

Time:03-15

According to this blog post from Anaconda, it's not safe to use packages from both pip and conda simultaneously.

Unfortunately, issues can arise when conda and pip are used together to create an environment [...] One surefire method is to only use conda packages.

However, conda introduced a feature where pip dependencies can be specified in an environment YAML file.

Is this feature safe to use? Can it lead to broken environments? What should one do to ensure this doesn't happen?

CodePudding user response:

I don't think it's quite fair to say it's not safe to use them in combination; people do it all the time. You just have to be careful. The blog post is a little outdated (in my opinion, due for a refresh), but some of the important details have not changed. As long as you install and update from a conda environment file and prefer installing packages via conda rather than pip where possible, you should be fine. At least, this is what I've been doing for a year across several projects. The problems I run into with setup are always due to other issues in the ecosystem, not the pip: feature of conda environment files.

Is this feature safe to use?

Yes. I'm sure there are ways to botch it, and I'm not quite sure how you'd define "safe" here, but a configuration like the one below is commonly used in production.

name: my-dev-env
channels:
  - conda-forge
dependencies:
  - python
  - ...
  - some_other_conda_package
  - pip:
    - some_pypi_package

This does more or less what it would do without the pip: section, but calls pip at the end to install some_pypi_package.

Can it lead to broken environments?

Yes, but you'd have to try to do so. This bit from the blog post

Most of these issues stem from that fact that conda, like other package managers, has limited abilities to control packages it did not install. Running conda after pip has the potential to overwrite and potentially break packages installed via pip. Similarly, pip may upgrade or remove a package which a conda-installed package requires. In some cases these breakages are cosmetic, where a few files are present that should have been removed, but in other cases the environment may evolve into an unusable state.

still holds true, and it tends to happen if, after setting up an environment, conda install xyz and pip install xyz are called interchangeably and haphazardly. That is not the same thing as the pip: section in your environment YAML, and it is much more dangerous.
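To see how much of an existing environment came from pip, one option is to filter the output of conda list. This is a minimal sketch that assumes the usual four-column layout (name, version, build, channel), in which pip-installed packages show pypi in the last column. The helper name pypi_packages is hypothetical, not part of conda.

```shell
# Keep only rows of `conda list` output whose last column is "pypi",
# i.e. packages that were installed by pip rather than by conda.
# Header lines starting with "#" are skipped.
pypi_packages() {
  awk '!/^#/ && $NF == "pypi" { print $1, $2 }'
}

# Usage (needs conda and an existing environment named my-dev-env):
#   conda list --name my-dev-env | pypi_packages
```

If a package you expected conda to manage shows up in this list, that is a hint that an ad hoc pip install replaced or shadowed it.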

What should one do to ensure this doesn't happen?

In general, if you need to pull down something from PyPI but keep it safely managed in your conda environment:

  • Avoid manual conda install and pip install calls. Instead, update from your environment file, i.e. conda env update --file env.yaml. This will update your conda packages and re-install whatever you need from PyPI in a (relatively) safe way. The fact that you're asking about installing via pip from a conda environment file means you're already on the right track.
  • Only install from pip if something is only available on PyPI or for another reason can only be installed via pip. For example, I use it for its git feature to install development builds before conda packages are made, though I wouldn't recommend doing this much.
  • Try moving some packages to conda; the blog post mentions this a few times. conda-forge in particular also tries to make this easy.
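The workflow in the points above could be sketched roughly as follows. The package names are the placeholders from the example earlier in this answer, the commented-out git line uses a made-up URL, and the function names create_env and sync_env are hypothetical wrappers around real conda commands. Listing pip itself as a conda dependency is an addition here; conda warns when a pip: section is present without it.

```shell
# Write the environment file. Package names are placeholders, so this
# exact file will not resolve; substitute real packages before use.
cat > env.yaml <<'EOF'
name: my-dev-env
channels:
  - conda-forge
dependencies:
  - python
  - pip                      # let conda manage pip itself
  - some_other_conda_package
  - pip:
    - some_pypi_package
    # - git+https://example.com/org/repo.git@main   (hypothetical URL)
EOF

# First-time setup: build the environment from the file.
create_env() {
  conda env create --file env.yaml
}

# After every edit to env.yaml: re-sync from the file instead of running
# ad hoc `conda install` / `pip install`. --prune asks conda to remove
# packages that are no longer listed in the file.
sync_env() {
  conda env update --file env.yaml --prune
}
```

Keeping env.yaml as the single source of truth means the conda step always runs before the pip step, which is the ordering the blog post recommends.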