Thursday, September 24, 2015

Using a system installed conda to manage python environments

Using a system installed conda

Motivation

I want to provide a common SciPy stack across platforms, and possibly other Python environments.  Anaconda provides binary packages that can be installed into separate environments.  However, it is normally geared toward being installed and managed by an individual user, while I want to be able to manage the configurations centrally.  With conda 4.1.6, I can basically use upstream conda as is, with a couple of minor modifications.  I have a PR filed here with my changes.

Installation

At the root is the conda package manager.  I want to be able to install this from RPMs, so I created a conda COPR.  This provides common conda and conda-env RPMs for Fedora and EPEL7.  I've made conda-activate optional because it installs /usr/bin/{activate,deactivate}, which are very generic names, although it makes loading the environments much simpler.

Configuration

The system conda reads /usr/.condarc as its config file.  This is an unfortunate location, but that's what the current code looks for.  I'd like to change this to /etc/condarc in the future.  The COPR conda package ships with this default:
envs_dirs:
 - /opt/anaconda/envs
 - ~/conda/envs
pkgs_dirs:
 - /var/cache/conda/pkgs

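So we:
  • Look for environments in /opt/anaconda/envs first, which is where the system-wide environments are installed, and then in the user's ~/conda/envs.
  • Put downloaded packages into /var/cache/conda/pkgs.  This requires a patch to conda - https://github.com/conda/conda/pull/1637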

Locally, I also set channels to point to our InstantMirror cache by setting "channels" and "channel_alias".
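
As a rough sketch, the extra mirror settings in /usr/.condarc might look like the following.  The host name and paths are placeholders made up for illustration, not the actual mirror; only the "channels" and "channel_alias" keys come from the text above.

```yaml
# Hypothetical values: replace mirror.example.com and the paths with the local mirror.
channels:
  - http://mirror.example.com/conda/pkgs/free
channel_alias: http://mirror.example.com/conda/
```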

Ansible


Configure everything and install the basic scipy env in ansible:

- name: Configure conda repo
  template: src=orion-conda.repo.j2 dest=/etc/yum.repos.d/orion-conda.repo

- name: Install conda
  package: name={{ item }} state=present
  with_items:
   - conda
   - conda-activate 
   - conda-env

- name: Configure conda
  copy: src=condarc dest=/usr/.condarc 
 
Then I have a conda_env.yml task to install and manage the environments:
 
- stat: path=/opt/anaconda/envs/{{ env_name }}
  register: conda_env_dir

- name: Create conda {{ env_name }} env
  command: /usr/bin/conda create -y -n {{ env_name }} {{ env_req }} {{ env_pkgs | join(" ") }}
  when: not (conda_env_dir.stat.isdir is defined and conda_env_dir.stat.isdir)

- name: Update conda {{ env_name }} env
  conda: name={{ item }} extra_args="-n {{ env_name }}" state=latest
  with_items: "{{ env_pkgs }}"
  when: conda_env_dir.stat.isdir is defined and conda_env_dir.stat.isdir

- name: Install conda/{{ env_name }} module
  copy: src=modulefile dest=/etc/modulefiles/conda/{{ env_name }}

Which is called like:
 
- include: conda_env.yml env_name={{ conda_env }} env_req={{ conda_envs[conda_env] }} env_pkgs={{ conda_pkgs }}
  with_items: "{{ conda_envs }}"
  loop_control:
    loop_var: conda_env
  tags:
  - conda

With defaults/main.yml defining the environments:

conda_envs:
  scipy: python=2
  scipy3: python=3
  scipy34: python=3.4

conda_pkgs:
- astropy
- basemap
- ipython-notebook
- jupyter
- matplotlib
- netcdf4
- pandas
- scikit-learn
- scipy
- seaborn

This uses the Ansible conda module to manage the created conda environments.
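
To make the templating concrete, with the defaults above the create task for the scipy environment should render to roughly the following on a first run.  This is a hand-expanded sketch, not captured output:

```yaml
- name: Create conda scipy env
  command: /usr/bin/conda create -y -n scipy python=2 astropy basemap ipython-notebook jupyter matplotlib netcdf4 pandas scikit-learn scipy seaborn
```

On later runs the directory /opt/anaconda/envs/scipy already exists, so this task is skipped and the update task runs for each package instead.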
