Why go with Conda for Machine Learning Projects

Most programming languages have their own package and environment managers, for example:

Node.js: npm, pnpm, yarn

Ruby: RubyGems

PHP: Composer

Python: pip, conda (Anaconda), venv, pyenv

You need to understand these tools to work on almost any programming project. They install and track the code packages your project depends on, and they isolate each project’s packages so that conflicting version numbers in one project do not cause errors in another. They simplify the otherwise complex work of managing packages, dependencies, and environments. Locking in package version numbers gives you compatibility, consistency, and integrity, so your code behaves the same across different servers, deployments, and test runs. For example, a development team working on different parts of a codebase should use matching package versions to prevent errors that show up on one machine but not another.
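
To make the idea of matching version numbers concrete, here is a minimal sketch in Python that compares what is installed in the current environment against a set of pinned versions. The function name find_mismatches and the pins shown are assumptions made up for illustration; only importlib.metadata comes from the standard library.

```python
from importlib import metadata


def find_mismatches(pinned, lookup=metadata.version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical pins; a real project would read them from its own dependency file.
    pinned = {"numpy": "1.26.4", "pandas": "2.2.2"}
    for name, (expected, found) in find_mismatches(pinned).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If the script prints nothing, the environment matches the pins. Package managers do this kind of bookkeeping for you, which is why a shared pin file is usually enough to keep a team in sync.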

If you’re new to Python, AI, or machine learning, you will run into several overlapping tools: venv creates isolated environments, pyenv switches between Python versions, and conda manages both packages and environments.

Choose conda because it was built with data science in mind and supports multiple languages. Newer tools such as uv may eventually replace it, but conda is still very popular. Conda supports non-Python dependencies such as C++ libraries and cross-platform scientific applications. It also has a large online community to help you build more and resolve issues.

The downside of conda is that its installs are larger than venv’s because it pulls in more dependencies. That can be a problem for web development projects, where the extra dependencies may not be necessary. If you don’t plan on integrating AI or machine learning, stick with venv for its simplicity so you get quicker pipeline and container builds.

To install conda, choose an installer from the official Anaconda website or, on macOS with Homebrew, type:

brew install --cask anaconda

For clarification, conda is the package manager and Anaconda is the distribution. A distribution is a pre-configured collection of packages that can be installed together. A package manager is a tool that installs, removes, and updates code packages.

To get familiar with conda, memorize these commands, because you will need them in nearly every project you work on.

The 10 most used conda commands:

  1. conda create --name myenv
    Purpose: Create a new isolated environment named myenv.
    Example: conda create --name ml-env python=3.10
  2. conda activate myenv
    Purpose: Activate (switch to) a specific environment.
    Example: conda activate ml-env
  3. conda deactivate
    Purpose: Exit the current environment and return to the previously active one (usually base).
  4. conda install package-name
    Purpose: Install a package in the active environment.
    Example: conda install numpy
  5. conda update package-name
    Purpose: Update a specific package to the latest compatible version.
    Example: conda update pandas
  6. conda list
    Purpose: Show all installed packages in the current environment.
  7. conda remove package-name
    Purpose: Uninstall a package from the current environment.
    Example: conda remove matplotlib
  8. conda env list
    Purpose: Display all environments you’ve created (with their paths).
  9. conda info
    Purpose: Display system information and details about your Conda setup.
  10. conda env export > environment.yml
    Purpose: Export your environment to a YAML file for sharing or backup.
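
To show how these commands fit together in a project, here is a rough sketch in Python that assembles commands 1, 4, 6, and 10 into one ordered setup. The function name conda_workflow is made up for illustration, and the environment and package names are only examples. The sketch uses conda's --name option to target an environment without activating it, and --file in place of the shell redirect. It only prints the commands, so it is safe to run on a machine without conda.

```python
import shlex


def conda_workflow(env_name, python_version, packages):
    """Return the conda commands for a typical project setup, in order."""
    return [
        # Command 1: create an isolated environment with a pinned Python version.
        ["conda", "create", "--yes", "--name", env_name, f"python={python_version}"],
        # Command 4: install packages; --name targets the environment without activating it.
        ["conda", "install", "--yes", "--name", env_name, *packages],
        # Command 6: list what ended up installed.
        ["conda", "list", "--name", env_name],
        # Command 10: export the environment; --file replaces the shell redirect.
        ["conda", "env", "export", "--name", env_name, "--file", "environment.yml"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in conda_workflow("ml-env", "3.10", ["numpy", "pandas"]):
        print(shlex.join(command))
```

To actually execute the steps, each list could be passed to subprocess.run(command, check=True), which requires conda to be on your PATH. A teammate can then recreate the exported environment with conda env create -f environment.yml.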

I don’t have anything against venv either. If you prefer it over conda, it is more lightweight and easier to use for smaller Python-only projects.

The 6 most common venv commands:

  1. python -m venv myenv
    Purpose: Create a new virtual environment named myenv.
  2. source myenv/bin/activate (on macOS/Linux)
    Purpose: Activate the virtual environment so you can use it.
    Example (Windows): myenv\Scripts\activate
  3. deactivate
    Purpose: Exit the current virtual environment.
  4. pip install package-name
    Purpose: Install a package inside the active environment.
    Example: pip install requests
  5. pip list
    Purpose: Show all installed packages in the current environment.
  6. pip freeze
    Purpose: Output a list of installed packages and their exact versions (useful for saving dependencies).
    Example: pip freeze > requirements.txt
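
The first venv command can also be run from Python itself through the standard library's venv module. The sketch below is a minimal illustration, not a recommended setup: the helper name create_env is made up, the environment is created in a temporary folder that is deleted afterwards, and pip is skipped to keep the run fast (the command-line version installs pip by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment, like `python -m venv <env_dir>` but without pip."""
    venv.create(env_dir, with_pip=False)
    # The activate scripts live in Scripts on Windows and in bin elsewhere.
    scripts_dir = "Scripts" if sys.platform == "win32" else "bin"
    return Path(env_dir) / scripts_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scripts_path = create_env(Path(tmp) / "myenv")
        print("Activate scripts are in:", scripts_path)
        print("Contents:", sorted(p.name for p in scripts_path.iterdir()))
```

The folder it reports is the same one the activate command in step 2 points to, which is why the path differs between macOS/Linux and Windows.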
