Arjun Krishna Babu

Python. Machine Learning. Systems. Open source.

As part of a machine learning course I’ve been taking on Coursera, I had to get some packages installed.

Since I couldn’t find a one-stop webpage covering all the instructions, I had to go back and forth between multiple webpages. And then, after I had installed the whole thing, it took me a while to figure out how to run it.

And so, in this single post, I try to explain everything to you.

First up, I had to install the following packages:

  1. IPython Notebook
  2. GraphLab Create

GraphLab Create is not free software, but they provide a 1-year, renewable license for educational purposes. You first have to go to their webpage and register yourself.

Once you have registered, go to the official instructions page and follow the instructions!

There are two options for installation:

  • Installation into Anaconda Python Environment (recommended)
  • Installation in Python environment using virtualenv

After following the officially recommended path, you will have:

  1. Installed Anaconda, pip, GraphLab Create, and IPython Notebook.
  2. Created a new Conda environment called gl-env.

In case you’re wondering (like I did), rest assured that the Anaconda installation will not clash with your existing Python installation (the one that ships with most Linux distributions).

On their website there is an option to upgrade to a version that uses GPU acceleration. I haven’t tried that myself, but feel free to try it if you have a compatible GPU card.

Starting IPython Notebook to use GraphLab

The proper procedure for firing up the whole thing (in Linux) is:

  1. Open the terminal.
  2. cd to the directory where your IPython Notebooks are.
    Strictly speaking, this step is optional, but it is what you want to do in most cases.
  3. Activate the gl-env Conda environment which you created earlier (see below for a brief intro to Conda).
    $ source activate gl-env
  4. Start your IPython Notebook
    $ ipython notebook

And there you go! You’re all set!

Step 3 above is where many people go wrong: they simply skip it! IPython Notebook will start up fine without step 3, but Python will complain when you try to import the graphlab package:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-4b66ad388e97> in <module>()
----> 1 import graphlab

ImportError: No module named graphlab

This is because, if you’ve followed the official instructions, only the gl-env environment has the graphlab package installed.
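If you’re not sure whether the notebook was started from the right environment, you can check from a notebook cell before doing anything else. The snippet below is only a minimal sketch: `is_importable` is a hypothetical helper name (it is not part of GraphLab or IPython), and it simply tries the import and reports whether it worked.

```python
import importlib


def is_importable(module_name):
    """Return True if module_name can be imported by the running interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in the environment running this code.
        return False
    return True


if __name__ == "__main__":
    if is_importable("graphlab"):
        print("graphlab is available in this environment.")
    else:
        print("graphlab not found; did you activate gl-env before starting the notebook?")
```

If this reports that graphlab was not found, shut the notebook down, activate gl-env in the terminal, and start the notebook again.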

Brief Introduction to Conda

Conda, in simple terms, is a tool that lets you have multiple installations of Python on your computer at the same time without them interfering with each other; i.e., you can create different “environments” of Python, each with different packages.

Depending on your needs, you can set up different “sandboxed” environments with different packages installed in them, and even different versions of Python itself! And you can easily switch between the environments. A prime advantage of working this way is that you don’t have to touch the native Python installation on your OS (if it has one).
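Since each environment has its own interpreter, one way to see which one you’re actually in is to ask Python itself. The following is a rough sketch, not an official Conda recipe: `describe_environment` is a hypothetical helper, and it assumes Conda sets the `CONDA_DEFAULT_ENV` environment variable when an environment is activated (if the variable is absent, that entry is simply `None`).

```python
import os
import sys


def describe_environment():
    """Collect a few facts about the Python interpreter running this code."""
    return {
        # Full path of the interpreter binary; differs between environments.
        "executable": sys.executable,
        # Root directory of the installation the interpreter belongs to.
        "prefix": sys.prefix,
        # Interpreter version as major.minor.micro.
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Assumption: set by Conda on activation; None if not present.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

Running this once with gl-env activated and once without should show different interpreter paths, which is exactly why the import works in one case and fails in the other.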

To learn more about using Conda, check out the official documentation:

  1. Conda test drive.
  2. Managing environments in Conda.

Trust me, Conda makes your life so much easier.

Hope you’ve found this post helpful.

Useful links: