Alternative method from the Asimov Institute. This worked fine for me using an Anaconda install of Python on Mac OS X, except I had to remove the ‘--update’ flag when doing the installation with pip. Their version involves direct installation on OS X rather than installation on a virtual Linux machine via Docker, which is what I document below.
If you’re struggling to save your notebooks and retrieve them again, please see the update at the end of this post.
Google recently released a suite of Python tools for the technique du jour, deep learning, called Tensorflow. This was supplemented by a fantastic demo – which you can muck around with in your browser right now [screenshot above] – which inspired Zeb Kurth-Nelson and me to spend an evening setting up Tensorflow to work on my Mac. What follows is a brief account of our trials, tribulations, and solutions, in the hope that it is useful to anybody with a similar goal.
Some background
- Tensorflow is a set of packages for Python. The process of setting up Tensorflow consists of setting up an environment (see below for why this is an ‘environment’) which allows you to run Python and use these packages.
- Tensorflow is made to run on Linux. Using it on a Mac is not a huge issue, since Mac OS X is a Unix system and Linux is Unix-like, so the two have a lot in common.
- In order to emulate Linux, you have to use an application like Docker (relevant: useful summary of why Docker is invaluable for data scientists and those pursuing open science, from Kaggle).
- Docker provides a virtual Linux machine, on your machine, for you to run Tensorflow in.
- We started off following a rather charming tutorial titled Tensorflow for Poets, which appealed to our artistic sensibilities. However, this soon led us into a place of slithy toves, gyre and gimbling in the wabe. The mimsy borogoves caused us all sorts of trouble – namely, problems with pulling the relevant Git files from TF’s Github account.
- What follows is a hacked together pipeline chiefly inspired by TF’s official setup page and sprinkled with wisdom from Stack Overflow.
Ok, how do I get Tensorflow working on my Mac?
(my Mac being a 2014 Air, running 10.11.4)
- Go to Docker and install an instance according to their Getting Started page
- Open the Docker QuickStart Terminal – easiest way to do this is using Spotlight, hitting Cmd+Spacebar, typing ‘Docker Q’ and hitting enter:
- This opens up an instance of Terminal (Mac OS X’s command line console), but running inside your virtual machine! Pretty neat eh? By way of confirmation, you should see a delightful ASCII whale:
- To check your installation is ticketyboo, type
docker run hello-world
This should execute a little helper program which confirms that everything is working well. If it doesn’t work, you probably have to check out the Docker FAQs.
- Next, we’re going to install Tensorflow from the web. Type
docker run -it -p 8888:8888 b.gcr.io/tensorflow/tensorflow
This is going to launch plenty of humming and whirring, as various packages get downloaded. Once they’ve finished, you are (unexpectedly) booted into a Jupyter notebook, with an interface that looks like this:
What the hell is going on?
Jupyter is a Python notebook interface (the name of which you will have trouble persuading Mac OS X not to autocorrect). It allows you to run Python code from a web browser, with a nice interface that allows you to mingle runnable code with notes and outputs from that code:
BUT you won’t actually see this nice webpage yet, because we have a problem:
Jupyter works by Python routing information to your web browser. However, your Python is running on your virtual machine (which is a Linux machine, in Docker), and your web browser is on your actual machine (running Mac OS X). We thus need to do a bit of jiggling to allow your virtual machine to communicate with your physical machine’s web browser.
- The solution lies in Docker’s backend, an application called VirtualBox. Again, use Spotlight to hit Cmd + Spacebar, start typing ‘VirtualBox’, and hit Enter
- In VirtualBox you get a little summary of the different virtual machines you have installed via Docker. You will have one, called ‘Default’. Right click on this and select Settings.
- Now select Network on the resulting Settings pane. At the bottom is a button marked ‘Port Forwarding’. Hit this.
- This is the tab where we allow our virtual machine to connect to the outside world. Click the little green ‘Add a rule’ button, highlighted in the picture below, and then copy the details I’ve written in for the entry marked jupyter.
- We’re telling the computer to allow connections from the virtual machine’s port 8888 (the guest port) to my actual machine’s port 8888 (the host port).
- Hardly believing it might be this simple, go to your browser, and enter
localhost:8888
This sends your browser to port 8888 on your computer. And, lo and behold:
This is the interface for your Jupyter notebooks! Select one of them to get started (probably the one marked 1, hello_tensorflow), and you’ll see a real, live, Python notebook running Tensorflow:
As far as we’ve been able to figure out, this is a working instance. If you’re not familiar with Jupyter, you might want to check out their Get Started guide.
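If you want a quick sanity check of your own, a cell along these lines should confirm that the Tensorflow package is importable from the notebook. This is only a sketch: the helper name tensorflow_version is made up for illustration, and it assumes nothing beyond the standard library and the package's __version__ attribute.

```python
import importlib


def tensorflow_version():
    """Return the installed Tensorflow version string, or None if it cannot be imported."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None
    # Fall back to "unknown" if the attribute is somehow missing.
    return getattr(tf, "__version__", "unknown")


version = tensorflow_version()
if version is None:
    print("Tensorflow is not importable from this Python environment")
else:
    print("Tensorflow version:", version)
```

Run inside the container's notebook, this should print a version number; run in your ordinary Mac Python, it will most likely report that Tensorflow is missing, which is a handy reminder of which environment you are actually in.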
Helpful note from Julie Lee: If you don’t get the above notebook interface, it might be because your Jupyter notebooks aren’t being routed via port 8888. Try exiting your current Tensorflow instantiation (hit Ctrl-C twice) and running it again, making sure the -p 8888:8888 flag is present in the command:
docker run -it -p 8888:8888 b.gcr.io/tensorflow/tensorflow
Which should force TF to use port 8888 (thanks Julie!).
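If you'd rather test the connection without a browser, here is a minimal sketch that can be run with Python on the Mac itself (not inside the container). It simply tries to open a TCP connection to port 8888. The function name port_is_open and the two-second timeout are my own choices, not anything Docker or Jupyter prescribe.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if port_is_open():
    print("Something is listening on localhost:8888, try the browser again")
else:
    print("Nothing reachable on localhost:8888, check the -p flag and the port forwarding rule")
```

A True result only tells you that something is answering on that port, not that it is Jupyter, so treat it as a first diagnostic rather than proof.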
UPDATE 26/5/16:
The method posted below works, but it’s a bit clunky. It turns out that you can use the Jupyter upload functionality (top right hand corner of the Jupyter notebook interface) to get files into your TF installation, and then just download notebooks for sharing (within a notebook, go to File -> Download As). Much easier!
The bit about starting TF in a named container, and then reattaching that one rather than starting a new one for each session still holds though. So the command
docker run -it -d -p 8888:8888 --name NAMEOFCONTAINER b.gcr.io/tensorflow/tensorflow
Ought to do the job for the first setup, and then:
docker start NAMEOFCONTAINER
docker attach NAMEOFCONTAINER
To re-access the container on subsequent sessions.
One problem we’ve found with the above instantiation: every time you start Tensorflow, you create a new Docker container from scratch. This means that you lose any changes you made, for instance to the tutorial scripts. The reason for this is complicated, and basically relates to how Docker works, of which I have an imperfect understanding, but it involves these things called containers. The steps outlined above will create a new container every time you run them, which is not ideal. Luckily, Zeb has found a workaround:
- Create a local folder where you’d like to store files connected to Tensorflow
- Run this command with NAMEOFCONTAINER and PATH_OF_YOUR_FOLDER replaced by a name of your choice, and a path to the folder you created in step 1:
docker run -it -d -p 8888:8888 --name NAMEOFCONTAINER -v PATH_OF_YOUR_FOLDER:/localfiles b.gcr.io/tensorflow/tensorflow
So for me, it was:
docker run -it -d -p 8888:8888 --name archyTF2 -v /Users/archy/tensorflow:/localfiles b.gcr.io/tensorflow/tensorflow
3. This will do 3 things:
a) It starts your instance of Tensorflow running in a named container. The next time we want to run TF, we’re just going to relaunch this container.
b) It links a folder on your hard drive (the one indicated by PATH_OF_YOUR_FOLDER) to a folder in the Docker installation…
c) … in this case called localfiles. I think it also creates this folder (but I can’t quite remember – you might have to make it yourself using mkdir).
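Since that command has a lot of moving parts, here is a small sketch that assembles it from its pieces, which makes it easier to see what each flag is for. The function docker_run_command and its parameter names are hypothetical; only the flags and the image name come from the command above.

```python
import shlex


def docker_run_command(container_name, host_folder,
                       image="b.gcr.io/tensorflow/tensorflow",
                       mount_point="/localfiles", port=8888):
    """Build the argument list for the 'docker run' command described above."""
    return [
        "docker", "run", "-it", "-d",
        "-p", "{0}:{0}".format(port),                    # host port : container port
        "--name", container_name,                        # lets us restart this container later
        "-v", "{}:{}".format(host_folder, mount_point),  # host folder : folder in container
        image,
    ]


args = docker_run_command("archyTF2", "/Users/archy/tensorflow")
# Quote each argument so the line can be pasted into the Docker terminal.
print(" ".join(shlex.quote(a) for a in args))
# prints: docker run -it -d -p 8888:8888 --name archyTF2 -v /Users/archy/tensorflow:/localfiles b.gcr.io/tensorflow/tensorflow
```

One small benefit of building the line this way is that a folder path containing spaces gets quoted automatically, which is an easy thing to trip over when typing the command by hand.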
Now: if you head to localhost:8888, you see the Jupyter notebook interface again:
Now we’re going to create a new document, but it won’t be a notebook, it’ll be a terminal:
Once inside, we can use a unix command (pwd) to figure out where precisely we are. It turns out we are in a folder called notebooks.
Use cd .. to get back into the root folder, and ls to display the contents.
Hopefully there’ll be a folder called localfiles. If everything has gone to plan, this folder will mirror the contents of the folder that you indicated with PATH_OF_YOUR_FOLDER above. You can see that in my case this has worked, and a file I’ve created called test file_isithere?.txt is indeed present.
Now create a new file, using the command touch followed by the name of a file (here called madeinsidetensorflow.txt).
Hopefully, and happily in this case, it should now be present in the local folder you’ve set up as the path:
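The same check can be done from a notebook cell instead of the terminal. The sketch below assumes the folder was mounted at /localfiles as in the command above. The helper touch_and_list is a made-up name that mimics touch followed by ls.

```python
import os


def touch_and_list(folder, filename):
    """Create an empty file in folder (like touch) and return the folder's sorted contents."""
    path = os.path.join(folder, filename)
    with open(path, "a"):
        pass  # append mode creates the file without altering any existing content
    return sorted(os.listdir(folder))


mount_point = "/localfiles"  # the container-side folder named in the -v flag
if os.path.isdir(mount_point):
    print(touch_and_list(mount_point, "madeinsidetensorflow.txt"))
else:
    print(mount_point, "does not exist, so the folder was probably not mounted")
```

If the mount is working, the new file should show up both in the printed list and in the folder on your Mac.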
So now we can save stuff that we do in Tensorflow to local storage (i.e. into your normal file system). Note that the next time we start TF, we don’t run the whole command again. Instead we do
docker start NAMEOFCONTAINER
docker attach NAMEOFCONTAINER
And Bob’s your uncle, we’ve rebooted the container we had last time, along with its nifty local file storage solution.
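If you can never remember whether a given container already exists, the decision between the full run command and start plus attach can be sketched as below. The function names and the first_run_command parameter are hypothetical. The listing relies on docker ps -a with its --format option, and the script assumes it is run somewhere the docker command is available (for instance the Docker QuickStart Terminal).

```python
import shutil
import subprocess


def list_container_names():
    """Return the names of all containers, running or stopped, known to Docker."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def commands_for_session(name, existing_names, first_run_command):
    """Reuse the named container if it exists, otherwise fall back to the first-time command."""
    if name in existing_names:
        return [["docker", "start", name], ["docker", "attach", name]]
    return [first_run_command]


if shutil.which("docker"):
    try:
        names = list_container_names()
    except subprocess.CalledProcessError:
        names = []  # docker is installed but could not be reached from this shell
    first_run = ["docker", "run", "-it", "-d", "-p", "8888:8888",
                 "--name", "NAMEOFCONTAINER", "b.gcr.io/tensorflow/tensorflow"]
    for cmd in commands_for_session("NAMEOFCONTAINER", names, first_run):
        print(" ".join(cmd))
else:
    print("docker not found on PATH")
```

The script only prints the commands rather than running them, so you can look them over before pasting them into the terminal.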