I remember when I was first inspired to build a dedicated deep learning box.
I had just stumbled across Lukas Biewald’s post from the O’Reilly AI newsletter on how to “build a super fast deep learning machine for under $1000.” The thought of building a dedicated machine hadn’t even occurred to me. At the time, I didn’t know how to use TensorFlow and couldn’t properly explain to you the mathematics of backprop. But $1000 seemed like a reasonable and reachable budget for experimentation.
Shortly after, Jeremy Howard and Rachel Thomas of Fast.ai took a chance on me and offered me a spot in their Deep Learning Part II course.
This was a massive leap of faith on their part. The prerequisites for the course were Deep Learning Part I – which covered common CNN architectures like VGG, Inception, and ResNet, as well as word embeddings, RNNs, and basic NLP tasks – on top of a minimum of one year working in a coding-based position.
As an experience designer and product strategist, my professional work involves empathizing with users who are frustrated with technology and conducting IDEO-style “design jams” to make them less frustrated. My recent coding experience was limited to tweaking WordPress themes, cobbling together an iOS app that barely eked past Apple quality control, and shipping crappy Alexa Skills that inflict corny science jokes on unsuspecting Amazon Echo owners.
Oh, and I knew jack squat about deep learning.
Given this (lack of) background, you might understand why taking a “hardcore AI” class intimidated me. I desperately marathoned through all the Deep Learning Part I MOOC videos in a weekend and watched an inordinate number of Khan Academy lessons to remind myself of pesky math principles I’d long forgotten.
But if I can do it, you can do it too. Even before class started, Jeremy encouraged students “who had gotten this far in deep learning” to build their own servers and avoid forking over hundreds of dollars each month to AWS for their slow-ass P2 instances.
Thus began my epic journey to build Deep Confusion, a box which I named after my typical mental experience when I try to understand anything Geoffrey Hinton says.
Doing The Research
Luckily, many others have gone through the process and shared their wisdom in detailed articles. Here are the resources I based my choices on:
Lukas Biewald – Build a super fast deep learning machine for under $1,000
Tim Dettmers – A full hardware guide to deep learning
Roelof Pieters – Building a deep learning (dream) machine
Brendan Fortuner – Building your own deep learning box
Joseph Redmon – Hardware guide: neural networks on GPUs
My fellow Fast.ai students warned me that the hardest part is picking the parts. I didn’t believe them since there were plenty of detailed configuration lists online, but hardware moves so fast that you’ll want to conduct your own research before making any commitments. NVIDIA’s 1080 Ti was announced soon after many of my compatriots had already made their GPU choices, causing much buyer’s remorse.
Buying The Parts
So, for what it’s worth, here’s my final parts list. I started off with a single Titan X Pascal GPU gifted to me by NVIDIA (Thanks, Jen-Hsun!), but designed the computer to accommodate multiple GPUs in the future. By the time you read this, you’ll likely be able to find superior hardware configurations, so don’t neglect your research!
GPU – NVIDIA GeForce Titan X Pascal
(Warning: this Batmobile of a consumer GPU is so monolithic that it blocks multiple PCIe slots on my motherboard when installed. In theory, my mobo supports up to 4x GPUs. In reality, it can only accommodate a maximum of 2x Titan X Pascals.)
CPU Cooler – Cooler Master Hyper 212 EVO
Motherboard – MSI X99A SLI Plus
Memory – Corsair Vengeance 16GB (2 x 8GB)
SSD – Samsung 850 EVO 2TB
(I decided to splurge on a larger SSD since I plan to do vision work with ImageNet and liked the idea of fitting the entire dataset on one fast drive. 05/01/2017 UPDATE: I started downloading the full ImageNet dataset (1.2 TB!) and got an angry email from AT&T saying that I violated my 1 TB / month internet usage limit. If you didn’t realize consumer internet plans come with caps, now you do.)
HD – WD Red 3TB
Power Supply – EVGA 1000GQ 80+ Gold
Case – Rosewill Thor 2 ATX Full Tower
(This roomy case is also great for VR setups. Soon I plan to buy a Vive and make Deep Confusion a box for both deep learning and deep forgetting)
Making Fun Of Branding
Let’s take a break here to appreciate how comically masculine the names of gaming components are. TITAN X! THOR 2! VENGEANCE! To top it off, the onboard wifi adapter for my motherboard is called KILLER. Why a wifi adapter needs to sound homicidal is beyond me.
Building The Computer
Manly branding aside, getting all your packages in the mail is a very exciting affair.
You may be tempted, as I was, to start putting all your components directly in your case, but many Amazon reviews complain of parts being dead on arrival, especially motherboards. Put all your parts together on a table, connect your peripherals, and pray that your motherboard lights up without making any scary beeping noises.
Patience helps. For about 15 minutes, my monitor wouldn’t display the BIOS despite the Titan X being powered on. I eventually traced the problem to a power supply issue and fixed it.
Once you’ve sanity checked that your parts play nicely together, you can confidently install everything into your case and zip tie all the loose cords together.
Installing The Software
Ironically, the hard part begins after your hardware is set up. NVIDIA drivers seem to cause problems for many people, so I followed my classmate Brendan’s recommendation to just install the CUDA Toolkit 8.0, which comes prepackaged with GPU drivers.
Slight variations in package versions and software compatibility can mean that code that used to run perfectly on your AWS server throws errors on your box. In particular, I had a mix of legacy Python 2.7 and Theano code from Deep Learning Part I along with Python 3.5, TensorFlow, and PyTorch code from Part II.
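If you want a quick sanity check that the driver actually took, here is a minimal sketch. It assumes the `nvidia-smi` command line tool that ships with the NVIDIA driver is on your PATH; the function name `gpu_summary` is my own invention for illustration, not part of any toolkit.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the output of `nvidia-smi -L`, or None if the tool is missing or fails."""
    # nvidia-smi is installed alongside the NVIDIA driver. If it is not on
    # PATH, the driver is probably not installed (assumption: default setup).
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "-L"],  # -L lists the GPUs the driver can see
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    summary = gpu_summary()
    print(summary if summary else "No working NVIDIA driver found")
```

If this prints one line per card, the driver can see your GPU. If it prints the fallback message, sort out the driver before installing any deep learning libraries.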
Here’s the installation order I followed:
Anaconda provides an easy way to spin up multiple environments. I set up one Theano + Python 2.7 environment and another TensorFlow + PyTorch + Python 3.5 environment. If you’re proceeding through the Fast.ai MOOCs for deep learning, definitely check the “Making your own server” forum thread for the exact UNIX commands other students used to install the requisite libraries. The posts there saved me a ton of time.
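With two environments in play, it’s easy to lose track of what is installed where. Here is a rough sketch of a script you could run inside each activated environment to see which libraries it can import. The helper name `report_versions` is hypothetical, and the module names `theano`, `tensorflow`, and `torch` are my assumption about what you installed.

```python
import importlib
import sys


def report_versions(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing package raises ImportError; a broken install can
            # fail in other ways, so treat any failure as "not usable".
            versions[name] = None
            continue
        # Most libraries expose __version__, but it is a convention, not a rule.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    found = report_versions(["theano", "tensorflow", "torch"])
    for name, version in sorted(found.items()):
        print(name, version if version else "not installed")
```

The script sticks to the standard library, so it should behave the same in the Python 2.7 environment and the Python 3.5 one.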
Setting Up Remote Access
If you’d like to access your newly born GPU workstation when you’re out and about, you’ll need to set up secure remote access. Sravya Tirukkovalur, another Fast.ai classmate, has an excellent tutorial on how to set up secure SSH access keys so your security isn’t entirely dependent on your ability to generate and memorize complicated passwords.
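For a feel of what key-based access involves, here is a minimal sketch that only builds the two standard OpenSSH commands and prints them for you to run yourself. It is not a reproduction of Sravya’s tutorial. The user name, host name, and key path are placeholders I made up, and `ssh_key_commands` is a hypothetical helper.

```python
import os
import shlex


def ssh_key_commands(user, host, key_path="~/.ssh/id_ed25519"):
    """Build the commands that create a key pair and install its public half."""
    key_path = os.path.expanduser(key_path)
    # ssh-keygen writes the private key to key_path and the public key
    # to key_path + ".pub".
    keygen = ["ssh-keygen", "-t", "ed25519", "-f", key_path]
    # ssh-copy-id appends the public key to authorized_keys on the server.
    copy_id = ["ssh-copy-id", "-i", key_path + ".pub", "{}@{}".format(user, host)]
    return keygen, copy_id


if __name__ == "__main__":
    # "me" and "deepconfusion.local" are placeholders, not real settings.
    for command in ssh_key_commands("me", "deepconfusion.local"):
        print(" ".join(shlex.quote(part) for part in command))
```

Run the printed commands on your laptop, not on the server. After that, `ssh` should log you in with the key instead of prompting for the account password.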
Brendan also suggests using remote screen-sharing software like TeamViewer. It installed and worked seamlessly on both my MacBook and the Ubuntu workstation. Having GUI access definitely helps with tasks that would be onerous via the command line.
Sharing The Experience
Voilà! I managed to get my server (mostly) set up in a single day because of all the great resources others have laboriously compiled and shared with the community. The shared wisdom helped me map out known pitfalls in advance and minimize the need to Google endlessly and stalk AskUbuntu forums for answers.
Without Jeremy Howard and Rachel Thomas of Fast.ai, I’d be stuck learning about neural networks on YouTube. Without the timely help of the community, in particular hardware guru Frank Sharp, I might have fried my parts by accident during assembly.
So, share what you learn, even if you’re no expert. Your knowledge might prove essential for someone following right behind you, learning what you know now.
I hope this article is useful and entertaining for you as you embark on your own (mis)adventures in deep learning.