My Dive into Deep Learning

(Over-)Simplified view of a deep learning architecture for classifying documents

I spent last week in a Deep Learning workshop and came away both encouraged and disheartened. (For the uninitiated, Deep Learning has no formal definition; I would loosely define it as a new name for the class of machine learning techniques that combine many nonlinear functions to approximately model abstractions of data.) What impressed me during the week was the maturity of the tools and how quickly one can harness the power of the techniques, neural networks in particular. On the other hand, given how much the term is bandied about, I was shocked by how little is truly understood about these techniques. Nevertheless, the analytics community has enough understanding for businesses to start benefiting right now.
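
To make that loose definition a little more concrete, here is a minimal sketch of "combining nonlinear functions": two small layers, each a weighted sum followed by a nonlinearity, chained together. The weights, layer sizes, and function names are hypothetical values picked for illustration; nothing here comes from the workshop material.

```python
import math

def dense(x, weights, bias):
    """Weighted sum of the inputs plus a bias, one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """Nonlinearity: negative values are clipped to zero."""
    return [max(0.0, a) for a in v]

def softmax(v):
    """Turn raw scores into probabilities that sum to one."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hand-picked weights: 3 inputs -> 2 hidden units -> 2 classes.
W1 = [[0.5, -1.0, 0.25], [-0.75, 0.5, 1.0]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]

def forward(x):
    """Compose the layers: softmax(dense(relu(dense(x))))."""
    hidden = relu(dense(x, W1, b1))
    return softmax(dense(hidden, W2, b2))

probs = forward([1.0, 2.0, 3.0])
print(probs)
```

A real network learns its weights from data instead of having them typed in, and stacks far more units per layer, but the structure is the same chain of simple functions.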

80% Classification Accuracy in 10 Minutes

During the workshop, we built a neural network that in under 10 minutes could “learn” to identify the subject of random Reuters articles with better than 80% accuracy. Before you get too impressed, I will admit that the neural network and data manipulation for this task are included as an example in the Keras package (examples/reuters_mlp.py).
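
For readers who want a feel for what such a classifier does, the following is a minimal sketch and not the Keras example itself. It assumes a six-headline toy corpus (hypothetical, standing in for the Reuters data), a binary bag-of-words encoding, and a single tanh hidden layer trained by plain gradient descent in NumPy. Every name and setting below is an illustrative choice.

```python
import numpy as np

# Hypothetical toy corpus: label 0 = oil news, label 1 = grain news.
docs = [
    ("oil prices rise as crude supply falls", 0),
    ("crude oil exports fall", 0),
    ("opec cuts oil supply", 0),
    ("wheat harvest boosts grain exports", 1),
    ("grain prices rise after poor wheat harvest", 1),
    ("corn and wheat supply grows", 1),
]
texts = [text for text, _ in docs]
labels = np.array([label for _, label in docs])

vocab = sorted({word for text in texts for word in text.split()})
index = {word: i for i, word in enumerate(vocab)}

def bag_of_words(text):
    """Binary vector with a 1.0 for every vocabulary word in the text."""
    vec = np.zeros(len(vocab))
    for word in text.split():
        if word in index:
            vec[index[word]] = 1.0
    return vec

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, Y, n_hidden=8, lr=0.2, steps=500, seed=0):
    """One-hidden-layer network fitted with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    losses = []
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        P = softmax(H @ W2 + b2)            # class probabilities
        losses.append(float(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))))
        dlogits = (P - Y) / len(X)          # cross-entropy gradient at the output
        dH = (dlogits @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ dlogits)
        b2 -= lr * dlogits.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses

def predict(params, X):
    W1, b1, W2, b2 = params
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

X = np.stack([bag_of_words(text) for text in texts])
Y = np.eye(2)[labels]                       # one-hot targets
params, losses = train(X, Y)
accuracy = float((predict(params, X) == labels).mean())
print("first loss:", losses[0], "last loss:", losses[-1], "train accuracy:", accuracy)
```

The accuracy printed here is measured on the same six headlines the network was trained on, so it says nothing about generalization. The 80% figure above refers to held-out Reuters articles.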

Several tools were described, but we all seemed to flock toward Keras for building and manipulating our neural networks. Written in Python, it is a fast-prototyping library that sits on top of another Python library called Theano. We didn’t touch the Theano part because Keras seemed to have everything we needed.

The Keras Reuters example is representative of how far the technology has come. On Friday night, I decided to give it a go at home, and in one evening I built a virtual machine (VM) running Linux and gave it an artificially intelligent brain. The 10 minutes I mentioned above was the training time on that little VM on my iMac. Using a GPU, as we did in the workshop, it took about 50 seconds!

[Installing Keras wasn’t as easy as I hoped it would be, but it was doable once I worked out all the dependencies. After I get the kinks out, I’ll try to remember to post some instructions.]

Deep Learning? Yes. Deep Understanding? Kind of.

When I was working on my doctorate, our general thinking was that neural networks were taboo (no pun intended for those who got the reference) because they were so much like a black box. I have a pretty good understanding of neural networks and how they work. I could even code up the algorithm from scratch if I needed to, but the workshop confirmed that they are still black boxes to everyone.

Although dozens of people and teams are writing theoretical and practical deep learning papers and developing competing code bases for deep learning algorithms, there is still a black art around how to properly set the so-called hyper-parameters for these techniques. For example, nobody really understands when an extra hidden layer of “neurons” will help or hinder your system’s ability to learn. Nor does anyone know whether 512 neurons are necessarily better than 500 simply because 512 is a power of two.
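
One thing that can be checked with simple arithmetic is how little the two choices differ in size. The sketch below counts the trainable weights and biases of a network with one fully connected hidden layer. The input size of 1000 and the 46 output classes are assumptions chosen to resemble a Reuters-style setup; they are not figures from the workshop.

```python
def mlp_parameter_count(n_inputs, n_hidden, n_classes):
    """Weights plus biases for one hidden layer and one output layer."""
    hidden_layer = n_inputs * n_hidden + n_hidden
    output_layer = n_hidden * n_classes + n_classes
    return hidden_layer + output_layer

N_INPUTS = 1000   # assumed bag-of-words vocabulary size
N_CLASSES = 46    # assumed number of topics

counts = {n: mlp_parameter_count(N_INPUTS, n, N_CLASSES) for n in (500, 512)}
for n_hidden, total in counts.items():
    print(n_hidden, "hidden units:", total, "parameters")
```

Under these assumptions the 512-unit network has only about 2.4% more parameters than the 500-unit one, so any difference in accuracy would have to be found by experiment, not predicted from size.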

No Need to Fear Deep Learning … yet

If you listen to the headlines, machines are poised to take over the world. In 1997, IBM’s Deep Blue computer beat Garry Kasparov, the reigning world chess champion, in a 6-game match. Then in 2011, IBM’s Watson computer beat two human champions on Jeopardy! Now, Google’s DeepMind (technology Google acquired in 2014 and has built on since) has learned to play Atari video games. Last week, Chinese tech giant Baidu’s Minwa supercomputer reportedly surpassed humans (and all other computer competitors) in recognizing images.

Coincidentally, I watched Avengers: Age of Ultron twice the weekend before the workshop. Rest assured, though, that we’re still very far from evil sentient machines trying to obliterate mankind. In fact, in a February 2015 interview, Andrew Ng, Stanford professor, Coursera founder, Google Brain founder, and chief scientist at Baidu, said, “I don’t see any realistic path from the stuff we work on today—which is amazing and creating tons of value—but I don’t see any path for the software we write to turn evil” (emphasis added).

Beyond Games: Deep Learning for Business

For sure, Deep Learning comprises a powerful set of tools and the set is growing. In addition to playing games and improving search engine results, machine learning can help traditional businesses. Businesses can use simple computers to predict equipment failures in order to reduce maintenance costs and increase system reliability. Retailers can train computers to identify optimal pricing strategies. Marketers can use machine learning to predict how consumers will respond to different types of advertising. And similar approaches can be used to make sense of survey results from, for example, customer feedback.

During last week’s workshop, after the Reuters example, I built a neural network to classify a different set of documents and followed the classifier with another technique to cluster and visualize the documents. To an extent, I think some machine learning and artificial intelligence will always require a bit of magic and forays into the unknown, but it is clear that we have the tools and understanding to start putting what we know to use.
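
As a rough illustration of a cluster-and-visualize step, here is a minimal sketch that uses k-means for grouping and a two-dimensional PCA projection for plotting coordinates. Both techniques are stand-ins chosen for this example, and the document vectors are synthetic. The method and data I actually used in the workshop are not shown here.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto their top two principal components."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:2].T

def kmeans(X, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [X[0]]
    for _ in range(k - 1):
        # Distance from each point to its nearest already-chosen centroid.
        nearest = np.min([np.sum((X - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(X[nearest.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assignments = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assignments == j):
                centroids[j] = X[assignments == j].mean(axis=0)
    return assignments, centroids

# Synthetic "document vectors": two well-separated groups of ten documents.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.3, (10, 5)) + np.array([3.0, 0.0, 0.0, 0.0, 0.0])
group_b = rng.normal(0.0, 0.3, (10, 5)) + np.array([0.0, 3.0, 0.0, 0.0, 0.0])
doc_vectors = np.vstack([group_a, group_b])

assignments, _ = kmeans(doc_vectors, k=2)
coords = pca_2d(doc_vectors)   # one (x, y) pair per document
print(assignments)
print(coords.shape)
```

The coordinates could then be handed to any plotting library and colored by cluster assignment to get a picture of how the documents group together.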

The Thoughts of W Christian
