Note: if you’re not familiar with my writing, I should tell you that it often represents a bizarre stream of consciousness where I hop from stepping stone to stepping stone in a most unpredictable fashion and the other side of the stream bears little resemblance to the original bank. You’ve been warned.
About a year ago, a friend took me to see the film What the Bleep Do We Know. I’m afraid we annoyed many others watching the film because we couldn’t stop laughing at how ludicrous much of it was. For example, the film gave extensive coverage to “Dr.” Emoto insulting water and claiming this makes ice crystals ugly. He’s making a lot of money from books but, curiously, scientists have had trouble replicating his results.
This got me thinking about the excellent book AI Application Programming by M. Tim Jones. I had been busy reworking his example of evolutionary behavior in synthetic ecosystems and my “animals” were not evolving as expected. After much mucking about in my code, I realized that the “eyes” worked, the “brains” worked, but I had accidentally severed the “nerves” between the eyes and the brain. Much like the audience at the aforementioned awful movie, my creatures could think and could see, but they couldn’t think about what they saw.
I suppose I could dismiss the other audience members as stupid, but that’s not true. Many of my very intelligent friends have been quite impressed with that movie, and this suggests an interesting question. But first, let’s start a fight. Take several artificial intelligence researchers at random, put them in a room without food and water, and tell them they can’t come out until they have a conclusive definition of artificial intelligence they can all agree on. After there is only one researcher left alive, all you’ll have is an unsatisfactory definition and an annoyed researcher.
The problem, and the question I alluded to, is that we can’t define what “artificial” intelligence is until we can define what “intelligence” is. The latter raises annoying questions about consciousness, the “soul”, and other ideas which not only make many squirm, but may ultimately prove to be unanswerable. The computer scientist Jaron Lanier has posed disturbing thought experiments about the nature of intelligence and rejects the idea of machine intelligence. Given that I’m not a computer scientist, though, I take a utilitarian approach. If it looks like a duck and quacks like a duck, duck à l’orange is on the menu.
In other words, I’m not as concerned with whether or not I can answer the unanswerable question of intelligence (I’m quite comfortable saying “I don’t know”) so much as I am concerned with taking advantage of the flexibility of AI systems. This flexibility, not surprisingly, is one we often associate with intelligence.
So your boss comes to you with a problem. You need to project sales for your lemonade stands. As it turns out, for a well-run lemonade stand, sales projections can be very reliable, but only if we understand the variables involved. You do your research and notice two things:
- Sales go up on hot days
- Sales go down on weekdays
After a bit of playing around, you come up with the following:
```
sales = (temp / 100) * 500
sales = sales * 1.5 if weekend
```
That says that at 75 degrees (Fahrenheit, duh), sales will be 375, except on weekends, when sales should be 562.5. Of course, this isn’t exact and you understand there will be some fluctuation, but such projections allow you to predict how much labor you can spend, how much product you will order, and so on.
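If you want to play with that model, here’s a quick sketch in Python. The formula and constants come straight from above; the function name and the rest of the scaffolding are my own:

```python
def project_sales(temp, is_weekend):
    """Project lemonade sales from the temperature (Fahrenheit)."""
    sales = (temp / 100) * 500
    if is_weekend:
        sales *= 1.5
    return sales

print(project_sales(75, False))  # 375.0
print(project_sales(75, True))   # 562.5
```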
So you start projecting sales at a second lemonade stand and discover that your sales projection model fails miserably. A bit of research reveals that the first stand is in a park and the second stand is in an air-conditioned lobby. For the second, you decide that the temperature is always 65 degrees, but your system still fails. Now you have regular customers and a different traffic flow. Further, you may have such bad sales in the lobby on the weekend (most offices are closed) that you can’t profitably run the stand. After reanalyzing sales, you discover that people still buy more lemonade on hot days, but the impact isn’t as strong, so your projection becomes:
```
sales = 0
if ("outdoors" == location)
    sales = (temp / 100) * 500
    sales = sales * 1.5 if weekend
else
    sales = temp * 1.2 + 600 if weekday
```
That’s beginning to look ugly, but it works. For a while. (We’ll call this the “brute force” approach.) After a bit, you notice sales steadily rising at the lobby but not at the park. Why? Because you have regular customers. So maybe you should consider how long a stand has been open. Further, regular customers often like regular employees, so the longer an employee has worked there, the better they pull in regular customers. You also discover that female employees in the park pull in better sales on hot days than male employees because they’re wearing less clothing (sexist, perhaps, but I’ve seen it happen at the coffee carts I used to run). On top of that, you want to open new locations. How can you possibly factor all of these things in? Code which follows the above approach of hard-coding everything on a “per-location” basis has the advantage of making assumptions explicit, but it has the disadvantage of not being flexible or maintainable.
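To make that concrete, here’s roughly what the “brute force” model looks like in Python once both locations are hard-coded. This is a sketch built from the formulas above; imagine every new variable (tenure, gender, regular customers) adding yet another branch per location:

```python
def project_sales(location, temp, is_weekend):
    """Brute-force projection: one hard-coded branch per location."""
    if location == "outdoors":  # the stand in the park
        sales = (temp / 100) * 500
        if is_weekend:
            sales *= 1.5
    else:  # the air-conditioned lobby
        if is_weekend:
            sales = 0  # the offices are closed, so nobody's buying
        else:
            sales = temp * 1.2 + 600
    return sales
```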
To make matters even worse, there are often hidden correlations which you may not recognize. I mentioned that at the lemonade stand in the lobby, newer employees pull in less revenue than experienced employees. However, this effect is less noticeable for personable female employees. This is sad, but true. The gender of an employee, the length of time they’ve worked, and the location they work at can all have subtle interactions which affect sales. As you add more locations, it’s not reasonable to assume that the programmer can learn all these variables and properly account for how they will interact.
This exposes two problems:
- Incomplete information
- Unknown relationships
In the real world, the more complicated the decision, the more likely it is that these problems will occur. This is the norm, but our programming habits assume otherwise. We get division-by-zero errors, or the program dies when the employee number is not supplied, but we know our lemonade stand will make money even if we don’t know what the temperature is for a given day.
That’s where artificial intelligence can step in. One way of dealing with problems like this is to use a neural network. One of the most common types of such networks is a “feed forward, back error propagation” network, often known simply as a backprop network. In such a network, you have “neurons” connected by “synapses”. You feed in all your variables and compare the predicted and actual results. Then you walk backwards through the network and adjust the weights of the synapses to correct for the error. With enough data and training, you can start to get reliable answers (this is tougher than it sounds).
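Here’s a minimal backprop network in Python showing the feed-forward and walk-backwards steps in miniature. To be clear, this is a toy sketch: the layer sizes, the learning rate of 0.5, and the XOR training data are my own choices for illustration, not anything from a real sales model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Training data: XOR, a classic toy problem a single neuron can't solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# "Synapses": 2 inputs -> 4 hidden "neurons" -> 1 output, plus biases.
w1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
w2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

for _ in range(10_000):
    # Feed forward: the inputs flow through the synapses to an output.
    hidden = sigmoid(X @ w1 + b1)
    output = sigmoid(hidden @ w2 + b2)

    # Back-propagation: walk backwards, nudging each synapse's weight
    # in proportion to its share of the error.
    d_output = (y - output) * output * (1 - output)
    d_hidden = (d_output @ w2.T) * hidden * (1 - hidden)
    w2 += 0.5 * hidden.T @ d_output
    b2 += 0.5 * d_output.sum(axis=0)
    w1 += 0.5 * X.T @ d_hidden
    b1 += 0.5 * d_hidden.sum(axis=0)

print(output.round(2))  # should end up close to [[0], [1], [1], [0]]
```

The same structure scales to the lemonade problem: the inputs become temperature, day of week, location, and so on, and the output becomes projected sales.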
Now consider how intelligent you are. If you know your business well enough, you can often start to make reasonable sales predictions even if you don’t know which employees are working on a given day. You might not know the weather forecast or how many people are going to be in the park on a given day, but you can still make some sales projection. Unlike the “brute force” approach, you can often make reasonable projections in the absence of data. What’s astonishing about neural networks is that they can do the same thing. Don’t know the temperature? The neural network will still give an answer and if the other variables supplied (such as the date) have a heavy bearing on the results, you can still get a pretty darned good answer.
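How does that work mechanically? One common approach (a sketch; the input list and the average values below are invented for illustration) is to substitute a neutral value, such as the variable’s average over the training data, for anything you don’t know, and let the remaining inputs carry the prediction:

```python
import numpy as np

def fill_missing(inputs, training_means):
    """Replace unknown values (None) with that variable's average from
    the training data, so the trained network can still run."""
    return np.array([mean if value is None else value
                     for value, mean in zip(inputs, training_means)])

# Suppose the network's inputs are [temperature, is_weekend, day_of_year]
# and these were the averages seen during training (made-up numbers):
means = [72.0, 0.28, 180.0]
print(fill_missing([None, 1, 200], means))  # temperature filled in as 72.0
```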
Neural networks can sometimes seem spooky in their performance. When designed properly, they learn well and they can detect relationships in data that trained observers will often miss. In fact, there is even software which can examine the internal structure of a neural network to help people learn what those relationships are. If you’re trying to predict movie revenue and you don’t know that G-rated movies play better in small towns, there’s a good chance your network will find that if it has the movie rating, locations, and population information available.
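I haven’t used that kind of inspection software here, but a simpler cousin of the idea, permutation importance, is easy to sketch: shuffle one input column of a trained model and see how much its predictions degrade. Everything below (the toy data, the stand-in model) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def permutation_importance(predict, X, y, column):
    """How much worse the mean squared error gets when we destroy
    the information in one input column by shuffling it."""
    base_error = np.mean((predict(X) - y) ** 2)
    X_shuffled = X.copy()
    rng.shuffle(X_shuffled[:, column])
    return np.mean((predict(X_shuffled) - y) ** 2) - base_error

# Toy check: a "model" that only ever looks at column 0. Shuffling
# column 0 should hurt badly; shuffling columns 1 and 2 shouldn't.
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0]

def predict(data):
    return 2 * data[:, 0]

print([round(permutation_importance(predict, X, y, c), 3) for c in range(3)])
```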
It would be easy to dismiss neural networks as “not being intelligent” and in my gut, I suppose they’re not. However, since I cheerfully questioned the intelligence of my fellow movie-goers at “What the Bleep Do We Know”, I’m not sure I should be so quick to jump to conclusions. The number of neurons in our own brains is many orders of magnitude larger than the number in the largest neural network ever built. What would happen if we could ever design neural nets that large? Perhaps some day in the future a robot with a sufficiently advanced neural network will clap happily at that movie and scowl at me and my friend for laughing. I’d probably deserve it.