But first, a brief history of this steady retreat:
1600: Some machines are stronger than humans, but no machine can perform numerical computations.
1623: Wilhelm Schickard builds the first automatic calculator.
1900: Machines perform many mathematical operations, but can’t perform complex series of computations where the outcome of one computation determines what computation to perform next.
1941: Machines perform complex computations under the control of algorithms developed by humans.
1950: Computers handle many “boring” tasks, but can’t do anything as “obviously intelligent” as playing chess.
1962: The first programs capable of playing chess at an acceptable level appear.
1997: Human world chess champion Garry Kasparov loses a chess match to the Deep Blue computer.
2007: Computers are better at chess than just about any human. Since chess has long been respected as a game requiring the highest human wisdom, by what right can we claim to be more intelligent than these machines?
It is interesting to note that there are still many things machines perform only at a very low level, if at all. They cannot be trusted to understand unconstrained “natural language” (the term computer science uses for what everybody else simply calls “language” – the languages used in human communication). We can’t ask machines to view a simple video, describe what happened in it and explain why. Robots can’t even perform the simplest household chores, such as washing the dishes! All of these examples show that we don’t sufficiently appreciate even the simplest tasks, those that we can all perform. This may be the single most important lesson that Artificial Intelligence has taught us over the past half-century: it is astoundingly difficult to perform everyday tasks, because they require getting along in the real world, with all of its “messiness”, subtle gradations of attributes and meanings, uncertainties and complexities. The game of chess has none of these characteristics; the world of chess is well-defined, with rules that allow no shades of meaning. Your bishop is either on square A6 or on A7 – it can’t be on the line between them. Given that it’s on A6, some moves are allowed and some aren’t – no moves are “OK, but frowned upon”, or “OK, unless we later uncover some new facts”.
Being Better at Those Tasks We Want to Be Better At
And yet, isn’t it somewhat humiliating to resort to household tasks when asserting our superior intelligence over machines? After all, at least some of the problems used in standard IQ tests are quite easy for computers. We won’t receive a high score on an IQ test simply because we have the ability to walk around a room and tidy it up, for example. This is because the designers of IQ tests, just like most other people, don’t regard the performance of such common chores – let alone walking without bumping into the furniture – as evidence of intelligence.
Some readers will probably feel that there’s no need to look so hard for evidence of our superiority, or to be troubled by what we’re taught by the evidence we’ve found so far. Even if this evidence indicates that machines are better at what we used to consider the highest achievements of human intelligence, while we’re better at tasks which we would have preferred to hand over to machines, what of it? It’s self-evident that we can think while they can’t, and anything that doesn’t think is obviously unintelligent. Isn’t it?
This glib answer is problematic: thinking is a process, and this process is largely intangible and immeasurable. Let’s focus on end-products of the process of thinking – whether these are the solution of a mathematics problem, moving a piece in chess, or navigating around the table to reach the armchair. What end-products can we highlight as being both uniquely human and highly impressive?
A better answer would be the following: Humans designed and created machines, whereas machines can’t design and create humans. More generally, we could claim that computers will never be creative. At last, we have found the citadel that no amount of brute-force computing power can breach. Or have we?
Detecting Creativity in Art
It’s hard to imagine how art, poetry, science, technology, advertising (or many other human activities) could exist without creativity. And yet, it is notoriously difficult to define creativity, measure it, and reach an agreement on whether it has indeed occurred. How can we distinguish a new creation from an imitation of an earlier creation, or from an incoherent jumble of odds and ends that might be new but still can’t be considered creative?
Some critics may praise a work of art as ground-breaking while other critics dismiss it as repetitive or meaningless. Paintings by some apes and elephants have sparked controversy when a few highly-respected critics accepted them as genuine art. In the fifties, Picasso and Miró appreciated paintings by a chimpanzee named Congo so much that they acquired some of these paintings. Recently, three of Congo’s paintings were sold for £14,000. Computer-generated paintings have also been judged by some critics as “art”.
Computer-written poetry has been around since the early days of computers. Initially, computers strung together randomly-chosen nouns, verbs, and adjectives in a grammatically-acceptable sequence. The resulting text seemed surreal, but was sometimes oddly meaningful and even moving. Raymond Kurzweil, the well-known inventor, futurist and AI pioneer, went further in creating his “cybernetic poet”. This software reads poetry supplied to it, and uses it to generate a “language model” which, among other things, describes the selection of words, the meters used, and the text structures. This model captures some of the elements of the “style” of the poet whose poetry is used as input. If poems by several different poets are used, a new style emerges, which is a hybrid of the original poets’ styles.
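Kurzweil has not published the internals of his software, but the general idea of a “language model” built from word statistics can be sketched with a toy word-level Markov model. The function names and the tiny two-line corpus below are my own inventions, for illustration only – the real Cybernetic Poet is far more sophisticated.

```python
import random
from collections import defaultdict

def build_model(corpus_lines, order=2):
    """Map each sequence of `order` words to the words observed to follow it."""
    model = defaultdict(list)
    for line in corpus_lines:
        words = line.split()
        for i in range(len(words) - order):
            model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, order=2, max_words=8, seed=0):
    """Walk the chain from a random starting key; stop at a dead end."""
    rng = random.Random(seed)
    out = list(rng.choice(sorted(model)))   # random starting word pair
    while len(out) < max_words:
        followers = model.get(tuple(out[-order:]))
        if not followers:
            break                           # no known continuation
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = [
    "the rose is red and the night is long",
    "the night is dark and the rose is sweet",
]
line = generate(build_model(corpus))
```

Feeding such a model poems by two different poets automatically yields the “hybrid style” the column describes: word sequences from both sources get spliced wherever they overlap.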
Kurzweil reports the results of an experiment in which he supplied 16 people, who played the role of judges, with 28 stanzas of poetry, ranging in length from one to twelve lines. Some of the stanzas were written by human authors (three famous poets and one obscure poet – Raymond Kurzweil). Other stanzas were created by the software, according to “language models” that it had developed based on works by the same poets. When asked whether each stanza was written by a human or by a computer, the judges were right about 60% of the time. While this is better than a random “50-50” decision, it does not demonstrate an easily detectable difference between computer-written poetry and human poetry. Kurzweil makes it clear that the judges’ low accuracy has to do with the use of fragments instead of complete poems: “A more difficult problem than writing stanzas of poetry is writing complete poems that make thematic, syntactic, and poetic sense across multiple stanzas. A future version of the Kurzweil Cybernetic Poet is contemplated that attempts this more difficult task. To be successful, the models created by the Cybernetic Poet will require a richer understanding of the syntactic and poetic function of each word”.
Another problem with the Cybernetic Poet is that its “art” can be considered wholly derivative, resulting from imitation instead of true creativity. While this is undoubtedly true, it is also true that no human poet writes in a vacuum. A poet is always influenced by earlier poets, by literature, by music, and even by the evening news. Perhaps, when we mix enough poetry and other input into the “poetry machine”, it will become difficult to determine which poet it is imitating. Would we then consider this mechanical poet as having acquired a voice and a personality of its own?
It turns out that the judges’ accuracy was not affected by their level of familiarity with artificial intelligence, nor was it dependent on their familiarity with poetry. Would it be possible to write software that would classify poetry as human-originated or machine-originated, surpassing human performance on this task? If so, one possible conclusion is that critical appreciation (at least in the narrow sense of identifying a poem’s origin) is more machine-like than we ever suspected – though some authors will say they knew it all along…
Poetry is not the only art form machines have dared to invade. Since early in computer history, computers have been creating music and paintings. In music, this might be related to some experiments by early-twentieth-century composers who used random numbers as input for the composition process. In literature, a complete novel was automatically written by a computer, in the style of Jacqueline Susann. According to the programmer, this was made possible by utilizing the formulaic character of Susann’s authentic works. In all of these fields, there were also works which were jointly authored by humans and machines. The human artist sets parameters for the software’s work and selects the most promising results out of the machine’s output. One example of a joint human-machine venture is the Kandid project, which uses the human artist’s selections as a guideline in the evolution of the art generated by a computer. Kurzweil’s Cybernetic Poet can also be used this way. Joint projects of this kind blur the boundaries: if we agree that a product of such cooperation is indeed a work of art, then whom should we designate as the artist? The human user, who guided the creation? The computer software that conducted the “hands-on” work? The programmer who created the software?
Creativity in Math and Science
In art, it is impossible to define an objective test for creativity and innovation. In math and science, it’s much easier. A new invention may, in many cases, be subject to patent examination, which has been accepted as an objective (though fallible) test for its “newness”. A new scientific paper is subject to anonymous peer review, so that those papers which add to human knowledge are selected. If a new mathematical proof has been found, especially in an area where mathematicians have previously focused their efforts, it’s hard to dismiss it as boring or non-innovative. Yet this is exactly what happened when Marvin Minsky started working on automated mathematical proof back in 1956. Minsky designed a way to represent the basics of plane geometry so that they could be handled by computers – first using manual simulation, due to the technological limitations of those times, and later using real computers. The first success came much earlier than expected, when Minsky set the computer the task of proving the fifth proposition (the first “real theorem”) in the first book of Euclid’s “Elements”, the founding work on plane geometry, written around 300 BC. This proposition states that the angles at the base of an isosceles triangle are equal. The proof found by the computer (see appendix) was much shorter than the proofs traditionally taught in schools, though it is arguably more difficult to grasp. When Minsky showed this proof to expert mathematicians, he was told that the computer was not the first to discover the proof, as it had already been published by Pappus of Alexandria in the 4th century AD. Still, as this proof was not known to Minsky, who else could be its author if not the computer? To phrase this question differently: if a high-school student had come up independently with this proof as an alternative to the standard classroom proof, shouldn’t we consider this evidence of mathematical skills and originality?
If so, why deny the same recognition when the author is a computer?
Since Minsky’s groundbreaking work, it has become quite common to delegate some parts of theorem-proving to computers. The credit for any new “discovery” is rightly given to the human mathematician, who sets the problems and guides the software towards likely directions in the search for the proof, but many sub-problems are solved independently by the computer. The most famous case of computer-aided proof is probably the proof of the four-color map theorem, which states that it is possible to color any map (more formally, any division of the plane into connected regions) using only four colors in such a way that no two regions with the same color share a border. This problem was first proposed in 1852. Though many mathematicians applied their efforts to finding a proof or discovering a counter-example, the proof was only produced in 1976 by Kenneth Appel and Wolfgang Haken at the University of Illinois. Their proof reduced the infinity of possible maps to about 1,500 cases, which were then checked by computer. Unlike Minsky’s work and later automated theorem provers, this checking itself did not involve anything we’d identify as steps in a formal proof. Instead, it amounted to examining the possible colorings for each of the 1,500 cases. Appel and Haken’s proof was an important contribution to mathematics and a pioneering step in the use of computers in mathematics. However, unlike Minsky’s work, it can’t be considered a case where the computer played a central role in generating the central ideas of the proof. In some other areas of mathematics that until a couple of decades ago required high skill, computers have largely displaced humans. One such field is symbolic integration in calculus. Integration often requires applying several tools in the right sequence – substitution, integration by parts, rewriting parts of the formula, etc.
In many cases, integration presented a serious intellectual challenge, but now, software has surpassed most humans on this task.
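To get a feel for the kind of exhaustive case-checking that figured in the four-color proof, here is a minimal sketch – emphatically not Appel and Haken’s actual procedure, which analyzed “reducible configurations” rather than raw maps. It is a brute-force backtracking search that four-colors the adjacency graph of a small, made-up map:

```python
def four_color(adjacency):
    """Backtracking search for a 4-coloring of a graph given as
    {region: set_of_neighboring_regions}. Returns {region: color} or None."""
    nodes = sorted(adjacency)
    coloring = {}

    def extend(i):
        if i == len(nodes):
            return True                     # every region colored
        node = nodes[i]
        for color in range(4):
            # try a color that no already-colored neighbor uses
            if all(coloring.get(nb) != color for nb in adjacency[node]):
                coloring[node] = color
                if extend(i + 1):
                    return True
                del coloring[node]          # backtrack
        return False

    return coloring if extend(0) else None

# Toy map: regions A, B, C, D are mutually adjacent; E touches only D
regions = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
coloring = four_color(regions)
```

The search itself involves nothing resembling a step in a formal proof – it just tries colors until everything fits, which is precisely the “mechanical” character the column is pointing at.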
Computers have also come into their own in creating new inventions. One source of “automated invention” is the technique of genetic programming. One website devoted to this technique says: “There are now 36 instances where genetic programming has automatically produced a result that is competitive with human performance, including 15 instances where genetic programming has created an entity that either infringes or duplicates the functionality of a previously patented 20th-century invention, 6 instances where genetic programming has done the same with respect to a 21st-century invention, and 2 instances where genetic programming has created a patentable new invention”.
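The quoted results come from full-scale genetic programming systems that evolve entire circuit or program designs. As a deliberately minimal illustration of the underlying idea – random variation plus selection acting on expression trees – the sketch below evolves a small arithmetic expression to match a target function. Every name and parameter here is my own choice, not taken from any actual GP system.

```python
import random

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def random_expr(rng, depth=2):
    """A random expression tree over x, small constants, + and *."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(-2, 2)])
    return (rng.choice(list(OPS)),
            random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(expr, target):
    """Sum of squared errors over a few sample points (lower is better)."""
    return sum((evaluate(expr, x) - target(x)) ** 2 for x in range(-3, 4))

def mutate(expr, rng):
    """Replace the whole tree, or recurse into one branch."""
    if rng.random() < 0.5 or not isinstance(expr, tuple):
        return random_expr(rng)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def evolve(target, generations=200, pop_size=30, seed=1):
    rng = random.Random(seed)
    pop = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda e: fitness(e, target))
        survivors = pop[: pop_size // 2]          # keep the fitter half
        pop = survivors + [mutate(rng.choice(survivors), rng)
                           for _ in survivors]    # refill with mutants
    return min(pop, key=lambda e: fitness(e, target))

best = evolve(lambda x: x * x + 1)
```

Real GP systems add crossover between trees, much larger populations, and richer primitives, but the loop is recognizably the same: generate, measure, select, vary.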
The Machine behind the Curtain
In “The Wizard of Oz”, Dorothy and her loyal companions are very impressed with the magic they encounter upon finally entering the hall of the wonderful wizard. Then, they discover that it’s not magic at all – all the amazing effects are produced by machines operated by a little man hiding behind a curtain. The man – the wizard of Oz, of course – tries to keep up the pretense by shouting into his microphone “pay no attention to the man behind the curtain”. The evidence of creativity presented above indeed pays no attention to the machine behind the curtain.
And here’s the problem: when we do pull the curtain aside and examine the machine that has just now provided us with some creative art, math, or science, we don’t detect any component of “creative magic”. There isn’t any magic. There is no secret ingredient. There are only algorithms, data processing, and numbers (some of them random) being moved around. We have been cheated. There is nobody home – least of all somebody creative.
For example, Minsky commented regarding his automated geometrical proof engine: “What was interesting is that this was found after a very short search – because, after all, there weren’t many things to do. You might say the program was too stupid to do what a person might do, that is, think, ‘Oh, those are both the same triangle. Surely no good could come from giving it two different names’ ”. For larger and more complex geometric proofs, the software laboriously examines each pair of triangles in the problem, checking for each pair whether they can be proven to be congruent (e.g. if all sides are equal). If such a pair is found, the algorithm adds facts about the mathematical problem into its database. For example, if triangles ABC and DEF are congruent, the fact that the length of side BC equals that of side EF (and several other facts) is added. This may allow other pairs of triangles to be identified as congruent, and so on. When additional theorems are added to the algorithm’s database, so that the computer improves its “understanding” and looks for other things beyond congruent triangles, the algorithm’s power increases. Still, this is indeed an uninspired, uninspiring, brute-force process.
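The brute-force flavor of this process can be illustrated with a toy “forward chaining” loop: rules are applied to a set of known facts until no new facts appear. The encoding below is my own drastic simplification – a real prover would generate candidate triangle pairs itself and handle all the congruence criteria, not just a hard-coded, orientation-sensitive SSS check.

```python
def forward_chain(facts, rules):
    """Repeatedly apply rules until no new facts appear (a fixpoint)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(facts) - facts
            if new:
                facts |= new
                changed = True
    return facts

# Toy fact encoding: ("seg=", "AB", "DE") = segments AB and DE are equal;
# ("cong", "ABC", "DEF") = triangles ABC and DEF are congruent.
def sss_rule(facts):
    """If all three side pairs are known equal, conclude congruence (SSS)."""
    derived = set()
    for t1, t2 in [("ABC", "DEF")]:       # candidate pairs (hard-coded toy)
        sides1 = [t1[0]+t1[1], t1[1]+t1[2], t1[0]+t1[2]]
        sides2 = [t2[0]+t2[1], t2[1]+t2[2], t2[0]+t2[2]]
        if all(("seg=", a, b) in facts for a, b in zip(sides1, sides2)):
            derived.add(("cong", t1, t2))
    return derived

def cong_angles_rule(facts):
    """Congruent triangles have equal corresponding angles."""
    derived = set()
    for f in facts:
        if f[0] == "cong":
            t1, t2 = f[1], f[2]
            for i in range(3):
                derived.add(("ang=", t1[i], t2[i]))
    return derived

facts = forward_chain(
    {("seg=", "AB", "DE"), ("seg=", "BC", "EF"), ("seg=", "AC", "DF")},
    [sss_rule, cong_angles_rule],
)
```

Each derived fact (here, the congruence and the three equal-angle facts) goes back into the database, exactly as the column describes, possibly enabling further derivations on the next pass.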
After Minsky’s work, such algorithms have been improved to guide their search efforts by using methods such as working backwards from the claim that needs to be proven, and by using heuristics – in effect, guidelines for what kinds of knowledge should be applied and in what order, in order to accelerate the search. Again, if we were to follow in great detail what the machine was doing at each moment while it was generating the proof that we found so impressive, we wouldn’t be impressed at all. Knowing how the magic is done, in this case as in many others, takes away the magic. Similarly, though the Cybernetic Poet’s algorithms are quite sophisticated, they boil down to a repetitive sequence of counting words, estimating frequencies of word combinations, and searching for patterns similar to the current fragment being written – not even a respectable caricature of human creativity, right?
Pay Attention to the Machine inside the Man
For some people, the fact that the creative spark vanishes when we examine the machine more closely serves as the conclusive proof that the spark was never really there. The opposite stance is the one taken by advocates of “strong AI”. In their eyes, if we could expose what happens in the mind while we’re being creative, we will again fail to find the creative core, the place and time where the magic happens. Search as we may, we’ll see nothing more than neurons firing, activating other neurons, and so on, ad infinitum. The better our research tools are, the more certain it becomes that the mind is made of mindless components, that the magic spark is not present in any single component but in the way the components work together. It is an emergent phenomenon, which, if we agree with the strong AI principle, can be duplicated by man-made machines. After all, this process already occurs in “natural machines” – ourselves.
AI researchers aren’t the only people trying to peer behind the curtain. Neurologists and psychologists have been using the power of devices such as fMRI (functional magnetic resonance imaging) to discover how the brain goes about the business of being a mind. Among many discoveries in this field, several relate to the “Eureka Moment” – the sudden “Aha!” sensation we get when the solution to a problem becomes obvious. One study found that during this moment, there was increased activity in the anterior superior temporal gyrus of the right hemisphere. Yet it would be a mistake to assume that this small part of the brain, composed of neurons just like any other part, houses the “insight machine”. Instead, the researchers assume that this part “helps pull together distantly related information that might otherwise be missed”. This is a good description of insight expressed in mechanical terms.
Popular science author Steven Johnson reports an intriguing, but less scientifically-rigorous, case in his book “Mind Wide Open”: his brain was monitored on fMRI while he was planning how to express some ideas for a chapter in the book. This creative act was expressed in the brain by increased activity in only one location (apart from the areas known to be involved whenever we think about written text): the medial frontal gyrus, whose functions include the “orchestration” of other parts of the brain. This may seem strange: shouldn’t a creative process that draws on many parts of the brain, and results in something completely new, require forming connections which were never made before? Johnson offers a possible explanation: maybe the act of creativity requires a moment when all input reaches one location, while all else is quiet. These findings, and many others, will eventually help us understand how the brain generates wonderful ideas, which we perceive as evidence of the superior human mind and creativity.
Storming the Stronghold From Within
So, have computers shown creativity? It depends on your definitions. Even more than that, it depends on whether you’re willing to accept the possibility that a machine, with its accessible and understandable internal workings, can in principle be considered creative. The examples mentioned in this column aren’t the only ones, of course. To mention just one more, recent chess software has been praised by human chess grandmasters as playing in a highly creative style (see earlier column “Do we think they think?”). And yet, all this creativity, if we’re willing to agree that it is indeed creativity, can only be applied very narrowly: the chess programs cannot write poetry, the poetry software cannot prove mathematical theorems, etc. At least one more breakthrough is required in the field of AI, and it’s a very big one: reaching general-purpose capabilities. Humans have them, at least to some extent; although not all poets are comfortable with mathematics or with chess, they’re likely to be able to be creative in areas other than poetry, at least to the extent to which we’re all much more creative than any machine existing today.
Will this breakthrough ever happen? I believe that it will. What would it take to achieve it? Machines aren’t yet ready to generate the breakthroughs required to raise them to human levels of creativity. They cannot create a design for a creative machine precisely because such a design almost certainly requires the kind of general-purpose creativity that machines currently lack. Thus, humanity’s last stronghold, the walls of which separate our capabilities from machine capabilities, will only fall if we – scientists, programmers, philosophers, content creators, and (for all we know) poets and musicians – do our best to open its gates.
Euclid’s proof of the claim: “In isosceles triangles the angles at the base equal one another” (circa 300 BC):
A somewhat informal presentation of the proof: Given AB=AC. Extend sides AB and AC by equal-length segments BD and CE, and draw the segments BE and CD. Since AB=AC (given) and BD=CE (by construction), AD=AE. Triangles ACD and ABE are congruent (AC=AB, AD=AE, common angle A). Therefore angles ABE and ACD are equal, and BE=CD. From the second fact we can now show that triangles BCE and CBD are congruent (shared side BC, BD=CE, BE=CD). Therefore angles CBE and DCB are equal. Angle subtraction gives us ABC=ABE-CBE, and ACB=ACD-DCB. Since the right-hand sides are equal, angles ABC and ACB are equal, and the proof is done.
Historical note: This is one of the very few propositions in Euclid’s Elements which has received its own name: Pons Asinorum, which is Latin for “Bridge of Asses”, and has become a generic term for “hurdle to learning”. The origin of the name is unclear – sometimes it is taken to refer to the difficulty of the proof, meaning that it is a bridge to more advanced topics and this bridge can’t be crossed by fools. In today’s high-school geometry textbooks, the theorem is given after more concepts and theorems are introduced. Therefore these texts give simpler proofs than Pons Asinorum.
Pappus’ proof of the same claim (circa 340 AD), re-discovered in 1956 by automated proof process:
Given AB=AC. Consider triangles ABC and ACB: though they seem to be the same triangle, they are mirror images of each other. They are congruent (AB=AC, BC is shared side, and AC=AB). Therefore angles ABC and ACB are equal, and the proof is done.
Note: The conceptual jump in this proof is the need to understand that the same triangle may be viewed as different triangles, depending on the sequence of points which we use to specify the triangle.
About the author: Israel Beniaminy has a Bachelor’s degree in Physics and Computer Science, and a Graduate degree in Computer Science. He develops advanced optimization techniques at ClickSoftware technologies, and has published academic papers on numerical analysis, approximation algorithms and artificial intelligence, as well as articles on fault isolation, service management and optimization in industry magazines.
Acknowledgement: Some of this text has previously appeared, in different form, in articles by this author which were published in Galileo, the Israeli Magazine of Science and Thought, and appears here with the kind permission of Galileo Magazine.