In this post I lay out the first part of a mathematical foundation for comparing the capabilities of machines with those of humans. This is not intended as a formal proof that machines can be more intelligent than humans, and I have made no attempt at mathematical rigor. The purpose is simply to focus attention on existing mathematical concepts in the context of the ongoing discussion about the metrics we use to measure intelligence. I will later use this to show why I believe that machines will far outperform us no matter what criteria we adopt in measuring intelligence.
The notion of intelligence and intelligent behavior is an abstract one, and attempts to quantify intelligence in humans, animals, or other entities, living or non-living, have fallen short of expectations. Mathematics has long served as a powerful tool for abstracting concepts and manipulating them in useful ways. We have successfully deployed it in domains where we have a solid understanding, such as chemistry, as well as in domains whose true nature we are still discovering, such as quantum field theory.
Perhaps the most famous example of using mathematics to represent concepts that were not yet understood is Einstein’s theories of relativity. Here mathematics was used to build a theory that predicted the existence of phenomena that were formerly unknown. By representing the world around us in abstract mathematical terms, we can use the rich set of mathematical tools at our disposal to test hypotheses and follow them to their logical conclusions.
Representing Humans with Vectors
I begin by introducing a simple concept used throughout business and science alike: the vector. A vector is an ordered list of numbers treated as a single unit. Although vectors can be manipulated and compared independently of any association with real-world values, for our purposes we will assume that they represent something ‘real’. A simple example makes it easier to introduce some concepts typical of vectors. For example, we might use a vector to represent a person with respect to two values such as height and weight. Most of us are familiar with graphing values on X and Y axes, so we will use this to review some simple concepts before moving on to more useful examples. Let’s start with a vector to represent a person (P1) with a height of 72 inches and a weight of 220 pounds, and another person (P2) with a height of 76 inches and a weight of 180 pounds, represented as:
P1 = [72, 220] P2 = [76, 180]
These two vectors can be drawn as lines from the origin in 2-dimensional space, as shown below.
We can easily see that P1 is greater in terms of weight and that P2 is greater in height. We can also see that, taking the two attributes combined (if we assume that height and weight are equally important), P1 is greater than P2 because the line representing P1 is longer than the line representing P2. This length, measured from the origin to the endpoint, is known as the magnitude of a vector. We say that the magnitude of P1 is greater than the magnitude of P2.
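The length of each line is the usual Euclidean length: the square root of the sum of the squared components. A minimal Python sketch of that calculation, using the P1 and P2 values above (the function name `magnitude` is my own choice for illustration):

```python
import math

def magnitude(v):
    """Euclidean length of a vector given as a sequence of numbers."""
    return math.sqrt(sum(x * x for x in v))

p1 = [72, 220]  # height in inches, weight in pounds
p2 = [76, 180]

print(round(magnitude(p1), 1))  # 231.5
print(round(magnitude(p2), 1))  # 195.4
```

Because the components are used exactly as given, the larger numbers (weight, in this case) contribute most to the length.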
Now let’s take this comparison and apply it to a situation. Assume we are trying to choose between two athletes for a sports team such as a basketball or football team. In our simple world we only have two data points: height and weight. Also, we will assume that height and weight are equally important on our fantasy sports team. We can see that athlete 1 is heavier at 220 pounds but athlete 2 is taller at 76 inches. But as shown above, athlete 1 has a greater overall magnitude when we combine the two values. So by these very simple criteria athlete 1 is our clear choice. But even in our simple version of the world we would soon realize that there is more to selecting the better athlete than just comparing height and weight, so we begin looking for better criteria to improve our selection process. Perhaps one of the most obvious is strength. There are many ways to measure strength, but since we want to keep our examples simple, we will assume there is some test that gives us a strength index that goes from 0 to 500, where 500 is the strongest human alive. We find that our first athlete has a strength index of 300 and the other a strength index of 450. Our numerical representation of our two athletes now looks like this:
P1 = [72, 220, 300] P2 = [76, 180, 450]
Visually we can represent this in 3-dimensional space as shown below.
Just as in our 2-dimensional example, we can evaluate each vector in terms of magnitude, or overall length of the line shown from the origin to the endpoint. Given this added dimension in the two vectors, we can clearly see that the second athlete, P2, is our better choice.
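The same comparison can be checked numerically. The sketch below assumes the Euclidean length as the magnitude and picks the athlete with the longer vector; the dictionary layout and the names `athletes`, `lengths`, and `best` are illustrative only:

```python
import math

# height (inches), weight (pounds), strength index (0 to 500)
athletes = {
    "P1": [72, 220, 300],
    "P2": [76, 180, 450],
}

def magnitude(v):
    """Euclidean length of a vector of any dimension."""
    return math.sqrt(sum(x * x for x in v))

lengths = {name: magnitude(v) for name, v in athletes.items()}
best = max(lengths, key=lengths.get)  # name with the largest magnitude

for name, length in lengths.items():
    print(name, round(length, 1))  # P1 378.9, then P2 490.6
print("choice:", best)  # choice: P2
```

Nothing in `magnitude` depends on the number of components, so the same function works for vectors with more than three data points.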
There are many details and considerations to take into account such as the number of data points beyond our simple 3-dimensional example and how to account for the fact that not all data points should be treated equally. For example, we may want to construct a model where strength is three times as important as weight. I will address these issues and more later on. But for now we have enough to put forth the basis for evaluating the level of intelligence in a human or a machine.
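One simple way to express such a preference, offered here only as an illustrative assumption, is to multiply each component by a weight before taking the length. In this sketch the weights `[1, 1, 3]` are hypothetical and stand for "strength counts three times as much as height or weight"; other weighting schemes are equally possible.

```python
import math

def weighted_magnitude(v, weights):
    """Length of v after scaling each component by its weight (an assumed scheme)."""
    return math.sqrt(sum((w * x) ** 2 for w, x in zip(weights, v)))

# Hypothetical weights: height 1, weight 1, strength 3.
weights = [1, 1, 3]

p1 = [72, 220, 300]
p2 = [76, 180, 450]

# True: P2 still has the larger magnitude under these weights.
print(weighted_magnitude(p1, weights) < weighted_magnitude(p2, weights))
```

With all weights set to 1 this reduces to the unweighted magnitude used in the earlier examples.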
If we take our simple example and translate it into the domain of intelligence, we might choose to evaluate the three criteria previously suggested in Criteria for Intelligence: speed intelligence, collective intelligence, and quality intelligence. Just as we simplified the notion of the athlete’s strength into a strength index, we will assume we have come up with a method of representing these three rather complex aspects of intelligence with a numerical index. For brevity’s sake, I will refer to the three values, which correspond to the three different types of intelligence, as speed, breadth, and quality. At this point we haven’t established any real meaning for these measurements, so the values are arbitrary. Using the same notation as before to represent these values as a vector for two different people, we have:
P1 = [72, 220, 300] P2 = [76, 180, 450]
Once again we can represent the intelligence of our two subjects in 3-dimensional space as shown below:
By the given criteria, our second subject, P2, is the more intelligent overall, as shown by the greater magnitude of its vector.
The above examples show, albeit in a very simplistic manner, a straightforward and well-defined way of evaluating multiple characteristics in order to compare two or more subjects using the mathematical structure known as a vector. This concept, along with other mathematical tools which I will introduce later on, has been used for hundreds of years in diverse fields ranging from mathematics and physics to business and finance. I intend to apply these resources in the ongoing effort to explore and ultimately give some clarity to the many questions that fall under the general heading of “What is Intelligence?”, as well as to further my case for the ultimate superiority of machines in the not so distant future.