The intermediate value theorem

20 January, 2016 - 10:05

I asserted that the intermediate value theorem was really more a statement about the (real or hyperreal) number system than about functions. For insight, consider figure b, which is a geometrical construction that constitutes the proof of the very first proposition in Euclid’s celebrated Elements. The proposition to be proved is that given a line segment AB, it is possible to construct an equilateral triangle with AB as its base. The proof is by construction; that is, Euclid doesn’t just give a logical argument that convinces us the triangle must exist, he actually demonstrates how to construct it. First we draw a circle with center A and radius AB, which his third postulate says we can do. Then we draw another circle with the same radius, but centered at B. Pick one of the intersections of the circles and call it C. Construct the line segments AC and BC (postulate 1). Then AC equals AB by the definition of the circle, and likewise BC equals AB. Euclid also has an axiom that things equal to the same thing are equal to one another, so it follows that AC equals BC, and therefore the triangle is equilateral.

Figure 9.11 b / A proof from Euclid’s Elements.

It seems like a model of mathematical rigor, but there’s a flaw in the reasoning, which is that he assumes without justification that the circles do have a point in common. To see that this is not as secure an assumption as it seems, consider the usual Cartesian representation of plane geometry in terms of coordinates (x, y). Usually we assume that x and y are real numbers. What if we instead do our Cartesian geometry using rational numbers as coordinates? Euclid’s five postulates are all consistent with this. For example, circles do exist. Let A = (0, 0) and B = (1, 0). Then there are infinitely many pairs of rational numbers in the set that satisfies the definition of the circle centered at A. Examples include (3/5, 4/5) and (7/25, 24/25). The circle is also continuous in the sense that if I specify a point on it such as (7/25, 24/25), and a distance that I’m allowed to make as small as I please, say 10^{-6}, then other points exist on the circle within that distance of the given point. However, the intersection assumed by Euclid’s proof doesn’t exist. It would lie at (1/2, \sqrt{3}/2), but \sqrt{3} doesn’t exist in the rational number system.
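The claims about rational points on the circle centered at A can be checked with exact arithmetic. The following is a minimal sketch, not part of Euclid’s or the text’s argument: it assumes the standard rational parametrization of the unit circle by a parameter t, and the helper names circle_point and on_unit_circle are made up for this illustration.

```python
from fractions import Fraction

def circle_point(t):
    """Rational point on x^2 + y^2 = 1 obtained from a rational parameter t."""
    t = Fraction(t)
    d = 1 + t * t
    return (1 - t * t) / d, 2 * t / d

def on_unit_circle(p):
    """Exact test that a point with rational coordinates lies on the circle."""
    x, y = p
    return x * x + y * y == 1

given = (Fraction(7, 25), Fraction(24, 25))

# The two examples from the text really are on the circle.
print(on_unit_circle((Fraction(3, 5), Fraction(4, 5))))
print(on_unit_circle(given))

# t = 3/4 reproduces (7/25, 24/25); nudging t slightly gives a different
# rational point on the circle that lies very close to it.
nearby = circle_point(Fraction(3, 4) + Fraction(1, 10**7))
dist_sq = (nearby[0] - given[0]) ** 2 + (nearby[1] - given[1]) ** 2
print(on_unit_circle(nearby), dist_sq < Fraction(1, 10**12))
```

Comparing the squared distance with 10^{-12} avoids taking a square root, so every step stays inside the rational numbers.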

In exactly the same way, we can construct counterexamples to the intermediate value theorem if the underlying system of numbers doesn’t have the same properties as the real numbers. For example, let y = x^{2}. Then y is a continuous function on the interval from 0 to 1, but if we take the rational numbers as our foundation, then there is no x for which y = 1/2. The solution would be x = 1/\sqrt{2}, which doesn’t exist in the rational number system. Notice the similarity between this problem and the one in Euclid’s proof. In both cases we have curves that cut one another without having an intersection. In the present example, the curves are the graphs of the functions y = x^{2} and y = 1/2.
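As a rough illustration of this counterexample, one can search the fractions p/q between 0 and 1 with bounded denominator and look for an exact solution of x^{2} = 1/2. This is only a finite search, so it illustrates the claim rather than proving it; the function name search_rationals and the bound of 200 are arbitrary choices made for this sketch.

```python
from fractions import Fraction

def search_rationals(max_den):
    """Scan fractions p/q in [0, 1] with q <= max_den for x**2 == 1/2.

    Returns the list of exact solutions found and the closest near miss.
    """
    target = Fraction(1, 2)
    exact = []
    best_x, best_err = None, None
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            x = Fraction(p, q)
            err = abs(x * x - target)
            if err == 0:
                exact.append(x)
            if best_err is None or err < best_err:
                best_x, best_err = x, err
    return exact, best_x, best_err

exact, best_x, best_err = search_rationals(200)
print(exact)                    # [] : no exact rational solution turns up
print(best_x, float(best_err))  # a close miss, but the error is never zero
```

Raising the bound produces closer and closer misses, which is the sense in which the graphs cut one another without sharing a rational point.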

The interpretation is that the real numbers are in some sense more densely packed than the rationals, and with two thousand years’ worth of hindsight, we can see that Euclid should have included a sixth postulate that expressed this density property. One possible way of stating such a postulate is the following. Let L be a ray, and O its endpoint. We think of O as the origin of the positive number line. Let P and Q be sets of points on L such that every point in P is closer to O than every point in Q. Then there exists some point Z on L such that Z lies at least as far from O as every point in P, but no farther than any point in Q. Technically this property is known as completeness. As an example, let P = {x | x^{2} < 2} and Q = {x | x^{2} ≥ 2}. Then the point Z would have to be \sqrt{2}, which shows that the rationals are not complete. The reals are complete, and the completeness axiom can serve as one of the fundamental axioms of the real numbers.
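To see concretely why no rational Z can separate P from Q in this example, the sketch below uses a classical algebraic trick that is not taken from the text: for a positive rational z, the number z' = (2z + 2)/(z + 2) lies strictly between z and \sqrt{2}, because z' - z = (2 - z^{2})/(z + 2) and z'^{2} - 2 = 2(z^{2} - 2)/(z + 2)^{2}. The function name better_candidate is a label invented for this illustration.

```python
from fractions import Fraction

def better_candidate(z):
    """Given a positive rational z, return a rational strictly between z and sqrt(2)."""
    z = Fraction(z)
    return (2 * z + 2) / (z + 2)

# A candidate Z taken from P (Z^2 < 2) is beaten by a larger member of P.
z = Fraction(7, 5)
w = better_candidate(z)
print(w, w > z, w * w < 2)   # 24/17 True True

# A candidate Z taken from Q (Z^2 > 2) is beaten by a smaller member of Q.
z = Fraction(3, 2)
w = better_candidate(z)
print(w, w < z, w * w > 2)   # 10/7 True True
```

Since every positive rational has a square either below or above 2, each rational candidate for Z is ruled out by one of these two cases.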

Note that the axiom refers to sets P and Q, and says that a certain fact is true for any choice of those sets; it therefore isn’t the type of proposition that is covered by the transfer principle, and in fact it fails for the hyperreals, as we can see if P is the set of all infinitesimals and Q the positive real numbers.

Here is a skeletal proof of the intermediate value theorem, in which I’ll make some simplifying assumptions and leave out some cases. We want to prove that if y is a continuous real-valued function on the real interval from a to b, and if y takes on values y_{1} and y_{2} at certain points x_{1} and x_{2} within this interval, then for any y_{3} between y_{1} and y_{2}, there is some real x in the interval for which y(x) = y_{3}. I’ll assume the case in which x_{1} < x_{2} and y_{1} < y_{2}. Define sets of real numbers P = {x | y ≤ y_{3}}, and let Q = {x | y ≥ y_{3}}. For simplicity, I’ll assume that every member of P is less than or equal to every member of Q, which happens, for example, if the function y(x) is always increasing on the interval [a, b]. If P and Q intersect, then the theorem holds. Suppose instead that P and Q do not intersect. Using the completeness axiom, there exists some real x which is greater than or equal to every element of P and less than or equal to every element of Q. Suppose x belongs to P. Then the following statement is in the right form for the transfer principle to apply to it: for any number x' > x, y(x') > y_{3}. We can conclude that the statement is also true for the hyperreals, so that if dx is a positive infinitesimal and x' = x + dx, we have y(x) < y_{3}, but y(x + dx) > y_{3}. Then by continuity, y(x) - y(x + dx) is infinitesimal. But y(x) < y_{3} and y(x + dx) > y_{3}, so the standard part of y(x) must equal y_{3}. By assumption y takes on real values for real arguments, so y(x) = y_{3}. The same reasoning applies if x belongs to Q, and since x must belong either to P or to Q, the result is proved.
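The separating point that the completeness axiom provides can be approximated numerically. The sketch below is a floating-point illustration, not the proof itself, and it leans on the same simplifying assumption that y is increasing on [a, b], with y(a) ≤ y_{3} ≤ y(b). It repeatedly halves the interval while keeping its left end in P and its right end in Q. The function name cut_point and the step count are assumptions made for the example.

```python
def cut_point(y, a, b, y3, steps=60):
    """Squeeze an interval around the boundary between
    P = {x : y(x) <= y3} and Q = {x : y(x) >= y3}.

    Assumes y is increasing on [a, b] and y(a) <= y3 <= y(b).
    """
    lo, hi = a, b              # invariant: lo is in P, hi is in Q
    for _ in range(steps):
        mid = (lo + hi) / 2
        if y(mid) <= y3:
            lo = mid           # mid belongs to P
        else:
            hi = mid           # mid belongs to Q
    return lo, hi

lo, hi = cut_point(lambda x: x * x, 0.0, 1.0, 0.5)
print(lo, hi)     # both ends close in on 1/sqrt(2)
print(lo * lo)    # close to the intermediate value 1/2
```

With real-number inputs the two ends close in on a single point, which is the x of the proof; over the rationals the same squeezing goes on forever without ever producing a member of the number system.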

For an alternative proof of the intermediate value theorem by an entirely different technique, see Keisler (References).

As a side issue, we could ask whether there is anything like the intermediate value theorem that can be applied to functions on the hyperreals. Our definition of continuity explicitly states that it only applies to real functions. Even if we could apply the definition to a function on the hyperreals, the proof given above would fail, since the hyperreals lack the completeness property. As a counterexample, let ε be some positive infinitesimal, and define a function y such that y = -ε when st(x) ≤ 0 and y = ε everywhere else. If we insist on applying the definition of continuity to this function, it appears to be continuous, since its value never changes by more than the infinitesimal amount 2ε, yet it never takes on the intermediate value 0, so it violates the intermediate value theorem. Note, however, that the way this function is defined is different from the way we usually define functions on the hyperreals. Usually we define a function on the reals, say y = x^{2}, in language to which the transfer principle applies, and then we use the transfer principle to reason about the function’s analog on the hyperreals. For instance, the function y = x^{2} has the property that y ≥ 0 everywhere, and the transfer principle guarantees that that’s also true if we take y = x^{2} as the definition of a function on the hyperreals.

For functions defined in this way, the intermediate value theorem makes a statement that the transfer principle applies to, and it is therefore true for the hyperreal version of the function as well.