FormalPara Key Topics

Mathematical Induction

Strong and weak Induction

Base Case

Inductive Step

Recursion

Recursive Definition

Structural Induction

6.1 Introduction

Mathematical induction is an important proof technique used in mathematics, and it is often used to establish the truth of a statement for all natural numbers. There are two parts to a proof by induction, and these are the base step and the inductive step. The base case involves showing that the statement is true for some natural number (usually the number 1). The second step is termed the inductive step, and it involves showing that if the statement is true for some natural number n = k, then the statement is true for its successor n = k + 1. This is often written as P(k) → P(k + 1).

The statement P(k) that is assumed to be true when n = k is termed the inductive hypothesis. From the base step and the inductive step, we infer that the statement is true for all natural numbers (that are greater than or equal to the number specified in the base case). Formally, the proof technique used in mathematical induction is of the formFootnote 1:

$$(P(1) \wedge (P(k) \to P(k + 1)) \to \forall nP(n).$$

Mathematical induction (weak induction) may be used to prove a wide variety of theorems and especially theorems of the form ∀n P(n). It may be used to provide a proof of theorems about summation formulae, inequalities, set theory, and the correctness of algorithms and computer programs. One of the earliest inductive proofs was the sixteenth-century proof that the sum of the first n odd integers is n2, which was proved by Francesco Maurolico in 1575. Later mathematicians made the method of mathematical induction more precise.

We distinguish between strong induction and weak induction, where strong induction also has a base case and an inductive step, but the inductive step is a little different. It involves showing that if the statement is true for all natural numbers less than or equal to an arbitrary number k, then the statement is true for its successor + 1. Weak induction involves showing that if the statement is true for some natural number n = k, then the statement is true for its successor n = + 1. Structural induction is another form of induction, and this mathematical technique is used to prove properties about recursively defined sets and structures.

Recursion is often used in mathematics to define functions, sequences, and sets. However, care is required with a recursive definition to ensure that it actually defines something, and that what is defined makes sense. Recursion defines a concept in terms of itself, and we need to ensure that the definition is not circular (i.e., that it does not lead to a vicious circle).

Recursion and induction are closely related and are often used together. Recursion is extremely useful in developing algorithms for solving complex problems, and induction is a useful technique in verifying the correctness of such algorithms.

Example 6.1

Show that the sum of the first n natural numbers is given by the formula:

$$ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2} $$

Proof

Base Case

We consider the case where n = 1 and clearly \(1 = \frac{1(1 + 1)}{2}\) and so the base case P(1) is true.

Inductive Step

Suppose the result is true for some number k, then we have P(k)

$$ 1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2} $$

Then consider the sum of the first k+1 natural numbers, and we use the inductive hypothesis to show that its sum is given by the formula.

$$ \begin{aligned} & 1 + 2 + 3 + \cdots + k + (k + 1) \\ & \quad = \frac{k(k + 1)}{2} + (k + 1)\quad {\text{(by}}\,{\text{inductive}}\,{\text{hypothesis)}} \\ & \quad = \frac{k^2 + k}{2} + \frac{(2k + 2)}{2} \\ & \quad = \frac{k^2 + 3k + 2}{2} \\ & \quad = \frac{(k + 1)(k + 2)}{2} \\ \end{aligned} $$

Thus, we have shown that if the formula is true for an arbitrary natural number k, then it is true for its successor + 1. That is, P(k) → P(k+1). We have shown that P(1) is true, and so it follows from mathematical induction that P(2), P(3), …. are true, and so P(n) is true, for all natural numbers and the theorem is established.

Note 6.1

There are opportunities to make errors in proofs with induction, and the most common mistakes are not to complete the base case or inductive step correctly. These errors can lead to strange results, and so care is required. It is important to be precise in the statements of the base case and inductive step.

Example 6.2

(Binomial Theorem) Prove the binomial theorem using induction (permutations and combinations are discussed in Chap. 8). That is,

$$ (1 + x)^n = 1 + \left( {\begin{array}{*{20}c} n \\ 1 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} n \\ 2 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}l} n \hfill \\ r \hfill \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}l} n \hfill \\ n \hfill \\ \end{array} } \right)x^n $$

Proof

Base Case

We consider the case where n = 1 and clearly \((1 + x)^1 = (1 + x) = 1 + \left( {\begin{array}{*{20}l} 1 \hfill \\ 1 \hfill \\ \end{array} } \right)x^1\), and so the base case P(1) is true.

Inductive Step

Suppose the result is true for some number k, then we have P(k)

$$ (1 + x)^k = 1 + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} k \\ 2 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}c} k \\ r \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}c} k \\ k \\ \end{array} } \right)x^k $$

Then consider (1 + x)k+1 and we use the inductive hypothesis to show that it is given by the formula.

$$ \begin{aligned} (1 + x)^{k + 1} & = (1 + x)^k (1 + x) \\ & = \left( {1 + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} k \\ 2 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}c} k \\ r \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}c} k \\ k \\ \end{array} } \right)x^k } \right)\;\;\ (1 + x) \\ & = \left( {1 + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} k \\ 2 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}l} k \hfill \\ r \hfill \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}l} k \hfill \\ k \hfill \\ \end{array} } \right)x^k } \right) \\ & \quad + x + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}c} k \\ r \\ \end{array} } \right)x^{r + 1} + \cdots + \left( {\begin{array}{*{20}c} k \\ k \\ \end{array} } \right)x^{k + 1} \\ & = 1 + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} k \\ 2 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}l} k \hfill \\ r \hfill \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}l} k \hfill \\ k \hfill \\ \end{array} } \right)x^k \\ & \quad + \left( {\begin{array}{*{20}c} k \\ 0 \\ \end{array} } \right)x + \left( {\begin{array}{*{20}c} k \\ 1 \\ \end{array} } \right)x^2 + \cdots + \left( {\begin{array}{*{20}c} k \\ {r - 1} \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}c} k \\ {k - 1} \\ \end{array} } \right)x^k + \left( {\begin{array}{*{20}l} k \hfill \\ k \hfill \\ \end{array} } \right)x^{k + 1} \\ & = 1 + \left( {\begin{array}{*{20}c} {k + 1} \\ 1 \\ \end{array} } \right)x + \cdots + \left( {\begin{array}{*{20}c} {k + 1} \\ r \\ \end{array} } \right)x^r + \cdots + \left( {\begin{array}{*{20}c} {k + 1} \\ k \\ \end{array} } \right)x^k + \left( {\begin{array}{*{20}l} {k + 1} \hfill \\ {k + 1} \hfill \\ \end{array} } \right)x^{k + 1} \\ \end{aligned} $$

(which follows from Exercise 7 below).

Thus, we have shown that if the binomial theorem is true for an arbitrary natural number k, then it is true for its successor k+1. That is, P(k) → P(k+1). We have shown that P(1) is true, and so it follows from mathematical induction that P(n) is true, for all natural numbers, and so the theorem is established.

The standard formula of the binomial theorem (x + y)n follows immediately from the formula for (1 + x)n, by noting that (x + y)n = {x(1 + y/x)}n = xn(1 + y/x)n.

6.2 Strong Induction

Strong induction is another form of mathematical induction, which is often employed when we cannot prove a result with (weak) mathematical induction. It is similar to weak induction in that there is a base step and an inductive step. The base step is identical to weak mathematical induction, and it involves showing that the statement is true for some natural number (usually the number 1). The inductive step is a little different, and it involves showing that if the statement is true for all natural numbers less than or equal to an arbitrary number k, then the statement is true for its successor + 1. This is often written as (P(1) ∧ P(2) ∧ … ∧ P(k)) → P(+ 1).

From the base step and the inductive step, we infer that the statement is true for all natural numbers (that are greater than or equal to the number specified in the base case). Formally, the proof technique used in mathematical induction is of the formFootnote 2:

$$(P(1) \wedge [(P(1) \wedge P(2) \wedge \cdots \wedge P(k)) \to P(k + 1)]) \to \forall nP(n).$$

Strong and weak mathematical induction are equivalent in that any proof done by weak mathematical induction may also be considered a proof using strong induction, and a proof conducted with strong induction may also be converted into a proof using weak induction.

Weak mathematical induction is generally employed when it is reasonably clear how to prove P(+ 1) from P(k), with strong mathematical typically employed where it is not so obvious. The validity of both forms of mathematical induction follows from the well-ordering property of the natural numbers, which states that every non-empty set has a least element.

Well-Ordering Principle

Every non-empty set of natural numbers has a least element. The well-ordering principle is equivalent to the principle of mathematical induction.

Example 6.3

Show that every natural number greater than one is divisible by a prime number.

Proof

Base Case

We consider the case of n = 2 which is trivially true, since 2 is a prime number and is divisible by itself.

Inductive Step (strong induction)

Suppose that the result is true for every number less than or equal to k. Then we consider + 1, and there are there are two cases to consider. If + 1 is prime, then it is divisible by itself. Otherwise it is composite, and it may be factored as the product of two numbers each of which is less than or equal to k. Each of these numbers is divisible by a prime number by the strong inductive hypothesis, and so + 1 is divisible by a prime number.

Thus, we have shown that if all natural numbers less than or equal to k are divisible by a prime number, then + 1 is divisible by a prime number. We have shown that the base case P(2) is true, and so it follows from strong mathematical induction that every natural numbers greater than one is divisible by some prime number.

6.3 Recursion

Some functions (or objects) used in mathematics (e.g., the Fibonacci sequence) are difficult to define explicitly and are best defined by a recurrence relation: (i.e., an equation that recursively defines a sequence of values, once one or more initial values are defined). Recursion may be employed to define functions, sequences, and sets.

There are two parts to a recursive definition, namely the base case and the recursive step. The base case usually defines the value of the function at n = 0 or n = 1, whereas the recursive step specifies how the application of the function to a number may be obtained from its application to one or more smaller numbers.

It is important that care is taken with the recursive definition, to ensure that that it is not circular, and does not lead to an infinite regress. The argument of the function on the right-hand side of the definition in the recursive step is usually smaller than the argument on the left-hand side to ensure termination (there are some unusual recursively defined functions such as the McCarthy 91 function where this is not the case).

It is natural to ask when presented with a recursive definition whether it means anything at all, and in some cases, the answer is negative. Fixed-point theory provides the mathematical foundations for recursion and ensures that the functions/objects are well defined.

Chapter 12 (see Sect. 12.6) discusses various mathematical structures such as partial orders, complete partial orders, and lattices, which may be employed to give a secure foundation for recursion. A precise mathematical meaning is given to recursively defined functions in terms of domains and fixed-point theory, and it is essential that the conditions in which recursion may be used safely be understood. The reader is referred to [1] for more detailed information.

A recursive definition will include at least one non-recursive branch with every recursive branch occurring in a context that is different from the original and brings it closer to the non-recursive case. Recursive definitions are a powerful and elegant way of giving the denotational semantics of language constructs.

Next, we present examples of the recursive definition of the factorial function and Fibonacci numbers.

Example 6.4

(Recursive Definition of Functions) The factorial function n! is very common in mathematics, and its well-known definition is n! = n(− 1)(− 2) … 3.2.1 and 0! = 1. The formal definition in terms of a base case and inductive step is given as follows:

Base Step:

fac (0) = 1

Recursive Step:

fac (n) = n * fac(− 1)

This recursive definition defines the procedure by which the factorial of a number is determined from the base case, or by the product of the number by the factorial of its predecessor. The definition of the factorial function is built up in a sequence: fac(0), fac(1), fac(2), …

The Fibonacci sequenceFootnote 3 is named after the Italian mathematician Fibonacci, who introduced it in the 13th century. It had been previously described in Indian mathematics, and the Fibonacci numbers are the numbers in the following integer sequence:

$$ 1,1,2,3,5,8,13,21,34 $$

Each Fibonacci number (apart from the first two in the sequence) is obtained by adding the two previous Fibonacci numbers in the sequence together. Formally, the definition is given by

Base Step:

F1 = 1, F2 = 1

Recursive Step:

Fn = Fn−1 + Fn−2     (Definition for when n > 2)

Example 6.5

(Recursive Definition of Sets and Structures) Sets and sequences may also be defined recursively, and there are two parts to the recursive definition (as before). The base case specifies an initial collection of elements in the set, whereas the recursive step provides rules for adding new elements to the set based on those already there. Properties of recursively defined sets may often be proved by a technique called structural induction.

Consider the subset S of the natural numbers defined by

Base Step:

5 ∈ S

Recursive Step:

For x S then x + 5 ∈ S

Then the elements in S are given by the set of all multiples of 5, as clearly 5 ∈ S; therefore by the recursive step 5 + 5 = 10 ∈ S; 5 + 10 = 15 ∈ S, and so on.

The recursive definition of the set of strings Σ* over an alphabet Σ is given by

Base Step:

Λ ∈ Σ* (Λ is the empty string)

Recursive Step:

For σ ∈ Σ*and v ∈ Σ then σv ∈ Σ*

Clearly, the empty string is created from the base step. The recursive step states that a new string is obtained by adding a letter from the alphabet to the end of an existing string in Σ*. Each application of the inductive step produces a new string that contains one additional character. For example, if Σ = {0, 1}, then the strings in Σ* are the set of bit strings Λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, etc.

We can define an operation to determine the length of a string (len: Σ*→ ℕ) recursively.

Base Step:

len (Λ) = 0

Recursive Step:

len (σv) = len(σ) + 1 (where σ ∈ Σ* and v ∈ Σ)

A binary treeFootnote 4 is a well-known data structure in computer science, and it consists of a root node together with a left and right binary tree. It is defined as a finite set of nodes (starting with the root node), where each node consists of a data value and a link to a left subtree and a right subtree. Recursion is often used to define the structure of a binary tree.

Base Step:

A single node is a binary tree (root)

Recursive Step:

(i) Suppose X and Y are binary trees and x is a node then XxY is a binary tree, where X is the left subtree, Y the right subtree, and x is the new root node.

(ii) Suppose X is a binary tree and x is a node, then xX and Xx are binary trees, which consist of the root node x and a single child left or right subtree.

That is, a binary tree has a root node, and it may have no subtrees; it may consist of a root node with a left subtree only; a root node with a right subtree only; or a root node with both a left and right subtree.

6.4 Structural Induction

Structural induction is a mathematical technique that is used to prove properties about recursively defined sets and structures. It may be used to show that all members of a recursively defined set have a certain property, and there two parts to the proof (as before), namely the base case and the recursive (inductive) step.

The first part of the proof is to show that the property holds for all elements specified in the base case of the recursive definition. The second part of the proof involves showing that if the property is true for all elements used to construct the new elements in the recursive definition, then the property holds for the new elements. From the base case and the recursive step we deduce that the property holds for all elements of the set (structure).

Example 6.6

(Structural Induction) We gave a recursive definition of the subset S of the natural numbers that consists of all multiples of 5. We did not prove that all elements of the set S are divisible by 5, and we use structural induction to prove this.

Base Step :

5 ∈ S (and clearly the base case is divisible by 5)

Inductive Step :

Suppose qS then q = 5 k for some k. From the inductive hypothesis q + 5 ∈ S and q + 5 = 5 k + 5 = 5 (k +1) and so q + 5 is divisible by 5

Therefore, all elements of S are divisible by 5.

6.5 Review Questions

  1. 1.

    Show that 9n + 7 is always divisible by 8.

  2. 2.

    Show that the sum of \(1^2 + 2^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6\).

  3. 3.

    Explain the difference between strong and weak induction.

  4. 4.

    What is structural induction?

  5. 5.

    Explain how recursion is used in mathematics.

  6. 6.

    Investigate the recursive definition of the McCarthy 91 function, and explain how it differs from usual recursive definitions.

  7. 7.

    Show that \(\left( {\begin{array}{*{20}l} r \hfill \\ r \hfill \\ \end{array} } \right) + \left( {\begin{array}{*{20}c} n \\ {r - 1} \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {n + 1} \\ r \\ \end{array} } \right)\)

  8. 8.

    Determine the standard formula for the binomial theorem (x + y)n from the formula for (1 + x)n.

6.6 Summary

Mathematical induction is an important proof technique that is used to establish the truth of a statement for all natural numbers. There are two parts to a proof by induction, and these are the base case and the inductive step. The base case involves showing that the statement is true for some natural number (usually for the number n = 1). The inductive step involves showing that if the statement is true for some natural number n = k, then the statement is true for its successor n = + 1.

From the base step and the inductive step, we infer that the statement is true for all natural numbers (that are greater than or equal to the number specified in the base case). Mathematical induction may be used to prove a wide variety of theorems, such as theorems about summation formulae, inequalities, set theory, and the correctness of algorithms and computer programs.

Strong induction is often employed when we cannot prove a result with (weak) mathematical induction. It also has a base case and an inductive step, where the inductive step is a little different, and it involves showing that if the statement is true for all natural numbers less than or equal to an arbitrary number k, then the statement is true for its successor + 1.

Recursion may be employed to define functions, sequences, and sets in mathematics, and there are two parts to a recursive definition, namely the base case and the recursive step. The base case usually defines the value of the function at n = 0 or n = 1, whereas the recursive step specifies how the application of the function to a number may be obtained from its application to one or more smaller numbers. It is important that care is taken with the recursive definition, to ensure that that it is not circular, and does not lead to an infinite regress.

Structural induction is a mathematical technique that is used to prove properties about recursively defined sets and structures. It may be used to show that all members of a recursively defined set have a certain property, and there are two parts to the proof, namely the base case and the recursive (inductive) step.