2.1 Introduction

To delve into the concept of ‘distance’ in general spaces, it is necessary to first define the notion of a ‘space.’

Definition 2.1

A space is defined to be any nonempty set.

Most textbooks avoid defining a space on its own. Some texts define a ‘space’ as a nonempty set with some additional structure on it, such as a metric space, a linear space or a normed space. The term ‘additional structure’ is a bit vague, especially for readers who have not yet met any such ‘space’. In that sense, the above definition appears more appropriate.

Definition 2.2

Let X be a nonempty  set. A function \(d:X\times X\longrightarrow \mathbb {R}\) is said to be a metric on X if for every \(x,y,z\in X,\) we have

  1. (a)

    \(d(x,y)\ge 0,\) \(d(x,y)= 0\) if and only if \(x=y,\) (positive definiteness)

  2. (b)

    \( d(x,y)=d(y,x) \) and (symmetry)

  3. (c)

    \(d(x,y)\le d(x,z)+d(z,y).\) (triangle inequality)

In this case, (X, d) is called a metric space; we also say that X is a metric space with metric d. If there is no ambiguity about the metric, we simply say that X is a metric space.

Examples 2.3

  1. (a)

    Define \(d(x,y):=|x-y|, \) for all \(x,y\in \mathbb {R}.\) Then d is a metric on \(\mathbb {R},\) known as the usual metric.  

  2. (b)

    Let X be any nonempty set (it could even be the set of letters of the English alphabet) and \(d_c:X\times X\longrightarrow \mathbb {R}\) be defined as follows:

    $$d_c(x,y):=\left\{ \begin{array}{lll} &{} 1 &{}; x\ne y,\\ &{} 0 &{}; x= y. \end{array}\right. $$

    It can be shown that \(d_c\) is a metric on X. In this case, \(d_c\) is said to be the discrete metric on X and \((X,d_c)\) is called the discrete metric space.
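The axioms of Definition 2.2 can be tested by brute force on a finite sample of points. The Python sketch below is only a sanity check on finitely many points, not a proof; the helper name `is_metric_on` and the sample sets are illustrative assumptions that do not come from the text.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    """Check the axioms of Definition 2.2 for d on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                        # non-negativity
            return False
        if (d(x, y) == 0) != (x == y):         # d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:  # triangle inequality
            return False
    return True

def usual(x, y):
    """The usual metric on the reals."""
    return abs(x - y)

def discrete(x, y):
    """The discrete metric d_c."""
    return 0 if x == y else 1

reals = [-2.5, -1.0, 0.0, 0.5, 3.0]
letters = list("abcde")

print(is_metric_on(reals, usual))       # True
print(is_metric_on(letters, discrete))  # True
```

A `False` result exhibits a genuine failure of an axiom, while `True` only says that no failure occurs among the sampled points.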

Example 2.4

Let (X, d) be any metric space and Y be a nonempty subset of X. Then the restriction of d to \(Y\times Y\) is a metric on Y, known as the induced metric. In this case, the metric space Y is called a subspace of X. For example, \(\mathbb {Q}\) is a subspace of \(\mathbb {R}.\)

In the sequel, if X is a nonempty subset of \(\mathbb {R},\) the space X will refer to the metric space X equipped with the usual metric.

Example 2.5

Let \(r\in (0,1]\) and X be the collection of sequences with terms 0 or 1. For any sequences \(x=\{x_n\}, y=\{y_n\}\in X\) such that \(x\ne y,\) define \(n(x,y):=\min \{k:x_k\ne y_k\}\) and

$$\rho _0(x,y):= \left\{ \begin{array}{lll} &{} 0 &{}; x=y,\\ &{} \frac{1}{n(x,y)} &{}; \text{ otherwise } \end{array}\right. \text { and }\rho _r(x,y):= \left\{ \begin{array}{lll} &{} 0 &{}; x=y,\\ &{} r^{n(x,y )}&{}; \text{ otherwise. } \end{array}\right. $$

Then for every \(r\in [0,1],\) the function \(\rho _r\) is a metric on X with

$$\begin{aligned} \rho _r(x,y)\le \max \{\rho _r(x,z),\rho _r(y,z)\}\text { for all }x,y,z\in X. \end{aligned}$$
(2.1)

The above inequality is known as the strong triangle inequality, and a metric that satisfies it is referred to as an ultrametric.

Proof

Note that for \(r=1,\) \(\rho _1\) is the discrete metric on X, which clearly satisfies the inequality (2.1). The symmetry and positive definiteness of each \(\rho _r\) are immediate. The triangle inequality follows from (2.1), since the maximum of two nonnegative numbers does not exceed their sum. Moreover, (2.1) is immediate if any two of x, y, z are equal.

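Assume that \(x=\{x_n\}, y=\{y_n\}\) and \( z=\{z_n\}\) are all different. Then \(x_i=z_i\) for all \(i<n(x,z)\) and \(z_i=y_i\) for all \(i<n(z,y).\) Therefore \(x_i=y_i\) for all \(i<\min \{n(x,z),n(z,y)\},\) and consequently, \(n(x,y)\ge \min \{n(x,z),n(z,y)\}.\) Since \(t\longmapsto 1/t\) and \(t\longmapsto r^t\) (for \(0<r<1\)) are decreasing on \(\mathbb {N},\) it follows that \(\rho _r(x,y)\le \max \{\rho _r(x,z),\rho _r(y,z)\}.\) Hence \(\rho _r\) satisfies (2.1), for all \( r\in [0,1).\)    \(\square \)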
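The strong triangle inequality (2.1) of Example 2.5 can also be checked numerically. The sketch below makes a simplifying assumption: finite 0/1 words of length 4 stand in for infinite sequences, so only finitely many triples are tested. The function names `first_difference`, `rho` and `is_ultrametric` are hypothetical helpers, not notation from the text.

```python
from itertools import product

def first_difference(x, y):
    """n(x, y): the smallest index k (counted from 1) with x_k != y_k."""
    for k, (a, b) in enumerate(zip(x, y), start=1):
        if a != b:
            return k
    raise ValueError("the words agree on all stored terms")

def rho(r, x, y):
    """rho_r for 0 < r <= 1, and rho_0 (that is, 1/n(x, y)) when r == 0."""
    if x == y:
        return 0.0
    n = first_difference(x, y)
    return 1.0 / n if r == 0 else r ** n

# Assumption: finite 0/1 words of length 4 replace infinite sequences.
words = list(product((0, 1), repeat=4))

def is_ultrametric(r):
    """Test inequality (2.1) for rho_r on every triple of sample words."""
    return all(
        rho(r, x, y) <= max(rho(r, x, z), rho(r, y, z))
        for x, y, z in product(words, repeat=3)
    )

print([is_ultrametric(r) for r in (0, 0.5, 1)])   # [True, True, True]
```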

Definition 2.6

Let X be a nonempty set. A function \(d:X\times X\longrightarrow \mathbb {R}\) is said to be a pseudo-metric on X if for every \(x,y,z\in X\) we have

  1. (a)

    \(d(x,y)\ge 0\) and \(d(x,x)= 0,\) (positive semi-definiteness)

  2. (b)

    \( d(x,y)=d(y,x) \) and (symmetry)

  3. (c)

    \(d(x,y)\le d(x,z)+d(z,y).\) (triangle inequality)

Clearly, every metric is a pseudo-metric, while the converse is not true.

Example 2.7

Let \(d(x,y):=|x^2-y^2|\text { for all }x,y\in \mathbb {R}.\) Then d is a pseudo-metric on \(\mathbb {R},\) but not a metric on \(\mathbb {R},\) since \(d(1,-1)=0\) although \(1\ne -1.\)

Remarks 2.8

Some of the requirements in Definition 2.2 are redundant (see Exercise 2.7). An important example of a metric, the Hausdorff metric, will be provided in Exercise 3.67.

2.1.1 The Euclidean Spaces

Note that the standard Euclidean distance in a plane satisfies all the requirements of a metric, making that plane a metric space. The positive definiteness and the symmetry are obvious. We shall provide a proof for the triangle inequality.

Let \(n\in \mathbb {N}.\) The n-dimensional real Euclidean space \(\mathbb {R}^n\) is defined as

$$\mathbb {R}^n:=\{(x_1,\dots , x_n):x_i\in \mathbb {R},1\le i\le n\}.$$

For \(x= (x_1,\dots , x_n), y= (y_1,\dots , y_n)\in \mathbb {R}^n\) and \(r\in \mathbb {R},\) the sum \(x+y,\) the scalar multiple \(rx,\) the modulus \(|x|\) and the dot product \(x\cdot y\) are defined as follows:

$$\begin{aligned} x+y&:=(x_1+y_1,\dots , x_n+y_n),\\ rx&:= (rx_1,\dots , rx_n), \\ |x|&:= \sqrt{x^2_1+\dots + x^2_n}\\ \text { and }x \cdot y&:=x_1y_1+\dots + x_ny_n. \end{aligned}$$

First we present the Cauchy-Schwarz inequality. This is one of the most fundamental inequalities in analysis and has several conceptually different proofs. Here we present the popular one, which can be extended to even more general spaces, namely the inner product spaces (see Theorem 2.33). An alternative proof will be provided in Exercise 2.19.

Theorem 2.9

(Cauchy-Schwarz inequality)  For every \(x,y\in \mathbb {R}^n,\) we have

$$\begin{aligned} |x\cdot y|\le |x| |y|. \end{aligned}$$
(2.2)

In other words, if \(x=(x_1,\dots ,x_n), y=(y_1,\dots ,y_n)\in \mathbb {R}^n,\) then

$$ |x_1y_1+\dots + x_ny_n|\le \sqrt{x_1^2+\dots +x_n^2}\sqrt{y_1^2+\dots +y_n^2}.$$

Further, the equality holds if and only if x and y are linearly dependent, that is, there exist real numbers a and b not both zero such that \(ax +by=0.\)

Proof

Consider \(z:=|y|^2 x- (x\cdot y) y\) and observe that

$$\begin{aligned} 0\le |z|^2 =z\cdot z =\big (|y|^2 x- (x\cdot y) y\big )\cdot \big (|y|^2 x- (x\cdot y) y\big )=|y|^2 \big (|x|^2 |y|^2 -| x\cdot y|^2\big ). \end{aligned}$$
(2.3)

If \(y=0,\) then (2.2) holds trivially. If \(y\ne 0,\) then \(|y|^2= y\cdot y > 0\) and therefore by cancelling the positive scalar \(|y|^2\) from (2.3), we obtain (2.2).

Suppose there exist real numbers a and b not both zero such that \(ax +by=0.\) Without loss of generality, we assume that \(a\ne 0.\) Then with \(x=-by/a,\) the equality in (2.2) holds true.

Conversely, assume that the equality holds in (2.2). Using that in (2.3), we obtain \( z\cdot z =0,\) which implies \(z=0.\) Hence \((y\cdot y)x=(x\cdot y) y.\) If \(y\ne 0,\) then \(y\cdot y\ne 0,\) so this is a nontrivial linear relation between x and y. Otherwise \(y=0\) and \(0\cdot x+1\cdot y=0.\) In either case, x and y are linearly dependent.    \(\square \)
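Inequality (2.2) and its equality case can be illustrated numerically. The following sketch samples random vectors in \(\mathbb {R}^5\) and then compares a linearly dependent pair with an orthogonal pair. The helper names `dot`, `modulus` and `cs_gap`, the sample size and the rounding tolerance are all assumptions made for illustration.

```python
import math
import random

def dot(x, y):
    """Dot product x . y."""
    return sum(a * b for a, b in zip(x, y))

def modulus(x):
    """Modulus |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def cs_gap(x, y):
    """|x||y| - |x . y|, which (2.2) says is never negative."""
    return modulus(x) * modulus(y) - abs(dot(x, y))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    assert cs_gap(x, y) >= -1e-12          # small tolerance for rounding

x = [1.0, 2.0, 2.0]
print(cs_gap(x, [-3.0 * t for t in x]))    # dependent vectors: 0.0
print(cs_gap(x, [2.0, -1.0, 0.0]))         # orthogonal vectors: 3 * sqrt(5)
```

The gap vanishes for the dependent pair and is strictly positive for the orthogonal pair, in line with the equality statement of Theorem 2.9.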

Corollary 2.10

(Minkowski’s inequality) For every \((x_1,\dots , x_n), (y_1,\dots , y_n)\in \mathbb {R}^n,\) we have 

$$\begin{aligned} \sqrt{(x_1+y_1)^2+\dots +(x_n+y_n)^2}\le \sqrt{x_1^2+\dots +x_n^2}+\sqrt{y_1^2+\dots +y_n^2}. \end{aligned}$$
(2.4)

Proof

Both sides of (2.4) are nonnegative, so (2.4) is equivalent to the inequality obtained by squaring them. After canceling the common terms, this reduces to \(x\cdot y\le |x||y|,\) which follows from (2.2). Hence the result.   \(\square \)
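For readers who want the computation behind the proof of Corollary 2.10 spelled out, here is one way to write it, using the notation \(|x|\) and \(x\cdot y\) introduced earlier, so that (2.4) reads \(|x+y|\le |x|+|y|:\)

$$\begin{aligned} |x+y|^2&=(x+y)\cdot (x+y)=|x|^2+2\, x\cdot y+|y|^2,\\ (|x|+|y|)^2&=|x|^2+2|x||y|+|y|^2. \end{aligned}$$

Comparing the two right-hand sides, \(|x+y|^2\le (|x|+|y|)^2\) holds exactly when \(x\cdot y\le |x||y|,\) and this follows from (2.2) because \(x\cdot y\le |x\cdot y|.\) Taking square roots of the nonnegative quantities then gives (2.4).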

Corollary 2.11

(Euclidean metric) For every \(x=(x_1,\dots , x_n), y=(y_1,\dots , y_n)\in \mathbb {R}^n,\) define 

$$d_2\big (x,y\big ):=\sqrt{|x_1-y_1|^2+\dots +|x_n-y_n|^2}.$$

Then \((\mathbb {R}^n,d_2)\) is a metric space.

Proof

Applying Corollary 2.10 to the vectors \(x-z\) and \(z-y,\) whose sum is \(x-y,\) we see that \(d_2\) satisfies the triangle inequality. The positive definiteness and symmetry of \(d_2\) are obvious from its definition.    \(\square \)

The above \(d_2\) is known as the usual metric or the Euclidean metric on \(\mathbb {R}^n.\) For convenience, we write metric space \(\mathbb {R}^n\) for the metric space \((\mathbb {R}^n,d_2).\) We also write \(|x-y|\) for \(d_2(x,y).\)
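As a small illustration, the Euclidean metric can be computed directly from its definition. The sketch below defines a hypothetical helper `d2` and compares it with `math.dist` from the Python standard library (available from Python 3.8); the three sample points are arbitrary choices.

```python
import math

def d2(x, y):
    """Euclidean metric of Corollary 2.11 for points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x, y, z = (0.0, 0.0), (3.0, 4.0), (3.0, 0.0)
print(d2(x, y))                                   # 5.0
print(d2(x, z) + d2(z, y))                        # 7.0, so d2(x, y) <= d2(x, z) + d2(z, y)
print(math.isclose(d2(x, y), math.dist(x, y)))    # True
```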

We wind up this section with the space of complex numbers. Various other examples of metric spaces will be discussed in the exercises.

Definition 2.12

The set of complex numbers \(\mathbb {C}\) is defined to be the two-dimensional Euclidean space \(\mathbb {R}^2,\) along with an additional multiplication operation given by

$$(x_1,x_2)\times (y_1,y_2):=(x_1y_1-x_2y_2, x_1y_2+x_2y_1)\text { for all }(x_1,x_2), (y_1,y_2)\in \mathbb {C}.$$

It is conventional to denote (0, 1) by i and (a, b) by \(a+ib.\)

Remark 2.13

Note that the usual metric on \(\mathbb {R}^2\) provides a metric on \(\mathbb {C},\) also known as the usual metric on \(\mathbb {C}.\) Therefore, topologically \(\mathbb {C}\) and \(\mathbb {R}^2\) are the same. The product in \(\mathbb {C}\) makes functions on \(\mathbb {C}\) quite different from those on \(\mathbb {R}^2,\) which leads to the Cauchy theory of complex analysis. However, that is not the concern of this textbook. We limit our discussion to the basic algebraic and topological properties of \(\mathbb {C}.\)

History Notes 2.14

The concept of metric spaces was introduced by Fréchet, under the name ‘classes (E)’, in his 1906 Ph.D. dissertation. Later Hausdorff coined the term metric space in 1914 and laid the foundations of topology (see [1, p. 253]).

2.1.2 Balls and Bounded Sets

Definition 2.15

Let (X, d) be a metric space, \(x\in X \) and \(r > 0.\) The ball of radius r centered at x is defined as

$$B_d(x; r):=\{y\in X:d(y,x)<r\}.$$

These balls are also called open balls. If there is no ambiguity about the metric, we simply write \(B(x;r)\) instead of \(B_d(x;r).\) Note that we do not allow balls with radius zero.

Examples 2.16

  1. (a)

    Under the usual metric on reals, the open balls are open intervals. In particular, for all \( x\in \mathbb {R}\text { and }r>0,\) we have \(B(x;r)=(x-r,x+r).\)

  2. (b)

    Let (X, d) be a discrete metric space, \(x\in X\) and \(r>0.\) Then

    $$\begin{aligned} B(x;r)=\left\{ \begin{array}{lll} &{} X &{}; r>1,\\ &{} \{x\} &{}; 0 < r \le 1. \end{array}\right. \end{aligned}$$
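The balls of Examples 2.16 can be listed explicitly when the underlying set is finite. The sketch below is an illustration under assumptions: the helper `ball`, the four-letter set and the grid of reals are hypothetical choices, and on the grid the ball is only the part of \((x-r,x+r)\) that lies in the grid.

```python
def ball(points, d, center, r):
    """Open ball B(center; r) inside the finite collection `points`."""
    if r <= 0:
        raise ValueError("the radius must be positive")
    return {y for y in points if d(y, center) < r}

def discrete(x, y):
    """The discrete metric."""
    return 0 if x == y else 1

X = set("abcd")
print(ball(X, discrete, "a", 0.5) == {"a"})   # True
print(ball(X, discrete, "a", 1.0) == {"a"})   # True
print(ball(X, discrete, "a", 1.5) == X)       # True

# Usual metric on a finite grid of reals between -2 and 2 with step 1/4.
grid = [k / 4 for k in range(-8, 9)]
print(sorted(ball(grid, lambda s, t: abs(s - t), 0.0, 0.5)))   # [-0.25, 0.0, 0.25]
```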

Example 2.17

Let (X, d) be a metric space such that d is an ultrametric on X, i.e.

$$d(x,y)\le \max \{d(x,z),d(y,z)\}\text { for all }x,y,z\in X.$$

If \(x,y,z\in X\) and \(r,s>0\) are arbitrary, then X satisfies the following properties:

  1. (a)

    Every triangle in X is isosceles, i.e. if \(d(x,y)\ne d(y,z),\) then \(d(z,x)\) is equal to either \(d(x,y)\) or \(d(y,z).\)

  2. (b)

    Every point inside a ball is its center, i.e. \(B(x;r)=B(y;r)\text { for all }y\in B(x;r).\)

  3. (c)

    If two balls meet, then one is contained in the other; i.e. if \(B(x;r)\cap B(y;s)\ne \emptyset ,\) then \(\text{ either } B(x;r)\subset B(y;s) \text{ or } B(y;s)\subset B(x;r).\)

Proof

  1. (a)

    Without loss of generality, we assume that \(d(x,y) < d(y,z).\) Then \(d(y,z)=d(z,x),\) as

    $$\begin{aligned} d(z,x)\le & \max \{ d(z,y), d(y,x)\}= d(y,z) \\ \text { and }d(y,z) \le & \max \{d(y,x), d(x,z)\} =d(z,x). \end{aligned}$$
  2. (b)

    Suppose \(d(y,x)<r.\) If \(z\in B(x;r),\) then \(d(y,z) \le \max \{d(y,x), d(x,z)\} <r\) and hence \(z\in B(y;r).\) So \(B(x;r)\subset B(y;r).\) Interchanging y and x,  we obtain \(B(x;r)= B(y;r).\)

  3. (c)

    Without loss of generality, suppose \(r\le s\) and let \(z\in B(x;r)\cap B(y;s).\) By (b), we conclude that \(B(x;r)=B(z;r)\subset B(z;s) = B(y;s).\)    \(\square \)
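The three properties of Example 2.17 can be observed in a concrete ultrametric space. The sketch below uses the 2-adic distance on the integers 0, ..., 31, that is, the p-adic metric of Exercise 2.30 restricted to a finite window of integers (an assumption made to keep the check finite). The names `padic`, `ball` and `two_largest_equal` and the chosen radii are hypothetical.

```python
from itertools import product

def padic(p):
    """Return the p-adic distance on the integers."""
    def d(x, y):
        if x == y:
            return 0.0
        n, k = abs(x - y), 0
        while n % p == 0:          # k = exponent of p in x - y
            n //= p
            k += 1
        return float(p) ** (-k)
    return d

d = padic(2)
X = range(32)                      # assumption: a finite window of integers

def ball(x, r):
    """Open ball B(x; r) inside X."""
    return frozenset(y for y in X if d(y, x) < r)

def two_largest_equal(a, b, c):
    """In an isosceles triangle of this kind the two longest sides agree."""
    s = sorted((a, b, c))
    return s[1] == s[2]

# (a) every triangle is isosceles
isosceles = all(two_largest_equal(d(x, y), d(y, z), d(z, x))
                for x, y, z in product(X, repeat=3))
# (b) every point of a ball is a center of that ball
centers = all(ball(y, 0.5) == ball(x, 0.5) for x in X for y in ball(x, 0.5))
# (c) two balls that meet are nested
radii = (0.25, 0.5, 1.0)
nested = all(not (A & B) or A <= B or B <= A
             for A in (ball(x, r) for x in X for r in radii)
             for B in (ball(y, s) for y in X for s in radii))
print(isosceles, centers, nested)  # True True True
```

The check in (a) tests that the two longest sides of each triangle are equal, which implies the formulation in Example 2.17(a).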

Definition 2.18

A subset E of a metric space X is called bounded if it is contained in some ball.  That is, \(E\subset B(x;r)\text { for some }x\in X\) and \(r>0.\)

Therefore, E is bounded if and only if the set of distances between points of E is bounded above. Analogous to open balls, the closed balls are defined as follows:

Definition 2.19

Let (X, d) be a metric space, \(x\in X \) and \(r > 0.\) The closed ball of radius r centered at x is defined as \(B[x; r]:=\{y\in X:d(y,x)\le r\}.\)

Exercise 2.1

For a metric space (X, d), prove that the following are equivalent:

  1. (a)

    d is a constant,

  2. (b)

    X is a singleton set and

  3. (c)

    \( d(x,y)\ge d(x,z)+d(z,y) \text { for all }x,y,z\in X.\)

Exercise 2.2

If d is a metric on a space X,  prove that so is \(\sqrt{d}.\)

Exercise 2.3

Let d be a metric on a space X and \(x,y,z\in X.\) Prove the inequality \(|d(x,y)-d(y,z)|\le d(x,z).\)

Exercise 2.4

Does \((x,y)\longmapsto \big |\frac{1}{x}-\frac{1}{y}\big |\) define a metric on \(\mathbb {R}\setminus \{0\}?\)

Exercise 2.5

Does any of the following expressions define a metric on \(\mathbb {R}:\)

$$|x^2-y^2|, |x-y|+1 \text{ or } \frac{1}{|x-y|+1}?$$

Exercise 2.6

Prove that \((x,y)\longmapsto |x-y|+|x^2-y^2|\) defines a metric on \(\mathbb {R}.\)

Exercise 2.7

Let X be a nonempty set and \(d:X\times X\longrightarrow \mathbb {R}\) be a function such that for all \(x,y,z\in X,\)

$$\begin{aligned} d(x,y)&\le d(x,z)+d(y,z)\\ \text { and }d(x,y)&= 0 \text{ if } \text{ and } \text{ only } \text{ if } x=y. \end{aligned}$$

Prove that \(d(x,y)\ge 0\) and \(d(x,y)=d(y,x)\text { for all }x,y\in X.\)

Exercise 2.8

Deduce the triangle inequality in \((\mathbb {R}^n,d_2)\) from Corollary 2.10.

Exercise 2.9

For any \((x_1,x_2),(y_1,y_2)\in \mathbb {R}^2,\) define

  1. (a)

    \(d_1\big ((x_1,x_2),(y_1,y_2)\big ):=|x_1-y_1|+|x_2-y_2|\) (the taxi cab metric). 

  2. (b)

    \(d_\infty \big ((x_1,x_2),(y_1,y_2)\big ):=\max \{|x_1-y_1|,|x_2-y_2|\}.\) (the sup metric). 

Prove that \(d_1\) and \(d_\infty \) are metrics on \(\mathbb {R}^2.\)

Exercise 2.10

Generalize the metrics \(d_1\) and \(d_\infty \) of Exercise 2.9 to \(\mathbb {R}^n,\) and characterize the collection of balls in \(\mathbb {R}^n\) with respect to these metrics.

Exercise 2.11

Let (X, d) be a metric space and \(E\subset X.\) Prove that the following are equivalent:

  1. (a)

    E is bounded,

  2. (b)

    there exists some \(M>0\) such that \(d(x,y)<M,\) for every \(x,y\in E,\)

  3. (c)

    for any \(x\in X,\) there exists \(M_x>0\) such that \(d(y,x)<M_x\text { for all }y\in E.\)

Exercise 2.12

Characterize bounded subsets of discrete metric spaces.

Exercise 2.13

If A and B are bounded subsets of a metric space X,  prove that so is \(A\cup B.\)

Exercise 2.14

Let X be a metric space and \(A\subset X.\) Prove that A is bounded if and only if the diameter of A is finite, i.e. \(\sup \{d(x,y):x,y\in A\}<\infty .\)

Exercise 2.15

Let (X, d) be a metric space and \(\rho \) be a pseudo-metric on X. Prove that \(d+\rho \) is a metric on X.

Exercise 2.16

Let X be a nonempty set and \(\rho _1,\dots ,\rho _n\) be (pseudo-)metrics on X. Prove that \(\rho _1+\dots +\rho _n\) is also a (pseudo-)metric on X.

Exercise 2.17

Let d be a pseudo-metric on a space X. Define a relation \(\sim \) on X as

$$x\sim y \text{ if } \text{ and } \text{ only } \text{ if } d(x,y)=0.$$

Prove that \(\sim \) is an equivalence relation on X. For each \(x\in X,\) let [x] denote the equivalence class of x under this relation and \(X^*:=\{[x]:x\in X\}.\) Prove that \(d^*\) is a metric on \(X^*,\) where \(d^*([x],[y]):=d(x,y)\text { for all }[x],[y]\in X^*.\)

Exercise 2.18

Let (X, d) be a metric space. For every \(x,y\in X,\) define

$$\rho _1(x,y):=\min \{1,d(x,y)\} \text{ and } \rho _2(x,y):=\frac{d(x,y)}{1+d(x,y)}.$$

Prove that both \(\rho _1\) and \(\rho _2\) are metrics on X. Further show that every subset of X is bounded in \((X,\rho _1)\) as well as in \((X,\rho _2).\)

Exercise 2.19

Prove the Cauchy-Schwarz inequality in \(\mathbb {R}^2,\) as follows:

  1. (a)

    Let \(a\ge 0\) and \(p(t):=at^2+bt+c.\) If \(p(t)\ge 0\text { for all }t\in \mathbb {R},\) prove that \(b^2\le 4ac.\)

  2. (b)

    Let \((x_1,x_2), (y_1,y_2)\in \mathbb {R}^2.\) Applying (a) with \(p(t):=(tx_1+y_1)^2+(tx_2+y_2)^2,\) prove that

    $$|x_1y_1+x_2y_2|\le \sqrt{x_1^2+x_2^2}\sqrt{y_1^2+y_2^2}.$$

Exercise 2.20

Let X denote the family of bounded real valued functions on the interval [0, 1] and \(d(f,g):=\sup \big \{|f(x)-g(x)|:x\in [0,1]\big \}\text { for all }f,g\in X.\) Prove that d is a metric on X.

Exercise 2.21

Let (X, d) be as in Exercise 2.20. If \(f\in X\) and \(r>0,\) prove that \(B(f;r)\) is the family of all those functions in X whose graphs lie in a band of width r about the graph of f.

Exercise 2.22

In \((\mathbb {R}^n,d_\infty ),\) prove that the open balls look like hypercubes. In other words, \(B(x;r)=(x_1-r, x_1+r)\times \dots \times (x_n-r, x_n+r)\text { for all }x:=(x_1,\dots ,x_n)\in \mathbb {R}^n\) and \(r> 0.\)

Exercise 2.23

(Post office metric) Let \(p\in \mathbb {R}^2\) be a fixed point and \(d_2\) be the Euclidean metric on \(\mathbb {R}^2.\) Prove that d defines a metric on \(\mathbb {R}^2,\) where

$$d(a,b):=d_2(a,p)+d_2(p,b)\text { for all }a,b\in \mathbb {R}^2.$$

Exercise 2.24

Let \((X_1,\rho _1),\dots , (X_n,\rho _n)\) denote a finite family of metric spaces.  For every \(x=(x_1,\dots ,x_n), y=(y_1,\dots ,y_n)\in \prod _{i=1}^n X_i,\) define

$$\rho (x,y) := \sqrt{\rho ^2_1(x_1,y_1)+ \dots + \rho ^2_n(x_n,y_n)}.$$

Prove that \(\rho \) is a metric on the Cartesian product \( \prod _{i=1}^n X_i.\)

Exercise 2.25

Let d be a metric on \(\mathbb {R}^n\) and \((X_1,\rho _1),\dots , (X_n,\rho _n)\) be any finitely many metric spaces. For any \(x=(x_1,\dots ,x_n), y=(y_1,\dots ,y_n)\in \prod _{i=1}^n X_i,\) define

$$\rho _d(x,y) :=d\big (\big (\rho _1(x_1,y_1), \dots , \rho _n(x_n,y_n)\big ),(0,\dots ,0)\big ).$$

Prove that \(\rho _d\) is a metric on the Cartesian product \( \prod _{i=1}^n X_i.\)

Exercise 2.26

Let \(\{(X_n,d_n):n\in \mathbb {N}\}\) be a collection of metric spaces such that \(d_n\le 1\text { for all }n\in \mathbb {N}.\)  Let X denote the Cartesian product \(\prod _{n=1}^\infty X_n,\) that is, the family of sequences \(\{x_n\}\) such that \(x_n\in X_n\text { for all }n\in \mathbb {N}.\) For every \(x=\{x_n\}, y=\{y_n\}\in X,\) define

$$\begin{aligned} \rho (x,y) := \sup \bigg \{\frac{d_n(x_n,y_n)}{n} : n\in \mathbb {N}\bigg \} \text { and }\eta (x,y) := \sum _{n=1}^\infty \frac{d_n(x_n,y_n)}{2^n}. \end{aligned}$$

Prove that both \(\rho \) and \(\eta \) are metrics on X.

Exercise 2.27

Let \(\{(X_n,d_n):n\in \mathbb {N}\}\) be a collection of metric spaces and \(X:=\prod _{n=1}^\infty X_n.\)  For any \(x=\{x_n\}, y=\{y_n\}\in X,\) define

$$d(x,y) := \sum _{n=1}^\infty \frac{1}{2^n}\cdot \frac{d_n(x_n,y_n)}{1+d_n(x_n,y_n)}.$$

Prove that d is a metric on X. Also provide three other metrics on X.

Exercise 2.28

Let \(n\in \mathbb {N}\cup \{0\}, X\) be the set of polynomials with degree less than or equal to n and \(p^{(i)}\) be the \(i^{th}\) derivative of \(p\text { for every }p\in X.\) For each \(k\in \mathbb {N},\) define

$$d_k(p,q):=\max \{|p^{(i)}(0)-q^{(i)}(0)|:1\le i< k\}\text { for all }p,q\in X.$$

Obtain a necessary and sufficient condition in terms of k and n such that \(d_k\) is a metric on X.

Exercise 2.29

Let (X, d) be a metric space. For every \(x\in X,\) define a map \(\delta _x:X\longrightarrow \mathbb {R}\) as \(\delta _x(y):=d(x,y)\text { for all }y\in X.\) Let \(\delta (X):=\{\delta _x:x\in X\}.\) Prove that the map \(x\longmapsto \delta _x\) is a bijection between X and \(\delta (X).\)

Exercise 2.30

(p-adic metric) Fix a prime number p. Let \(x,y\in \mathbb {Q}\) be arbitrary. If \(x=y,\) define \(d(x,y):=0.\) Otherwise, write \(x-y=p^ka/b,\) where \(a,k\in \mathbb {Z}\) and \(b\in \mathbb {N}\) are such that p does not divide ab, and define \(d(x,y):=p^{-k}.\) Prove that d is an ultrametric on \(\mathbb {Q}.\)

Exercise 2.31

Let \(\mathcal {I}\) denote the collection of closed bounded intervals. Define

$$d\big ([a,b],[c,d]\big ):=\max \big \{|a-c|,|b-d|\big \}\text { for all }[a,b],[c,d]\in \mathcal {I}.$$

Prove that d is a metric on \(\mathcal {I}.\)

Exercise 2.32

Does there exist a metric on the space of extended reals \(\mathbb {R}\cup \{-\infty , +\infty \},\) which extends the usual metric on \(\mathbb {R}?\)

Exercise 2.33

Let \(\infty \) denote the (unique) infinity for the set of complex numbers and \(\mathbb {C}_{\infty }:=\mathbb {C}\cup \{\infty \}.\) Is there a metric on \(\mathbb {C}_{\infty },\) that extends the usual metric on \(\mathbb {C}?\)

Exercise 2.34

Let (X, d) be a metric space and \(y\notin X.\) Does there always exist a metric on \(X\cup \{y\},\) which extends the metric d?

Exercise 2.35

Does there exist a metric space with two closed balls \(B_1\) and \(B_2 \) of radii \(r_1\) and \(r_2,\) respectively, such that \(B_1\subset B_2\) and \(r_1>r_2?\)

2.2 Convergence in Metric Spaces

Analogous to the case of \(\mathbb {R},\) the notions of convergent sequences and Cauchy sequences, in general metric spaces, are defined as follows:

Definition 2.20

A sequence \(\{x_n\}\) in a metric space (X, d) is said to be convergent in X if there exists some \(x_0\in X\) satisfying the following condition:

for every  \(\epsilon >0,\)  there exists some  \(N\in \mathbb {N}\) such that \(d(x_n,x_0)<\epsilon \text { for all }n\ge N.\)

In this case, we say that \(\{x_n\}\) converges to \(x_0\) and write \(x_n\longrightarrow x_0.\) We also call \(x_0\) the limit of \(\{x_n\}\) and write \(x_0=\lim _{n\rightarrow \infty }x_n.\)
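To see Definition 2.20 at work, one can exhibit an explicit N for each \(\epsilon \) in a concrete case. The sketch below uses an assumed example that is not from the text: the sequence \(x_n=(1/n,1-1/n)\) in \(\mathbb {R}^2\) with the Euclidean metric and candidate limit \(x_0=(0,1),\) for which the distance from \(x_n\) to \(x_0\) equals \(\sqrt{2}/n.\) The helper `witness_N` is hypothetical.

```python
import math

def x(n):
    """Assumed example sequence in R^2: x_n = (1/n, 1 - 1/n)."""
    return (1 / n, 1 - 1 / n)

x0 = (0.0, 1.0)

def witness_N(eps):
    """An N with sqrt(2)/n < eps for all n >= N."""
    return math.floor(math.sqrt(2) / eps) + 1

for eps in (0.5, 0.1, 0.01):
    N = witness_N(eps)
    # Spot-check finitely many terms; the bound sqrt(2)/n covers the rest.
    assert all(math.dist(x(n), x0) < eps for n in range(N, N + 1000))
    print(eps, N)
```

The loop can only inspect finitely many terms; it is the closed-form bound \(\sqrt{2}/n<\epsilon \) that establishes the condition for all \(n\ge N.\)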

Definition 2.21

If \(x\in X, \) a subset U of X is said to be a neighborhood of x if

$$U\supset B(x;\delta )\text { for some }\delta >0.$$

It is immediate that \(x_n\longrightarrow x\) if and only if every neighborhood of x contains all but finitely many terms of \(\{x_n\}.\)

Definition 2.22

A sequence \(\{x_n\}\) in a metric space (X, d) is said to be Cauchy if for every \(\epsilon >0,\) there exists some \(N\in \mathbb {N}\) such that

$$d(x_{n},x_{m})<\epsilon \text { for all }n,m\ge N.$$

Subsequences of a sequence in any space are defined naturally, as in Definition 1.12. Various results on metric spaces can be proven analogously to the case of \(\mathbb {R}.\) Here we present some sample cases. Several other analogous results will be provided in Exercise 2.36.

Theorem 2.23

In metric spaces, convergent sequences have unique limits. 

Proof

If possible, let \(\{x_n\}\) be a convergent sequence in a metric space (X, d) with limits \(x'\) and \(x''\) such that \(x'\ne x''.\) Let \(\epsilon = {d(x',x'')}/{2}.\) Then \(\epsilon >0,\) as \(x'\ne x''.\) Since \(\{x_n\}\) converges to \( x'\) and \(x'',\) there are positive integers \(N'\) and \(N''\) such that

$$\begin{aligned} d(x_{n},x')&<\frac{\epsilon }{2}\text { for all }n \ge N' \\ \text { and }d(x_{n},x'')&<\frac{\epsilon }{2}\text { for all }n \ge N''. \end{aligned}$$

Let \(N:=\max \{N',N''\}.\) Then for all \(n\ge N,\) we obtain

$$\begin{aligned} d(x',x'')\le d(x',x_n)+d(x_n,x'') <\frac{\epsilon }{2}+\frac{\epsilon }{2}=\epsilon =\frac{d(x',x'')}{2}, \end{aligned}$$

which is absurd, as \(d(x',x'')>0.\) This completes the proof.    \(\square \)

Analogous to the case of reals, in any metric space, a Cauchy sequence is convergent if it has a convergent subsequence.

Theorem 2.24

Let \(\{x_n\}\) be a Cauchy sequence in a metric space \((X,d), x\in X\) and \(\{x_{n_k}\}\) be a subsequence of \(\{x_n\}\) such that \( \lim _{k\rightarrow \infty } x_{n_k}=x.\) Then \(x_n\longrightarrow x.\)

Proof

Imitating the proof of Proposition 1.27, for every \(\epsilon >0,\) there exist some \(N, K\in \mathbb {N}\) such that

$$\begin{aligned} d(x_{n},x_{m})<& \frac{\epsilon }{2}\text { for all }n, m \ge N \\ \text { and }d(x_{n_k},x)<& \frac{\epsilon }{2} \text { for all }k \ge K. \end{aligned}$$

Let \(p\in \mathbb {N}\) be such that \(n_p\ge \max \{N,n_{K}\};\) in particular, \(p\ge K.\) Then for all \(n\ge n_{p},\) we have

$$\begin{aligned} d(x_{n},x)\le d(x_{n},x_{n_p})+ d(x_{n_p},x)<\frac{\epsilon }{2}+\frac{\epsilon }{2}=\epsilon . \end{aligned}$$

Hence \(\{x_n\}\) converges to x.    \(\square \)

Now we discuss convergence in Euclidean spaces. Note that the m-dimensional Euclidean space \(\mathbb {R}^m\) has a natural bijection with the collection of functions from \(\{1,\dots , m\}\) into \(\mathbb {R}.\) Motivated by this and for the sake of convenience, we write \(x:=\big (x(1),\dots ,x(m)\big )\text { for every }x\in \mathbb {R}^m.\)

Theorem 2.25

Let \(\{x_n\}\) be a sequence in \(\mathbb {R}^m\) and \(x_0\in \mathbb {R}^m\) such that 

$$ x_n:=\big (x_{n}(1),\dots ,x_{n}(m)\big )\text { for all }n\in \mathbb {N}\cup \{0\}.$$

Then

  1. (a)

    \(x_n\longrightarrow x_0\) if and only if \(x_{n}(j)\longrightarrow x_{0}(j)\text { for every }j=1,\dots ,m.\)

  2. (b)

    \(\{x_n\}\) is Cauchy if and only if \(\{x_{n}(j)\}\) is Cauchy, for every \(j=1,\dots ,m.\)

Proof

We shall prove the first part. The second one is similar. Note that for every \((a(1),\dots , a(m))\in \mathbb {R}^m\) and \(j=1,\dots , m,\) we have

$$ |a(j)|\le \sqrt{\sum _{k=1}^m |a(k)|^2} \le \sum _{k=1}^m |a(k)|.$$

Hence for every \( j=1,\dots ,m\) and for every \(n\in \mathbb {N},\) we have

$$ \big |x_{n}(j)- x_{0}(j)\big |\le d_2(x_n,x_0)\le \sum _{k=1}^m \big |x_{n}(k)- x_{0}(k)\big |.$$

Let \(\epsilon >0\) be given. If \(x_n\longrightarrow x_0,\) there exists some \(N\in \mathbb {N}\) such that \(d_2(x_n,x_0)<\epsilon \text { for all }n\ge N.\) Hence for every \(j=1,\dots ,m\) and for every \(n\ge N,\) we obtain

$$ \big |x_{n}(j)- x_{0}(j)\big |\le d_2(x_n,x_0)<\epsilon .$$

This proves that \(x_{n}(j)\longrightarrow x_{0}(j)\text { for every }j=1,\dots ,m.\)

Conversely, if \(x_{n}(j)\longrightarrow x_{0}(j),\) for all \( j=1,\dots ,m,\) there exist \(N_j\in \mathbb {N}\) such that

$$|x_{n}(j)- x_{0}(j)|<\frac{\epsilon }{m}\text { for all }n\ge N_j.$$

Let \(N_0:=\max \{N_1,\dots , N_m\}.\) Then for every \(n\ge N_0,\) we obtain

$$ d_2(x_n,x_0)\le \sum _{j=1}^m \big |x_{n}(j)- x_{0}(j)\big |<\sum _{j=1}^m \frac{\epsilon }{m}=\epsilon .$$

Hence \(\{x_n\}\) is convergent to \(x_0.\)    \(\square \)
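The two-sided estimate used in the proof of Theorem 2.25 is easy to test numerically. The sketch below samples random pairs of points in \(\mathbb {R}^4\) and checks that each coordinate difference is at most the Euclidean distance, which in turn is at most the sum of the coordinate differences. The helper `d2`, the dimension, the sample size and the rounding tolerance are assumptions made for illustration.

```python
import math
import random

def d2(x, y):
    """Euclidean metric on R^m for points given as sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

random.seed(1)
m = 4
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(m)]
    y = [random.uniform(-10, 10) for _ in range(m)]
    coord = [abs(a - b) for a, b in zip(x, y)]
    # |x(j) - y(j)| <= d_2(x, y) for every j, checked through the largest one
    assert max(coord) <= d2(x, y) + 1e-12
    # d_2(x, y) <= sum over k of |x(k) - y(k)|
    assert d2(x, y) <= sum(coord) + 1e-12
```

The left inequality explains why convergence in \(\mathbb {R}^m\) forces coordinatewise convergence, and the right one gives the converse.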

Theorem 2.26

Every Cauchy sequence in \(\mathbb {R}^m\) is convergent in \(\mathbb {R}^m.\)

Proof

Let \(\{x_n\}\) be a Cauchy sequence in \(\mathbb {R}^m.\) Applying Theorem 2.25, \(\{x_{n}(j)\}\) is also Cauchy, for every \(j=1,\dots ,m.\) Now Theorems 1.28 and 2.25 ensure that \(\{x_n\}\) is convergent in \(\mathbb {R}^m.\)    \(\square \)

Now we generalize the Bolzano-Weierstrass property, already proved for sequences of real numbers in Theorem 1.22.

Theorem 2.27

(Bolzano-Weierstrass) Every bounded sequence in \(\mathbb {R}^m\) contains a subsequence that converges in \(\mathbb {R}^m.\) 

Proof

Let \(\{x_n\}\) be a bounded sequence in \(\mathbb {R}^m.\) Write \(x_n:=\big (x_n(1),\dots ,x_n(m)\big )\) for all \(n\in \mathbb {N}.\) As \(\{x_n\}\) is bounded and \(|x_n(j)|\le d_2(x_n,0),\) for all j,  the sequence \(\{x_n(j)\}\) is bounded.

Since \(\{x_n(1)\}\) is a bounded sequence of reals, by Theorem 1.22, it has a convergent subsequence. Let \(\{x_{n_{k_1}}(1)\}\) be that subsequence and x(1) be its limit. This gives us a subsequence \(\{x_{n_{k_1}}\}\) of the original sequence. As before, we obtain a subsequence \(\{x_{n_{k_2}}(2)\}\) of \(\{x_{n_{k_1}}(2)\},\) which is convergent to some real x(2). Since \(\{x_{n_{k_2}}(1)\}\) is a subsequence of \(\{x_{n_{k_1}}(1)\},\) it still converges to x(1). Continuing like this m times, we obtain a subsequence \(\{x_{n_{k_m}}\}\) of \(\{x_n\}\) such that \(x_{n_{k_m}}(j)\longrightarrow x(j)\text { for every }j=1,\dots , m.\)

Let \(x:=\big (x(1),\dots ,x(m)\big ).\) Then \(x \in \mathbb {R}^m\) and by Theorem 2.25, we conclude that   \(x_{n_{k_m}}\longrightarrow x.\) Hence the result.   \(\square \)

Exercise 2.36

In any metric space, prove that the following assertions hold:

  1. (a)

    Every convergent sequence is bounded.

  2. (b)

    Every convergent sequence is Cauchy.

  3. (c)

    Every Cauchy sequence is bounded.

  4. (d)

    Every subsequence of a Cauchy sequence is also a Cauchy sequence.

  5. (e)

    All subsequences of a convergent sequence are convergent to the same limit.

  6. (f)

    Removing (inserting) any finite number of terms anywhere from (in) a sequence does not affect its convergence.

Exercise 2.37

In Exercise 2.36, show that the converse statements of (a), (b), and (c) are not true, in general.

Exercise 2.38

Let \(\{x_n\}\) be a sequence in a metric space (X, d) and \(x\in X.\) Prove that the following are equivalent:

  1. (a)

    \(x_n\longrightarrow x,\)

  2. (b)

    \(d(x_n,x)\longrightarrow 0,\) and

  3. (c)

    For every neighborhood U of x, there exists a positive integer \(N_U\) such that \(x_n\in U \text { for all }n>N_U.\)

Exercise 2.39

Characterize convergent sequences in discrete metric spaces.

Exercise 2.40

Let \(a_n \longrightarrow a\) and \(b_n \longrightarrow b,\) in a Euclidean space \(\mathbb {R}^m.\) Prove that

  1. (a)

    \(\{ka_n\}\longrightarrow ka, \) for all scalars \(k\in \mathbb {R}.\)

  2. (b)

    \(\{a_n+ b_n\}\longrightarrow a+b,\) and

  3. (c)

    \(\{a_n\cdot b_n\}\longrightarrow a\cdot b,\) where \(x\cdot y\) represents the dot product of \(x,y\in \mathbb {R}^m.\)

Exercise 2.41

Let \(a_n \longrightarrow a\) in \(\mathbb {R}^m\) and \(b_n \longrightarrow b\) in \(\mathbb {R}.\) Prove that \(a_nb_n\longrightarrow ab.\)

Exercise 2.42

Let \(a_n \longrightarrow 0\) in \(\mathbb {R}^m\) and \(\{b_n\}\) be a bounded sequence of real numbers. Prove that \(a_nb_n\longrightarrow 0.\)

Exercise 2.43

Let \(a_n \longrightarrow a\) in \(\mathbb {R}^m.\) Prove that \(|a_n|\longrightarrow |a|.\) Is the converse true?

Exercise 2.44

Let \(\{a_n\}\) and \(\{b_n\}\) be two Cauchy sequences in \(\mathbb {R}^m.\) Prove that

  1. (a)

    \(\{ka_n\}\) is a Cauchy sequence, for all scalars \(k\in \mathbb {R}.\)

  2. (b)

    \(\{a_n+b_n\}\) is a Cauchy sequence.

Exercise 2.45

In discrete metric spaces, prove that

  1. (a)

    convergent sequences are eventually constant, 

  2. (b)

    Cauchy sequences are eventually constant, and

  3. (c)

    Cauchy sequences are convergent.

Exercise 2.46

Write a proof for the second part of Theorem 2.25.

Exercise 2.47

Write an alternate proof of Theorem 2.26, using Exercise 2.36(c), Theorems 2.27 and 2.24.

Exercise 2.48

Let X be a metric space, \(x\in X\) and \(\{x_n\}\) be a sequence in X. If every subsequence of \(\{x_n\}\) has a subsequence convergent to x,  prove that \(x_n\longrightarrow x.\)

Exercise 2.49

Let X be a metric space containing two points x and y. If \(x_n\longrightarrow x\) and \(y_n\longrightarrow y\) in X,  then prove that the set \(\{x_n:n\in \mathbb {N}\}\cap \{y_n:n\in \mathbb {N}\}\) is finite.

2.3 Normed Linear Spaces

The notion of metric spaces generalizes the space of real numbers, by extending the distance function. Now we discuss normed linear spaces, which also extend the addition and scalar multiplication operations from finite-dimensional Euclidean spaces, along with the distance.

We assume that the reader is familiar with the notion of vector spaces. A few subsequent results will require the notions of algebraic basis and subspace of a vector space. All vector spaces in this book will be considered over the scalar fields \(\mathbb {R}\) or \(\mathbb {C}.\)

Let X be a linear (vector) space over the field \(\mathbb {R}\) or \(\mathbb {C}.\) A function \(\Vert .\Vert :X\longrightarrow [0,\infty )\) is said to be a norm on X if for every \(x,y\in X\) and for every scalar k, it satisfies the following conditions:

  1. (a)

    \(\Vert x\Vert \ge 0\) (\(\Vert .\Vert \) is positive)

  2. (b)

    \(\Vert x\Vert =0\) if and only if \(x=0\) (\(\Vert .\Vert \) is definite)

  3. (c)

    \(\Vert kx\Vert =|k| \Vert x\Vert \)(\(\Vert .\Vert \) is homogeneous)

  4. (d)

    \(\Vert x+y\Vert \le \Vert x\Vert +\Vert y\Vert .\)(\(\Vert .\Vert \) satisfies the triangle inequality)

In this case, we say that \((X,\Vert .\Vert )\) is a normed linear space or simply a normed space. If there is no ambiguity on the norm, we simply write X for \((X,\Vert .\Vert ).\)

Note that every norm \(\Vert .\Vert \) on a linear space X induces a metric given by

$$d(x,y):=\Vert x-y\Vert \text { for all }x,y \in X.$$

Therefore every normed linear space is a metric space.
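The passage from a norm to its induced metric is mechanical, and a short sketch can make it concrete. Below, vectors in \(\mathbb {R}^n\) are stored as tuples, and the hypothetical helper `metric_from_norm` turns any norm into the metric \(d(x,y)=\Vert x-y\Vert .\) The three norms used are the Euclidean norm together with the norms that correspond to the taxi cab and sup metrics of Exercise 2.9; their Python names are assumptions made for illustration.

```python
import math

def metric_from_norm(norm):
    """Return d(x, y) = ||x - y|| for vectors stored as tuples of numbers."""
    def d(x, y):
        return norm(tuple(a - b for a, b in zip(x, y)))
    return d

def norm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in x))

def norm1(x):
    """Sum of absolute values of the coordinates."""
    return sum(abs(a) for a in x)

def norm_inf(x):
    """Largest absolute value of a coordinate."""
    return max(abs(a) for a in x)

p, q = (1.0, 2.0), (4.0, 6.0)
for norm in (norm1, norm2, norm_inf):
    print(norm.__name__, metric_from_norm(norm)(p, q))
# norm1 7.0
# norm2 5.0
# norm_inf 4.0
```

Symmetry of the induced metric comes from homogeneity with \(k=-1,\) and its triangle inequality comes from writing \(x-y=(x-z)+(z-y).\)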

Examples 2.28

  1. (a)

    If \(X=\mathbb {R},\) then \(x\longmapsto |x|\) defines a norm on X.

  2. (b)

    Let \(n\in \mathbb {N}\) and \(X=\mathbb {R}^n.\) For each \(x=(x_1,\dots ,x_n)\in X,\) define

    $$\Vert x\Vert _2:=\sqrt{x_1^2+\dots +x_n^2}.$$

    By Corollary 2.10, one can conclude that \((X,\Vert .\Vert _2)\) is a normed linear space.

  3. (c)

    Let \(C[a,b]\) denote the space of continuous real valued functions on a closed bounded interval \([a,b].\) Then

    $$\Vert f \Vert :=\sup \big \{|f(x)|:x\in [a,b]\big \}\text { for all }f\in C[a,b]$$

    defines a norm on \(C[a,b],\) known as the uniform norm or the supremum norm.

  4. (d)

    If Y is a linear subspace of a normed linear space \((X,\Vert .\Vert ),\) then \((Y,\Vert .\Vert )\) is also a normed linear space.

Remark 2.29

In general, a subspace of a metric space \((X,d)\) is a nonempty subset Y of X equipped with the same metric d. However, in the case of a normed linear space X, the term subspace is reserved for linear subspaces of X.

Proposition 2.30

Let \(\ell ^2\) denote the collection of sequences \(\{x_n\}\) of real numbers such that \(\sum _{n=1}^\infty |x_n|^2<\infty .\)  Define

$$\Vert x\Vert _2:=\sqrt{\sum _{n=1}^\infty |x_n|^2}\text { for all }x=\{x_n\}\in \ell ^2.$$

Then \((\ell ^2,\Vert .\Vert _2)\) is a normed linear space.

Proof

It is easy to see that the function \(\Vert .\Vert _2\) satisfies the first three requirements of a norm. To prove the triangle inequality, let \(x:=\{x_n\}\) and \(y:=\{y_n\}\) be any two elements of \(\ell ^2.\) Applying Corollary 2.10, for every \(n\in \mathbb {N},\) we obtain

$$\begin{aligned} \big (\sum _{k=1}^n |x_k+y_k|^2\big )^{\frac{1}{2}} &\le \big (\sum _{k=1}^n |x_k|^2\big )^{\frac{1}{2}}+\big (\sum _{k=1}^n |y_k|^2\big )^{\frac{1}{2}}\\ &\le \big (\sum _{k=1}^\infty |x_k|^2\big )^{\frac{1}{2}}+\big (\sum _{k=1}^\infty |y_k|^2\big )^{\frac{1}{2}}=\Vert x\Vert _2+\Vert y\Vert _2 \end{aligned}$$

Letting \(n\longrightarrow \infty ,\) we see that \(x+y\in \ell ^2\) and \(\Vert x+y\Vert _2\le \Vert x\Vert _2+\Vert y\Vert _2.\) Hence the result.   \(\square \)

Above we have generalized Minkowski’s inequality, given by Corollary 2.10, to the space \(\ell ^2.\) Similarly, one can generalize the Cauchy-Schwarz inequality (Theorem 2.9).  Next we discuss a particular class of normed spaces, known as inner product spaces.

Definition 2.31

Let X be a linear space over a field \(\mathbb {K}\) (either \(\mathbb {R}\) or \(\mathbb {C}\)). An inner product on X is a mapping \(\langle ., .\rangle :X\times X\longrightarrow \mathbb {K}\) such that for all \(x, y, z \in X \) and \(\alpha , \beta \in \mathbb {K}\) we have

  1. (a)

    \(\langle x,x\rangle \ge 0,\) and \(\langle x,x\rangle =0\) if and only if \( x=0.\) (positive definiteness)

  2. (b)

    \(\langle \alpha x+\beta y,z\rangle =\alpha \langle x,z\rangle +\beta \langle y,z\rangle \) (linearity in the first variable)

  3. (c)

    \(\langle x,y\rangle =\overline{\langle y,x\rangle }\) (conjugate symmetry)

In this case, \((X,\langle ., .\rangle )\) is known as an inner product space.

Examples 2.32

  1. (a)

    The standard dot product on \(\mathbb {R}^n\) is an inner product.

  2. (b)

    If \(X:=c_{00},\) then \(\big \langle \{x_n\}, \{y_n\}\big \rangle :=\sum _{n=1}^\infty x_n\overline{y_n}\) defines an inner product on X.

Theorem 2.33

Let \((X,\langle ., .\rangle )\) be an inner product space over \(\mathbb {K},\) and \(x,y\in X.\) Then the following hold:

  1. (a)

    Cauchy-Schwarz inequality: \( |\langle x,y\rangle |^2\le \langle x,x\rangle \langle y,y\rangle ,\) and equality holds if and only if x and y are linearly dependent.

  2. (b)

    \( \Vert x \Vert :=\sqrt{ \langle x,x\rangle }\) defines a norm on X.

  3. (c)

    Parallelogram law: \( \Vert x +y \Vert ^2+ \Vert x-y \Vert ^2=2( \Vert x \Vert ^2+ \Vert y \Vert ^2).\)

  4. (d)

    Polarization identity:

    $$\begin{aligned} \langle x,y\rangle = \left\{ \begin{array}{ll} \frac{1}{4}\big ( \Vert x+y \Vert ^2- \Vert x-y \Vert ^2 \big )&{} \text{ if } \mathbb {K}=\mathbb {R}\\ \frac{1}{4}\big ( \Vert x+y \Vert ^2- \Vert x-y \Vert ^2 +i \Vert x+iy \Vert ^2-i \Vert x-iy \Vert ^2\big )&{} \text{ if } \mathbb {K}=\mathbb {C}. \end{array} \right. \end{aligned}$$
    (2.5)

Proof

With \({z:=\langle y,y\rangle x-\langle x,y\rangle y},\) (a) can be established analogously to Theorem 2.9. Further, (c) and (d) are routine manipulations. Here we prove (b) only.

The positive definiteness and homogeneity are immediate. For the triangle inequality, note that the inequality in (a) translates to \(|\langle x,y\rangle |\le \Vert x \Vert \Vert y \Vert .\) Hence

$$\begin{aligned} \Vert x+y \Vert ^2=\langle x+y,x+y\rangle & = \Vert x \Vert ^2+2 \operatorname{Re}\langle x,y\rangle + \Vert y \Vert ^2\\ & \le \Vert x \Vert ^2+2 |\langle x,y\rangle |+ \Vert y \Vert ^2\\ & \le \Vert x \Vert ^{2} +2 \Vert x \Vert \Vert y \Vert + \Vert y \Vert ^2 =( \Vert x \Vert + \Vert y \Vert )^2. \end{aligned}$$

This proves that \( \Vert x + y \Vert \le \Vert x \Vert + \Vert y \Vert .\)    \(\square \)
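The identities of Theorem 2.33 can be spot-checked numerically. The sketch below assumes the standard inner product on \(\mathbb {C}^n\) (linear in the first variable) and two arbitrarily chosen sample vectors; the helper names `inner`, `norm` and `add` are hypothetical and exist only for this illustration.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first variable."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

def add(x, y, scale=1):
    """Return x + scale * y, computed componentwise."""
    return [s + scale * t for s, t in zip(x, y)]

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]

# Cauchy-Schwarz inequality: |<x, y>|^2 <= <x, x><y, y>
cauchy_schwarz = abs(inner(x, y)) ** 2 <= inner(x, x).real * inner(y, y).real

# Parallelogram law: ||x + y||^2 + ||x - y||^2 = 2(||x||^2 + ||y||^2)
lhs = norm(add(x, y)) ** 2 + norm(add(x, y, -1)) ** 2
rhs = 2 * (norm(x) ** 2 + norm(y) ** 2)

# Polarization identity (2.5) for K = C
polar = 0.25 * (norm(add(x, y)) ** 2 - norm(add(x, y, -1)) ** 2
                + 1j * norm(add(x, y, 1j)) ** 2
                - 1j * norm(add(x, y, -1j)) ** 2)

print(cauchy_schwarz)
print(math.isclose(lhs, rhs))
print(abs(polar - inner(x, y)) < 1e-9)
```

Floating-point comparisons use a tolerance, since the identities hold exactly only in exact arithmetic.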

Remarks 2.34

In 1935, Jordan and von Neumann established that if a normed space satisfies the parallelogram law, then its norm is induced by an inner product. In that case, the inner product is given by the polarization identity (2.5). There are 350 characterizations of inner product spaces in the book of Dan Amir; see [2]. For more on inner product spaces, the reader is referred to [3, Chap. VI].

Exercise 2.50

Let X be a normed space, \(x,y\in X,\) and \(\alpha \) be a scalar. Prove that

$$\begin{aligned} \big | \Vert x\Vert -\Vert y\Vert \big |\le \Vert x-y\Vert \text { and }\Vert \alpha x-\alpha y\Vert =|\alpha |\Vert x-y\Vert . \end{aligned}$$

Exercise 2.51

Which vector subspaces of a normed space are bounded subsets?

Exercise 2.52

Let \(c_{00}\) be the set of sequences of reals which are eventually zero, that is, real sequences \(\{x_n\}\) such that \(x_n=0\) for all sufficiently large n. Define

$$\Vert \{x_n\}\Vert _2:=\sqrt{\sum _{n=1}^\infty |x_n|^2}\text { for all }\{x_n\}\in c_{00}.$$

Prove that \((c_{00}, \Vert .\Vert _2)\) is a normed linear space.

Exercise 2.53

Let \(n\in \mathbb {N}\) and \( p\in [1,\infty ].\) For every \(x=(x_1,\dots ,x_n)\in \mathbb {R}^n,\) define

$$ \Vert x\Vert _p:=\left\{ \begin{array}{lll} &{} (\sum _{i=1}^n |x_i|^p)^{\frac{1}{p}} &{}; 1\le p<\infty ,\\ &{} \sup \{|x_1|,\dots ,|x_n|\} &{}; p=\infty . \end{array}\right. $$

Prove that \(\Vert .\Vert _p\) defines a norm on the linear space \(\mathbb {R}^n\) over \(\mathbb {R}.\)

Exercise 2.54

If \(x=\{x_k\}, y=\{y_k\}\in \ell ^2,\) prove that \(\sum _{k=1}^\infty |x_ky_k| \le \Vert x\Vert _2\Vert y\Vert _2.\)

Exercise 2.55

Write a proof for the parallelogram law and the polarization identity as given in Theorem 2.33.

Exercise 2.56

Let X be a normed space, Y be a linear subspace of X, \(y\in Y, x\in X\) and \(\alpha \) be a scalar. If \(dist(x;Y):=\inf \{d(x,z):z\in Y\},\) prove that \(\Vert \alpha x+y\Vert \ge |\alpha |\times dist(x;Y).\)

Exercise 2.57

Is there any linear space on which the discrete metric can be induced by a norm?

Exercise 2.58

Show that the metric induced by any norm, on a linear space, is translation invariant.

Exercise 2.59

Is it possible to assign a norm to every linear space over \(\mathbb {C} ?\)

Exercise 2.60

Let d be a translation invariant and homogeneous metric on a vector space X,  and \(\Vert x\Vert :=d(x,0)\text { for all }x\in X.\) Prove that \((X,\Vert .\Vert )\) is a normed space and induces metric d.

Exercise 2.61

Let X be a linear space as well as a metric space. Under what conditions does it become a normed linear space whose topology is the same as the one given by the metric?

2.4 Sequence Spaces

Let \(\mathbb {K}\) be any of \(\mathbb {R}\) or \(\mathbb {C}.\) We start with the following vector spaces over \(\mathbb {K}.\) 

$$\begin{aligned} c_{00}:=& \text{ the } \text{ space } \text{ of } \text{ all } \text{ sequences } \text{ over } \,\mathbb {K}\, \text{ with } \text{ only } \text{ finitely } \text{ many } \text{ non-zero } \text{ terms. }\\ c_{0}:=& \text{ the } \text{ space } \text{ of } \text{ all } \text{ sequences } \text{ over }\, \mathbb {K},\, \text{ convergent } \text{ to }\, 0.\\ c:=& \text{ the } \text{ space } \text{ of } \text{ all } \text{ convergent } \text{ sequences } \text{ over }\, \mathbb {K}. \end{aligned}$$

Let \(1\le p\le \infty .\) For a sequence \(x=\{x_j\}\) over \(\mathbb {K},\) define extended real numbers \(\Vert x\Vert _p\) as follows:

$$\begin{aligned} \Vert x\Vert _p:=\left\{ \begin{array}{lll} &{} \big {(}\sum _{j=1}^\infty |x_j|^p \big {)}^{\frac{1}{p}} &{}; 1\le p<\infty ,\\ &{} \sup \{|x_j|:j\in \mathbb {N}\} &{}; p=\infty . \end{array}\right. \end{aligned}$$

For every \(1\le p\le \infty ,\) let \(\ell ^p\) denote the collection of all sequences x over \(\mathbb {K}\) with \(\Vert x \Vert _p <\infty .\) It is easy to see that \(c_{00}, c_0\) and c are vector spaces over \(\mathbb {K}.\) The same is true for \(\ell ^p (1\le p\le \infty ).\)

Theorem 2.35

\(\ell ^p\) is a linear space, for all \(1\le p\le \infty .\)

Proof

It is evident that each \(\ell ^p\) is closed under scalar multiplication. Let \(p\in [1,+\infty ]\) and \(x,y\in \ell ^p.\) We shall now establish that \(x+y\in \ell ^p.\) Write \(x=\{x_n\}\) and \(y=\{y_n\}.\)

First consider the case when \(p=\infty .\) By triangle inequality \(|x_n+y_n|\le |x_n|+|y_n|\le \Vert x\Vert _\infty +\Vert y\Vert _\infty \text { for all }n\in \mathbb {N}.\) Therefore, \(\Vert x+y\Vert _\infty \le \Vert x\Vert _\infty +\Vert y\Vert _\infty <\infty \) and hence \(x+y\in \ell ^\infty .\)

Now suppose that \(1\le p<\infty .\) Let \(z_n:=\max \{|x_n|,|y_n|\}\text { for all }n\in \mathbb {N}.\) Note that \(|z_n|^p\le |x_n|^p+|y_n|^p\text { which implies }\Vert z\Vert _p^p\le \Vert x\Vert _p^p+\Vert y\Vert _p^p.\) Hence \(z=\{z_n\}\in \ell ^p.\) Further note that

$$|x_n+y_n|^p\le \big ||x_n|+|y_n|\big |^p\le (2|z_n|)^p=2^p|z_n|^p.$$

Summing over n, we obtain \(\Vert x+y\Vert _p^p\le 2^p\Vert z\Vert ^p_p<\infty .\) Thus \(x+y\in \ell ^p.\)    \(\square \)

Now we claim that \(\Vert .\Vert _p\) is a norm on the linear space \(\ell ^p\text { for every }1\le p\le \infty .\) It is easy to see that \(\Vert .\Vert _p\) is positive definite and homogeneous. If \(p=1\) or \(\infty ,\) then the triangle inequality follows immediately from the definition of \(\Vert .\Vert _p.\) We shall establish this inequality for the case \(1<p<\infty \) soon, which needs some further results. Before that, let us discuss the inclusion relations among sequence spaces.

Theorem 2.36

(Jensen’s inequality) Let \(1\le a<b\le \infty .\) If \(x\in \ell ^a,\) then \(\Vert x\Vert _b\le \Vert x\Vert _a.\)  Consequently \(\ell ^a\subset \ell ^b.\)

Proof

The consequence is immediate from the inequality. Also for \(b=\infty ,\) the result follows from the definition of \(\Vert .\Vert _\infty .\) Suppose \(b<\infty \) and write \(x:=\{x_n\}.\)

First assume that \(\Vert x\Vert _a\le 1.\) Then for every \(n\in \mathbb {N},\) we have \(|x_n|\le 1,\) which implies that \(|x_n|^b\le |x_n|^a.\) Hence \(\Vert x\Vert _b^b\le \sum _{n=1}^\infty |x_n|^a=\Vert x\Vert _a^a.\)

Now for any nonzero \(x\in \ell ^a,\) applying the above calculation with x replaced by \( {x}/{\Vert x\Vert _a},\) whose \(\ell ^a\)-norm is 1, we conclude that

$$\begin{aligned} \bigg \Vert \frac{x}{\Vert x\Vert _a}\bigg \Vert _b^b\le \bigg \Vert \frac{x}{\Vert x\Vert _a}\bigg \Vert _a^a=1, \end{aligned}$$

and hence \(\Vert x\Vert _b\le \Vert x\Vert _a.\)    \(\square \)
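The monotonicity \(\Vert x\Vert _b\le \Vert x\Vert _a\) for \(a<b\) can be observed on a finitely supported sequence. The following is a minimal sketch; `p_norm` is a hypothetical helper that handles only finite lists, and the sample sequence is chosen arbitrarily.

```python
import math

def p_norm(x, p):
    """l^p norm of a finitely supported sequence x, for 1 <= p <= infinity."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0, 0.5]
exponents = [1, 1.5, 2, 3, 10, math.inf]
norms = [p_norm(x, p) for p in exponents]

for p, value in zip(exponents, norms):
    print(p, value)

# The norms should be non-increasing as the exponent grows.
print(all(a >= b for a, b in zip(norms, norms[1:])))
```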

We leave it to the reader to prove that the following chain of inclusion relations holds among sequence spaces, which is proper at every stage:

$$\begin{aligned} c_{00}\subset \ell ^a\subset \ell ^b \subset c_0 \subset c \subset \ell ^\infty \text { for all }1\le a< b < \infty . \end{aligned}$$
(2.6)

To establish the triangle inequality for sequence spaces, we present a set of inequalities.

If \(p,q\in [1,+\infty ]\) satisfy \(\frac{1}{p}+\frac{1}{q}=1,\) then these are known as conjugate exponents of each other.

Theorem 2.37

(Young’s inequality) Let p, q be conjugate exponents such that \(p \in (1,\infty ).\) Then 

$$ab\le \frac{a^p}{p}+\frac{b^q}{q}\text { for all }a, b\in [0,\infty ).$$

Moreover, the equality occurs if and only if \(a^{p}=b^q.\)

Proof

The result is trivial if either \(a=0\) or \(b=0.\) Suppose that both a and b are positive real numbers. Also note that

$$p-1=p\bigg (1-\frac{1}{p}\bigg )=\frac{p}{q} \text { and }q-1=\frac{q}{p}=\frac{1}{p-1}.$$

Consider the functions f and g on \((0,\infty ),\) defined as follows:

$$f(t):=t^{p-1}\text { and }g(t):=t^{q-1}\text { for all }t> 0.$$

Since \(p-1\) and \(q-1\) are positive, both f and g are strictly increasing functions from \((0,\infty )\) onto \((0,\infty ).\) It can be shown that these are inverses of each other.

Let \(a, b\in (0,\infty ).\) Then the area of the rectangle \([0,a]\times [0,b]\) is at most the sum of the areas of the regions \(\{(x,t):0\le x\le a,\ 0\le t\le x^{p-1}\}\) and \(\{(s,y):0\le y\le b,\ 0\le s\le y^{q-1}\},\) since together these two regions cover the rectangle. That is

$$ab\le \int _0^a x^{p-1}dx +\int _0^b y^{q-1}dy=\frac{a^p}{p}+\frac{b^q}{q}.$$

Further, the equality occurs here if and only if the area of the above rectangle is exactly equal to the sum of the areas of those two regions, which is true if and only if \(b=a^{p-1}.\) Now \(b=a^{p-1}\) holds if and only if \(b^q=a^{q(p-1)}=a^p.\) Hence the result.   \(\square \)
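Young's inequality and its equality case can be checked numerically. The sketch below assumes the sample exponent \(p=3\) (so \(q=3/2\)) and a handful of arbitrary sample pairs; `young_gap` is a hypothetical helper returning the difference between the two sides.

```python
import math

def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

p = 3.0
samples = [(0.5, 2.0), (1.0, 1.0), (2.0, 0.3), (4.0, 7.0)]

# The gap should be non-negative, up to rounding error.
print(all(young_gap(a, b, p) >= -1e-12 for a, b in samples))

# Equality case: b = a^(p-1), equivalently a^p = b^q.
a = 2.0
b = a ** (p - 1.0)
print(math.isclose(young_gap(a, b, p), 0.0, abs_tol=1e-9))
```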

Theorem 2.38

(Hölder’s inequality) Let p, q be conjugate exponents such that \(1\le p\le \infty , x=\{x_n\}\in \ell ^p\) and \(y=\{y_n\}\in \ell ^q.\) Then  \(\sum _{n=1}^\infty |x_ny_n|\le \Vert x\Vert _p\Vert y\Vert _q.\)

Proof

The result is trivial if \(p\in \{1,\infty \}\) or if either of \(\Vert x\Vert _p\) or \(\Vert y\Vert _q\) is zero. Therefore, without loss of generality, we assume that \(1<p<\infty ,\) \(0<\Vert x\Vert _p<\infty \) and \(0<\Vert y\Vert _q<\infty .\) Applying Theorem 2.37, for each \(n\in \mathbb {N},\) we conclude that

$$\frac{|x_ny_n|}{\Vert x\Vert _p\Vert y\Vert _q}\le \frac{1}{p}\bigg (\frac{|x_n|}{\Vert x\Vert _p}\bigg )^p+\frac{1}{q}\bigg (\frac{|y_n|}{\Vert y\Vert _q}\bigg )^q.$$

Summing over n, we obtain

$$\begin{aligned} \frac{1}{\Vert x\Vert _p\Vert y\Vert _q} \sum _{n=1}^\infty |x_ny_n|\le \frac{1}{p}\sum _{n=1}^\infty \bigg (\frac{|x_n|}{\Vert x\Vert _p}\bigg )^p+ \frac{1}{q}\sum _{n=1}^\infty \bigg (\frac{|y_n|}{\Vert y\Vert _q}\bigg )^q =\frac{1}{p}+\frac{1}{q}=1. \end{aligned}$$

Hence we conclude the required inequality.    \(\square \)

For \(p=q=2,\) Hölder’s inequality is essentially the Cauchy-Schwarz inequality.
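A quick numerical check of Hölder’s inequality on finitely supported sequences is sketched below, for a few sample exponents in \((1,\infty )\) including the Cauchy-Schwarz case \(p=2.\) The helpers `p_norm` and `holder_sides` and the sample data are assumptions made only for this illustration.

```python
def p_norm(x, p):
    """l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_sides(x, y, p):
    """Return (sum |x_n y_n|, ||x||_p * ||y||_q) for the conjugate exponent q."""
    q = p / (p - 1.0)
    lhs = sum(abs(s * t) for s, t in zip(x, y))
    rhs = p_norm(x, p) * p_norm(y, q)
    return lhs, rhs

x = [1.0, -2.0, 3.0, 0.5]
y = [2.0, 0.0, -1.0, 4.0]

checks = []
for p in (1.5, 2.0, 4.0):
    lhs, rhs = holder_sides(x, y, p)
    checks.append(lhs <= rhs)
    print(p, lhs, rhs, lhs <= rhs)
```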

Theorem 2.39

(Minkowski’s inequality) Let \(p\in [1,+\infty ]\) and \(x,y\in \ell ^p.\) Then 

$$\Vert x+y\Vert _p\le \Vert x\Vert _p+\Vert y\Vert _p.$$

Proof

The result is trivial for the cases \(p=1\) and \(p=\infty .\) Also, in case \(\Vert x+y\Vert _p=0,\) there is nothing to prove. Suppose \(\Vert x+y\Vert _p>0\) and that \(1<p<+\infty .\) Applying Theorem 2.35, we obtain \(x+y\in \ell ^p.\) The triangle inequality implies

$$\begin{aligned} \sum _{n=1}^\infty |x_n+y_n|^p\le \sum _{n=1}^\infty |x_n+y_n|^{p-1}|x_n|+\sum _{n=1}^\infty |x_n+y_n|^{p-1}|y_n|. \end{aligned}$$
(2.7)

Let q be the conjugate exponent of p. Then \(q(p-1)=p\) and consequently

$$\sum _{n=1}^\infty \big (|x_n+y_n|^{p-1}\big )^q= \sum _{n=1}^\infty |x_n+y_n|^{p} <\infty .$$

For \(r>0\) and a sequence \(a:=\{a_n\}\) of complex numbers, we shall denote the sequence \(\{|a_n|^r\}\) with simply \(|a|^r.\) Therefore \(|x+y|^{p-1}\in \ell ^q.\) Also, we have

$$\big \Vert |x+y|^{p-1} \big \Vert _q =\bigg (\sum _{n=1}^\infty \big (|x_n+y_n|^{p-1}\big )^q\bigg )^{\frac{1}{q}} =\bigg (\sum _{n=1}^\infty |x_n+y_n|^{p}\bigg )^{\frac{1}{q}} =(\Vert x+y\Vert _p)^{\frac{p}{q}} .$$

Applying Hölder’s inequality, we obtain

$$\begin{aligned} \sum _{n=1}^\infty |x_n+y_n|^{p-1}|x_n| &\le \Vert x\Vert _p \big \Vert |x+y|^{p-1} \big \Vert _q = \Vert x\Vert _p (\Vert x+y\Vert _p)^{\frac{p}{q}}\\ \sum _{n=1}^\infty |x_n+y_n|^{p-1}|y_n| &\le \Vert y\Vert _p \big \Vert |x+y|^{p-1} \big \Vert _q = \Vert y\Vert _p (\Vert x+y\Vert _p)^{\frac{p}{q}}. \end{aligned}$$

Using this in (2.7), we obtain

$$\Vert x+y\Vert _p^p\le \big (\Vert x\Vert _p+\Vert y\Vert _p\big ) \big (\Vert x+y\Vert _p\big )^{\frac{p}{q}}.$$

Dividing both sides by \(\big (\Vert x+y\Vert _p\big )^{\frac{p}{q}}\) and using \(p-\frac{p}{q}=1,\) we conclude the result.    \(\square \)
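The triangle inequality for \(\Vert .\Vert _p\) can likewise be spot-checked on finitely supported sequences, including the endpoint cases \(p=1\) and \(p=\infty .\) This is only a sketch under the assumption of finite lists; `p_norm` and `minkowski_holds` are hypothetical helpers, and a small tolerance absorbs rounding error.

```python
import math

def p_norm(x, p):
    """l^p norm of a finitely supported sequence x, for 1 <= p <= infinity."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def minkowski_holds(x, y, p, tol=1e-12):
    """Check ||x + y||_p <= ||x||_p + ||y||_p up to a rounding tolerance."""
    total = [s + t for s, t in zip(x, y)]
    return p_norm(total, p) <= p_norm(x, p) + p_norm(y, p) + tol

x = [1.0, -2.0, 3.0, 0.5]
y = [2.0, 0.0, -1.0, 4.0]

results = {p: minkowski_holds(x, y, p) for p in (1, 1.5, 2, 5, math.inf)}
print(results)
```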

Remarks 2.40

  1. (a)

    Let \(x_1,\dots ,x_n\) and \(y_1,\dots ,y_n\) be non-negative reals. If the \(\ell ^p\)-norms of \((x_1,\dots ,x_n)\) and \((y_1,\dots ,y_n)\) coincide for n different positive reals p,  then \(x_i\) are just a permutation of \(y_i\) (see [4]).

  2. (b)

    The textbook [5] starts with a chapter on basic inequalities. There is also a complete book on inequalities by Hardy, Littlewood, and Polya (see [6]). An essay on a history of inequalities can be found in [7].

  3. (c)

    We are avoiding an important class of normed spaces called the Lebesgue spaces or the \(L^p\) spaces, as these are beyond the scope of this book (see [1, p. 253] or [9, Chaps. 7-8]).

Exercise 2.62

Prove Hölder’s and Minkowski’s inequalities for \(p\in \{1,+\infty \}.\)

Exercise 2.63

If \(1\le a<b\le \infty \) and \(x_n\longrightarrow x\) in \(\ell ^a,\) prove that \(x_n\longrightarrow x\) in \(\ell ^b.\)

Exercise 2.64

Suppose \(1\le a<\infty \) and \(x\in \ell ^a.\) Prove that \(\Vert x\Vert _\infty \le \Vert x\Vert _a.\)

Exercise 2.65

Let X be the space of polynomials over \(\mathbb {C}.\) Establish a linear bijection between X and \(c_{00}.\) Use it to define a norm on X.

Exercise 2.66

Prove the chain of inclusions (2.6) and show that all these inclusions are strict.

Exercise 2.67

Let p, q be conjugate exponents such that \(p \in (1,\infty ).\) Prove that

$$ab\le \frac{1}{p}\bigg (\frac{a}{c}\bigg )^p +\frac{(bc)^q}{q}\text { for all }a, b, c\in (0,\infty ).$$

Also show that the equality occurs if and only if \((a/c)^{p}=(bc)^q.\)

Exercise 2.68

Applying the Jordan-von Neumann characterization, as in Remarks 2.34, prove that \(\ell ^p\) is an inner product space if and only if \(p=2.\)

Exercise 2.69

If \(\{a_1,\dots ,a_n\}\subset (0,\infty )\) satisfy \(\sum _{k=1}^n a_k\le 1,\) prove that \(\sum _{k=1}^n \frac{1}{a_k}\ge n^2.\)

Exercise 2.70

Deduce the AM-GM inequality from Young’s inequality.

Exercise 2.71

Let \(1\le p<\infty \) and \(x,y\in \ell ^p.\) Assuming the convexity of the function \(t\longmapsto t^p\) on \((0,\infty ),\) provide an alternative proof to the inequality \(\Vert x+y\Vert _p\le \Vert x\Vert _p+\Vert y\Vert _p.\)

Exercise 2.72

If \(x\in \ell ^p\text { for all }p\in (1,\infty ),\) prove that \(\Vert x\Vert _\infty = \lim _{p\rightarrow \infty } \Vert x\Vert _p.\)

Exercise 2.73

In Exercise 2.72, is the hypothesis that \(x\in \ell ^p\text { for all }p\in (1,\infty )\) redundant?

Exercise 2.74

Prove that c is the linear space spanned by \( c_0\bigcup \{ (1,1,1,\dots )\}.\)

2.5 Hints and Solutions to Selected Exercises

  1. 2.7

    For any \(x,y\in X,\) the hypothesis implies \(d(x,y)\le d(x,x)+d(y,x)=d(y,x).\) Similarly, \(d(y,x)\le d(x,y).\) Hence \(d(y,x)=d(x,y).\) If \(d(x,y)<0,\) then

    $$0=d(x,x)\le d(x,y)+d(y,x)=2d(x,y)<0,$$

    a contradiction. Hence the result.

  2. 2.28

    Note that \(p^{(i)}(0)=0\) if and only if the \((i+1)^{th}\) coefficient in p,  starting from the constant term, is zero. Therefore \(d_k\) is a metric on X if and only if \(k\ge n-1.\)

  3. 2.35

    Yes. For example, in \([-1,1]\) under usual metric, we have \(B[-1;2]\subset B[0;1].\)

  4. 2.36

    All these proofs are analogous to the case of \(\mathbb {R}\) (see Theorem 2.23).

  5. 2.41

    Use the fact that if a sequence converges, then it is bounded. If \(|x-y|\) represents \(d_2(x,y),\) apply the following inequality

    $$\begin{aligned} |a_nb_n-a b |\le |a_nb_n-ab_n |+|ab_n -ab | =|a_n-a||b_n|+|a||b_n-b|.\end{aligned}$$
  6. 2.48

    Suppose that \(\{x_n\}\) is not convergent to x. Then there exists some \(\epsilon >0\) and a subsequence \(\{x_{n_k}\}\) of \(\{x_n\}\) such that \(d(x_{n_k},x)\ge \epsilon \text { for all }k\in \mathbb {N}.\) Therefore, the subsequence \(\{x_{n_k}\}\) of \(\{x_n\}\) has no subsequence convergent to x,  a contradiction.

  7. 2.54

    Use Theorem 2.9 and imitate the proof of Proposition 2.30.

  8. 2.57

    No, apart from the trivial space \(\{0\}.\) On any nonzero linear space, the discrete metric doesn’t satisfy the second assertion of Exercise 2.50.

  9. 2.59

    Yes. Let X be any linear space and \(\mathcal {B}\) be a basis of X. Then every \(x\in X\) can be written uniquely as a finite linear combination \(x=\sum _i \alpha _i v_i,\) where \(\alpha _i\in \mathbb {C}\) and \(v_i\in \mathcal {B}.\) Then \(\Vert x\Vert :=\sum _i |\alpha _i |\) defines a norm on X.

  10. 2.61

    See Exercise 2.60.

  11. 2.63

    Apply Theorem 2.36.

  12. 2.66

    Suppose \(1\le a<b<\infty \) and define \(x_n:=n^{-\frac{1}{2} (\frac{1}{a}+\frac{1}{b} )}\text { for all }n\in \mathbb {N}.\) Then

    $$\begin{aligned} |x_n|^a&=n^{-\frac{1}{2} (1+\frac{a}{b} )}>n^{-c}, \text{ where } c\in \bigg (\frac{1}{2}+\frac{a}{2b},1\bigg ),\\ \text { and }|x_n|^b&=n^{-\frac{1}{2} (1+\frac{b}{a} )}<n^{-d}, \text{ where } d\in \bigg (1,\frac{1}{2}+\frac{b}{2a}\bigg ), \end{aligned}$$

    for all \(n\ge 2.\) Since \(c<1<d,\) the series \(\sum _{n=1}^\infty n^{-c}\) diverges while \(\sum _{n=1}^\infty n^{-d}\) converges. Hence \(\{x_n\}\in \ell ^b\setminus \ell ^a.\) The strictness of other inclusions is left to the reader.

  13. 2.69

    Since the arithmetic mean is always at least the harmonic mean, we obtain

    $$\begin{aligned} \frac{a_1+\dots +a_n}{n}\ge \frac{n}{\frac{1}{a_1}+\dots +\frac{1}{a_n}} \text { which implies }\sum _{k=1}^n \frac{1}{a_k}\ge \frac{n^2}{\sum _{k=1}^n a_k}\ge n^2. \end{aligned}$$
  14. 2.70

    Use \(p=2=q.\)

  15. 2.71

    Write \(a:=\Vert x\Vert _p\) and \( b:=\Vert y\Vert _p.\) The result is trivial if \(a=0\) or \(b=0.\) Suppose not. Write \( x=\{x_n\}, y=\{y_n\} \) and \(c:=a/(a+b).\) Then for all \(n\in \mathbb {N},\) we have

    $$\begin{aligned} |x_n+y_n|^p &\le (|x_n|+|y_n|)^p = (a+b)^p \bigg (\frac{a}{a+b}.\frac{|x_n|}{\Vert x\Vert _p}+\frac{b}{a+b}.\frac{|y_n|}{\Vert y\Vert _p}\bigg )^p\\ & = (a+b)^p \bigg (c\frac{|x_n|}{\Vert x\Vert _p}+(1-c).\frac{|y_n|}{\Vert y\Vert _p}\bigg )^p\\ & \le (a+b)^p \bigg (c\frac{|x_n|^p}{\Vert x\Vert _p^p}+(1-c).\frac{|y_n|^p}{\Vert y\Vert _p^p}\bigg ), \end{aligned}$$

    using the convexity of the map \(t\longmapsto t^p\) on \((0,\infty ).\) Summing over n, we conclude that \(\Vert x+y\Vert _p^p\le (a+b)^p =( \Vert x\Vert _p+\Vert y\Vert _p)^p.\)

  16. 2.72

    The result is trivial if \(x= 0.\) Suppose \(x\ne 0.\) By Theorem 2.36, we already have \(\Vert x\Vert _\infty \le \Vert x\Vert _p\text { for all }p> 1.\) Therefore, \(\Vert x\Vert _\infty \le \liminf _{p\longrightarrow \infty }\Vert x\Vert _p.\) Fix \(q\in (1,\infty )\) and let \(p>q.\) Writing \(x:=\{x_n\},\) we obtain

    $$\begin{aligned} \Vert x\Vert _p=\bigg (\sum _{n=1}^\infty |x_n|^{p-q}|x_n|^{q}\bigg )^{\frac{1}{p}}\le \Vert x\Vert _\infty ^{\frac{p-q}{p}}\bigg (\sum _{n=1}^\infty |x_n|^{q}\bigg )^{\frac{1}{p}} =\Vert x\Vert _\infty ^{1-\frac{q}{p}}\Vert x\Vert _q^{\frac{q}{p}}. \end{aligned}$$
    (2.8)

    Therefore, we have

    $$\begin{aligned} \limsup _{p\longrightarrow \infty }\Vert x\Vert _p\le \limsup _{p\longrightarrow \infty }(\Vert x\Vert _\infty ^{1-\frac{q}{p}}\Vert x\Vert _q^{\frac{q}{p}})=\Vert x\Vert _\infty . \end{aligned}$$
    (2.9)

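    Finally, combining (2.9) with the inequality \(\Vert x\Vert _\infty \le \liminf _{p\longrightarrow \infty }\Vert x\Vert _p\) obtained above, we conclude that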

    $$\displaystyle \limsup _{p\longrightarrow \infty }\Vert x\Vert _p \le \Vert x\Vert _\infty \le \displaystyle \liminf _{p\longrightarrow \infty }\Vert x\Vert _p.$$
  17. 2.73

    No. For example, let \(x_n=1\text { for all }n\in \mathbb {N}.\) Then \(\{x_n\}\in \ell ^\infty \setminus \displaystyle \bigcup _{1\le p<\infty } \ell ^p.\)