
4.1 Introduction

Information theory provides one of its strongest developments through the notion of maximum bit rate, or channel capacity. Determining an ultimate limit on the rate at which we can reliably transmit information over a physical medium in a given environment is a question of both fundamental and practical importance. Such a limit is referred to as the channel capacity, and the process of evaluating it leads to an understanding of the technical solutions required to approach it. Therefore, if the capacity can be found, the goal of the engineer is to design an architecture that achieves it. Capacity evaluations require information theory adapted to the specific characteristics of the channel under study. The seminal work of Shannon published in 1948 [1] gave birth to information theory. Shannon determined the capacity of memoryless channels, including channels impaired by additive white Gaussian noise (AWGN) at a given signal-to-noise ratio (SNR). However, applying the concepts of information theory to optical communication channels encounters major challenges. The most important difficulty is the simultaneous interaction of noise, filtering, and the Kerr nonlinearity in the optical channel. These phenomena are distributed along the propagation path and influence each other, leading to deterministic as well as stochastic impairments [2].

Therefore, in this chapter, we follow an information-theoretic approach to derive closed-form expressions for the capacity of the SISO Poisson channel, already found by Kabanov [3] and Davis [4], as well as for the k-user MAC Poisson channel, using a direct detection (photon counting) receiver and under constant noise; in doing so, we simplify the framework of derivation. Several contributions have used information-theoretic approaches to derive the capacity of Poisson channels under constant and time-varying noise via martingale processes [3–7], or via approximations using Bernoulli processes [8]; to define upper and lower bounds for the capacity and the rate regions of different models [9, 10]; to define relations between information measures and estimation measures [11]; and to derive the optimal power allocation for such channels [6, 7, 12]. In this contribution, however, we introduce a simple framework for deriving the capacity of Poisson channels for the model under consideration, the MAC Poisson channel, with the assumption of constant noise; i.e., for the sake of simplicity, we do not model the noise as Gaussian within the stochastic intensity rate process. In addition, we build upon these derivations to obtain the optimal power allocation.

In Poisson channels, shot noise is the dominant noise whenever the power received at the photodetector is high; such noise is modeled as a Poisson random process. In fact, this framework has been investigated in many works; see [2–7, 9–13]. Capitalizing on the expressions derived in [3, 4, 6, 7] and on the results of [6, 7, 9], we investigate the derivation of the channel capacity in a straightforward way; we then determine the optimal power allocation that maximizes the information rates. In deriving the optimal power allocation for different channel frameworks, it is worth noting that different optimization criteria can be relevant; in particular, the criterion could be the peak power, the average optical power, or the average electrical power. The average electrical power is the standard power measure in digital and wireless communications, and it helps in assessing the power consumption of optical communications, while the average optical power is an important measure for safety considerations and helps in quantifying the impact of shot noise in wireless optical channels. In addition, the peak power, whether electrical or optical, gives a measure of tolerance against nonlinearities in the system, for example the Kerr nonlinearity, which manifests as a nonlinear phase delay in the optical intensity, or in other words as a change in the refractive index of the medium as a function of the electric field intensity.

4.2 The Communication Framework

In a communication framework, the information source inputs a message to a transmitter. The transmitter couples the message onto a transmission channel in the form of a signal that matches the transfer properties of the channel. The channel is the medium that bridges the distance between the transmitter and the receiver; it can be either guided transmission, such as a wire or a waveguide, or an unguided free-space channel. A signal traversing the channel suffers attenuation and distortion; for example, electric power can be lost to heat generation along a wire, and optical power can be attenuated by scattering and absorption by air molecules in free space. Therefore, channels are characterized by a transfer function that models the input–output process. The input–output statistics are dominated by the noise the modulated input experiences during its propagation along the communication medium, in addition to the detection procedure at the channel output. In particular, when the noise \( n_{G} \left( t \right) \) is a zero-mean white Gaussian process with double-sided power spectral density N0/2, the channel is called an additive white Gaussian noise (AWGN) channel. However, when the electrical input modulates a light source, such as a laser diode, the channel is an optical channel whose dominant shot noise \( n_{d} \left( t \right) \) arises from the statistical nature of the production and collection of photoelectrons when an optical signal is incident on a photodetector; these statistics are characterized by a Poisson random process.

Figure 4.1 illustrates both the AWGN and the Poisson optical channels. In this chapter, we focus on the Poisson optical communication channel shown in Fig. 4.1b and derive a closed-form capacity expression for the MAC Poisson channel, capitalizing on the framework used to derive the SISO Poisson channel capacity under constant shot noise.

Fig. 4.1 a The AWGN channel. b The Poisson optical channel

4.3 The SISO Poisson Channel

Consider the SISO Poisson channel shown in Fig. 4.2. Let \( N(t) \) denote the channel output, i.e., the number of photoelectrons counted by a direct detection device (photodetector) in the time interval [0, T]. \( N(t) \) has been shown to be a doubly stochastic Poisson process with instantaneous average rate \( \lambda (t) + n \), where the input \( \lambda (t) \) is the rate at which photoelectrons are generated at time t, in units of photons per second, and \( n \) is a constant representing the photodetector dark current and background noise.

Fig. 4.2 The SISO Poisson channel model

4.3.1 Derivation of the Capacity of SISO Poisson Channels

Let \( p\left( {N_{T} } \right) \) be the sample function density of the compound regular point process \( N(t) \), and let \( p\left( {N_{T} |S_{T} } \right) \) be the conditional sample function density of \( N(t) \) given the message signal process \( S(t) \), in the time interval [0, T]. Then we have,

$$ p\left( {N_{T} |S_{T} } \right) = e^{{ - \int\limits_{0}^{T} {\left( {\lambda (t) + n} \right)dt} + \int\limits_{0}^{T} {\log \left( {\lambda (t) + n} \right)dN(t)} }} $$
(4.1)
$$ p\left( {N_{T} } \right) = e^{{ - \int\limits_{0}^{T} {\left( {\widehat{\lambda (t)} + n} \right)dt} + \int\limits_{0}^{T} {\log \left( {\widehat{\lambda (t)} + n} \right)dN(t)} }} $$
(4.2)

We use the following notation consistently throughout this chapter: \( \widehat{\lambda (t)} \) is the conditional mean estimate of the input \( \lambda (t) \) given the observed counting process, and \( {\mathbb{E}}[ \cdot ] \) is the expectation operator. Therefore, the mutual information is defined as follows,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\log \left( {\frac{{p\left( {N_{T} |S_{T} } \right)}}{{p\left( {N_{T} } \right)}}} \right)} \right] $$
(4.3)

Theorem 1 (Kabanov’78 [3], Davis’80 [4]):

The capacity of the SISO Poisson channel, with \( K = {\mathbb{E}}\left[ {\lambda (t)} \right] \) denoting the average input power, is given by:

$$ C = \frac{K}{P}(P + n)\log (P + n) + \left( {1 - \frac{K}{P}} \right)n\log (n) - (K + n)\log (K + n) $$
(4.4)

Proof:

Substituting (4.1) and (4.2) into (4.3), we have,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ { - \int\limits_{0}^{T} {\left( {\lambda (t) - \widehat{\lambda (t)}} \right)dt} + \int\limits_{0}^{T} {\log \left( {\frac{\lambda (t) + n}{{\widehat{\lambda (t)} + n}}} \right)dN(t)} } \right] $$

Since \( {\mathbb{E}}\left[ {\widehat{\lambda (t)}} \right] = {\mathbb{E}}\left[ {{\mathbb{E}}\left[ {\lambda (t)|N_{T} } \right]} \right] = {\mathbb{E}}\left[ {\lambda (t)} \right] \), it follows that,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\int\limits_{0}^{T} {\log \left( {\frac{\lambda (t) + n}{{\widehat{\lambda (t)} + n}}} \right)dN(t)} } \right] $$

Moreover, \( N(t) - \int\limits_{0}^{t} {\left( {\lambda (s) + n} \right)ds} \) is a martingale by standard results on stochastic integrals (see [6, 11]), so \( dN(t) \) may be replaced by \( \left( {\lambda (t) + n} \right)dt \) inside the expectation; therefore,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\int\limits_{0}^{T} {\left( {\lambda (t) + n} \right)\log \left( {\frac{\lambda (t) + n}{{\widehat{\lambda (t)} + n}}} \right)dt} } \right] $$
$$ = \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda (t) + n} \right)\log \left( {\lambda (t) + n} \right)} \right] - {\mathbb{E}}\left[ {\left( {\lambda (t) + n} \right)\log \left( {\widehat{\lambda (t)} + n} \right)} \right]dt} $$
$$ = \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda (t) + n} \right)\log \left( {\lambda (t) + n} \right)} \right] - {\mathbb{E}}\left[ {{\mathbb{E}}\left[ {\lambda (t) + n|N_{T} } \right]\log \left( {\widehat{\lambda (t)} + n} \right)} \right]dt} $$
$$ = \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda (t) + n} \right)\log \left( {\lambda (t) + n} \right)} \right] - {\mathbb{E}}\left[ {\left( {\widehat{\lambda (t)} + n} \right)\log \left( {\widehat{\lambda (t)} + n} \right)} \right]dt} $$
(4.5)

See [6, 7] for similar steps. In [11], it was shown that the derivative of the input–output mutual information of a Poisson channel with respect to the dark current intensity equals the expected difference between the logarithm of the actual input and the logarithm of its conditional mean estimate; it follows that,

$$ \frac{{dI\left( {S_{T} ;N_{T} } \right)}}{dn} = {\mathbb{E}}\left[ {\log \left( {\frac{\lambda (t) + n}{{\widehat{\lambda (t)} + n}}} \right)} \right] $$
(4.6)

The right-hand side of (4.6) expresses the derivative of the mutual information as an integral of estimation errors. This serves as a counterpart to the well-known relation between the mutual information and the minimum mean square error (MMSE) in Gaussian channels [14].

The capacity of the SISO Poisson channel given in Theorem 1 (4.4) is obtained as the maximum of (4.5), i.e., by solving the following optimization problem,

$$ \max I\left( {S_{T} ;N_{T} } \right) $$
(4.7)

Subject to average and peak power constraints,

$$ \frac{ 1}{T}{\mathbb{E}}\left[ {\int\limits_{0}^{T} {\lambda (t)dt} } \right] \le \sigma P $$
$$ 0 \le \lambda (t) \le P $$
(4.8)

where P is the maximum (peak) power and \( \sigma \), with \( 0 \le \sigma \le 1 \), is the ratio of average to peak power. We can easily check that the mutual information is strictly convex via its second derivative with respect to \( \lambda (t) \), as follows,

$$ \frac{{d^{2} I\left( {S_{T} ;N_{T} } \right)}}{{d\lambda^{2} (t)}} = \log \left( {\frac{\lambda (t) + n}{{\widehat{\lambda (t)} + n}}} \right) > 0. $$

Therefore, the mutual information is convex with respect to \( \lambda (t) \).

Now solving:

$$ \max \left( {\int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda (t) + n} \right)\log \left( {\lambda (t) + n} \right)} \right] - {\mathbb{E}}\left[ {\left( {\widehat{\lambda (t)} + n} \right)\log \left( {\widehat{\lambda (t)} + n} \right)} \right] - \frac{\xi }{T}{\mathbb{E}}\left[ {\lambda (t)} \right]} } \right) $$

With \( \xi \) as the Lagrangian multiplier.

The possible values of \( {\mathbb{E}}\left[ {(\lambda (t) + n)\log (\lambda (t) + n)} \right] \) must lie in the set of all y-coordinates of the closed convex hull of the graph \( y = \left( {x + n} \right)\log \left( {x + n} \right) \). Hence, the maximum mutual information is achieved by the binary distribution \( p(\lambda = P) = 1 - p(\lambda = 0) = \alpha \), where \( 0 \le \alpha \le 1 \) is chosen so that \( {\mathbb{E}}[\lambda (t)] = K \). Since \( {\mathbb{E}}\left[ {\lambda (t)} \right] = \sum {\lambda (t)p(\lambda )} \), it follows that \( K = Pp(\lambda = P) = P\alpha \). Then \( \alpha = \frac{K}{P} \), and the capacity in Theorem 1 (4.4) is proved.
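For concreteness, the following minimal numerical sketch (ours, not part of the original derivation; natural logarithms, so rates are in nats) evaluates the capacity expression (4.4) over the admissible range of the average power \( K \); the values of \( P \) and \( n \) are illustrative.

```python
import numpy as np

def siso_capacity(K, P, n):
    """SISO Poisson channel capacity (4.4), in nats per unit time.

    K: average input power E[lambda(t)], with 0 <= K <= P
    P: peak input power
    n: dark current / background noise rate
    """
    alpha = K / P  # duty cycle of the optimal binary {0, P} input
    # First two terms: chord of y = (x + n) log(x + n) between x = 0 and x = P;
    # last term: the curve itself evaluated at x = K.
    return (alpha * (P + n) * np.log(P + n)
            + (1.0 - alpha) * n * np.log(n)
            - (K + n) * np.log(K + n))

P, n = 5.0, 0.1                      # illustrative values
K = np.linspace(1e-6, P, 10_000)
C = siso_capacity(K, P, n)
print(f"max C = {C.max():.4f} nats at K = {K[np.argmax(C)]:.4f}")
# As n -> 0 the maximizer tends to K = P/e and the capacity to P/e nats.
```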

4.3.2 Optimum Power Allocation for SISO Poisson Channels

We need to solve the following optimization problem,

$$ \max \left(\frac{K}{P}(P + n)\log (P + n) + (1 - \frac{K}{P})n\log (n) - (K + n)\log (K + n) - \frac{\xi }{T}K\right) $$
(4.9)

The objective (4.9) is concave with respect to \( K \), since its second derivative with respect to \( K \) is negative. Setting the derivative of the objective with respect to \( K \) to zero and applying the Karush–Kuhn–Tucker (KKT) conditions, the optimal power allocation is the following,

$$ K^{*} = (P + n)e^{{ - \left( {1 + \frac{\xi }{T}} \right) + \frac{n}{P}\log \left( {1 + \frac{P}{n}} \right)}} - n $$
(4.10)
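As a quick sanity check, the closed form (4.10) can be compared against a direct numerical maximization of the objective (4.9); this is a sketch of ours, with an illustrative value of \( \xi /T \).

```python
import numpy as np
from scipy.optimize import minimize_scalar

P, n, xi_T = 5.0, 0.1, 0.2  # peak power, noise, and xi/T (illustrative)

def objective(K):
    """Lagrangian objective (4.9), to be maximized over the average power K."""
    return ((K / P) * (P + n) * np.log(P + n)
            + (1.0 - K / P) * n * np.log(n)
            - (K + n) * np.log(K + n)
            - xi_T * K)

# Closed-form optimum (4.10).
K_star = (P + n) * np.exp(-(1.0 + xi_T) + (n / P) * np.log(1.0 + P / n)) - n

# Numerical optimum of (4.9) over 0 <= K <= P.
res = minimize_scalar(lambda K: -objective(K), bounds=(0.0, P), method="bounded")

print(f"closed form K* = {K_star:.6f}")
print(f"numerical  K* = {res.x:.6f}")  # the two agree to solver tolerance
```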

4.4 The MAC Poisson Channel

Consider the MAC Poisson channel shown in Fig. 4.3. For a 2-input MAC Poisson channel, the output \( N_{1} (t) \) is a doubly stochastic Poisson process with instantaneous average rate \( \lambda_{1} (t) + \lambda_{2} (t) + n \).

Fig. 4.3 The MAC Poisson channel model

4.4.1 Derivation of the Capacity of MAC Poisson Channels

Let \( p(N_{1} ) \) be the sample function density of the compound regular point process \( N_{1} (t) \), and let \( p(N_{1} |S_{1} ,S_{2} ) \) be its conditional sample function density given the message signal processes \( S_{1} (t) \) and \( S_{2} (t) \), in the time interval [0, T]. Then we have,

$$ p\left( {N_{1} |S_{1} ,S_{2} } \right) = e^{{ - \int\limits_{0}^{T} {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)dt} + \int\limits_{0}^{T} {\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)dN(t)} }} $$
(4.11)
$$ p\left( {N_{1} } \right) = e^{{ - \int\limits_{0}^{T} {\left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)dt} + \int\limits_{0}^{T} {\log \left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)dN(t)} }} $$
(4.12)

Therefore, the mutual information is defined as follows,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\log \left( {\frac{{p\left( {N_{1} |S_{1} ,S_{2} } \right)}}{{p(N_{1} )}}} \right)} \right] $$
(4.13)

Theorem 2:

The capacity of the 2-input MAC Poisson channel, with \( K_{1} = {\mathbb{E}}[\lambda_{1} (t)] \) and \( K_{2} = {\mathbb{E}}[\lambda_{2} (t)] \) denoting the average input powers, is given by:

$$ C = \left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)\left( {P + n} \right)\log \left( {P + n} \right) + \left( {1 - \left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)} \right)n\log (n) - \left( {K_{1} + K_{2} + n} \right)\log \left( {K_{1} + K_{2} + n} \right) $$
(4.14)

Proof:

Substituting (4.11) and (4.12) into (4.13), we have,

$$\begin{aligned} I\left( {S_{T} ;N_{T} } \right) = & {\mathbb{E}}\left[ { - \int\limits_{0}^{T} {\left( {\lambda_{1} (t) - \widehat{{\lambda_{1} (t)}}} \right)dt} - \int\limits_{0}^{T} {\left( {\lambda_{2} (t) - \widehat{{\lambda_{2} (t)}}} \right)dt} } \right. \\ & \left. { + \int\limits_{0}^{T} {\log \left( {\frac{{\lambda_{1} (t) + \lambda_{2} (t) + n}}{{\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n}}} \right)dN(t)} } \right] \end{aligned} $$

Since \( {\mathbb{E}}\left[ {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}}} \right] = {\mathbb{E}}\left[ {{\mathbb{E}}\left[ {\lambda_{1} (t) + \lambda_{2} (t)|N_{T} } \right]} \right] = {\mathbb{E}}\left[ {\lambda_{1} (t) + \lambda_{2} (t)} \right] \), it follows that,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\int\limits_{0}^{T} {\log \left( {\frac{{\lambda_{1} (t) + \lambda_{2} (t) + n}}{{\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n}}} \right)dN(t)} } \right] $$

Moreover, \( N(t) - \int\limits_{0}^{t} {\left( {\lambda_{1} (s) + \lambda_{2} (s) + n} \right)ds} \) is a martingale by standard results on stochastic integrals (see [6, 11]), so, as in the SISO case,

$$ I\left( {S_{T} ;N_{T} } \right) = {\mathbb{E}}\left[ {\int\limits_{0}^{T} {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\frac{{\lambda_{1} (t) + \lambda_{2} (t) + n}}{{\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n}}} \right)dt} } \right] $$
$$ \begin{aligned} = & \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)} \right]} \\ & - {\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)} \right]dt \end{aligned} $$
$$ \begin{aligned} = & \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)} \right]} \\ & - {\mathbb{E}}\left[ {{\mathbb{E}}\left[ {\lambda_{1} (t) + \lambda_{2} (t) + n|N_{T} } \right]\log \left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)} \right]dt \end{aligned} $$
$$ \begin{aligned} = & \int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)} \right]} \\ & - {\mathbb{E}}\left[ {\left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)\log \left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)} \right]dt \end{aligned} $$
(4.15)

The capacity of the MAC Poisson channel given in Theorem 2 (4.14) is obtained as the maximum of (4.15), i.e., by solving the following optimization problem,

$$ \max I(S_{T} ;N_{T} ) $$
(4.16)

Subject to average and peak power constraints,

$$ \frac{1}{T}{\mathbb{E}}\left[ {\int\limits_{0}^{T} {\left( {\lambda_{1} (t) + \lambda_{2} (t)} \right)dt} } \right] \le \sigma P $$
$$ 0 \le \lambda_{1} (t) \le P_{1} $$
$$ 0 \le \lambda_{2} (t) \le P_{2} $$
(4.17)

where \( P_{1} \) and \( P_{2} \) are the maximum powers and \( \sigma \), with \( 0 \le \sigma \le 1 \), is the ratio of average to peak power. Now, solving:

$$ \begin{aligned} & \max \left( {\int\limits_{0}^{T} {{\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)} \right]} - {\mathbb{E}}\left[ {\left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)} \right.} \right. \\ & \left. {\left. {\log \left( {\widehat{{\lambda_{1} (t)}} + \widehat{{\lambda_{2} (t)}} + n} \right)} \right] - \frac{\xi }{T}{\mathbb{E}}\left[ {\lambda_{1} (t) + \lambda_{2} (t)} \right]} \right), \end{aligned} $$

with \( \xi \) as the Lagrangian multiplier. The possible values of \( {\mathbb{E}}\left[ {\left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)\log \left( {\lambda_{1} (t) + \lambda_{2} (t) + n} \right)} \right] \) must lie in the set of all y-coordinates of the closed convex hull of the graph \( y = (x_{1} + x_{2} + n)\log (x_{1} + x_{2} + n) \). Suppose that the maximum power for both inputs is \( P_{1} + P_{2} = \sigma P \). Hence, the maximum mutual information is achieved by the binary distribution \( p(\lambda = P) = 1 - p(\lambda = 0) = \alpha \), where \( 0 \le \alpha \le 1 \), so that \( {\mathbb{E}}[\lambda_{1} (t)] = K_{1} \) and \( {\mathbb{E}}[\lambda_{2} (t)] = K_{2} \). Since \( {\mathbb{E}}[\lambda_{1} (t) + \lambda_{2} (t)] = \sum {\left( {\lambda_{1} (t)p(\lambda_{1} ) + \lambda_{2} (t)p(\lambda_{2} )} \right)} \), it follows that \( K_{1} = Pp(\lambda_{1} = P) = P\alpha \) and \( K_{2} = Pp(\lambda_{2} = P) = P(1 - \alpha ) \). Then \( \alpha = \frac{{K_{1} }}{P} \) and \( 1 - \alpha = \frac{{K_{2} }}{P} \); the capacity in Theorem 2 (4.14) is thus proved, and it is maximized when \( \frac{{K_{1} }}{P} = \frac{{K_{2} }}{P} \).

It is worth noting that we also have \( K_{3} = P_{1} p\left( {0 \le \lambda_{1} (t) \le \sigma P} \right) + P_{2} p\left( {0 \le \lambda_{2} (t) \le \sigma P} \right) = P_{1} \alpha + P_{2} (1 - \alpha ) \); however, \( K_{3} \) is not considered in the capacity equations, since only the maximum and minimum powers of \( \lambda_{1} (t) \) and \( \lambda_{2} (t) \) are needed to obtain the maximum expected value. Our framework of derivation therefore differs from [9] in that it solves the problem geometrically.
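Note that (4.14) coincides with the SISO expression (4.4) evaluated at the aggregate average power \( K_{1} + K_{2} \), which anticipates the empirical form (4.26) below. A minimal sketch of ours, with illustrative values, makes this explicit:

```python
import numpy as np

def siso_capacity(K, P, n):
    """SISO Poisson capacity, Eq. (4.4)."""
    a = K / P
    return (a * (P + n) * np.log(P + n) + (1.0 - a) * n * np.log(n)
            - (K + n) * np.log(K + n))

def mac_capacity(K1, K2, P, n):
    """Two-user MAC Poisson capacity, Eq. (4.14)."""
    s = (K1 + K2) / P
    return (s * (P + n) * np.log(P + n) + (1.0 - s) * n * np.log(n)
            - (K1 + K2 + n) * np.log(K1 + K2 + n))

P, n = 5.0, 0.1
print(mac_capacity(1.0, 1.0, P, n))   # two-user MAC at K1 = K2 = 1
print(siso_capacity(2.0, P, n))       # SISO at K = K1 + K2 = 2: identical
```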

4.4.2 Optimum Power Allocation of MAC Poisson Channels

We need to solve the following optimization problem,

$$ \max \left( {\left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)\left( {P + n} \right)\log \left( {P + n} \right) + \left( {1 - \left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)} \right)n\log (n) - \left( {K_{1} + K_{2} + n} \right)\log \left( {K_{1} + K_{2} + n} \right) - \frac{\xi }{T}\left( {K_{1} + K_{2} } \right)} \right) $$
(4.18)

Using the Lagrangian, i.e., setting the derivative of the objective with respect to \( K_{1} \) and \( K_{2} \) to zero, and applying the Karush–Kuhn–Tucker (KKT) conditions, the optimal power allocation is the solution of the following equation,

$$ K_{1}^{*} + K_{2}^{*} = (P + n)e^{{ - \left( {1 + \frac{\xi }{T}} \right) + \frac{n}{P}\log \left( {1 + \frac{P}{n}} \right)}} - n $$
(4.19)

The optimal power allocation shows that orthogonalizing the inputs, via time or frequency sharing, achieves the capacity; hence the importance of interface solutions that aggregate different inputs onto the Poisson channel, as the sketch below illustrates.
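Since the objective (4.18) depends on \( K_{1} \) and \( K_{2} \) only through their sum, any split of the optimal total (4.19), including full orthogonalization where the users alternate, attains the same value. A small sketch of ours, with an illustrative \( \xi /T \):

```python
import numpy as np

P, n, xi_T = 5.0, 0.1, 0.2  # illustrative values

def mac_objective(K1, K2):
    """Lagrangian objective (4.18); depends on K1, K2 only via K1 + K2."""
    s = (K1 + K2) / P
    return (s * (P + n) * np.log(P + n) + (1.0 - s) * n * np.log(n)
            - (K1 + K2 + n) * np.log(K1 + K2 + n) - xi_T * (K1 + K2))

# Optimal total average power from (4.19).
K_tot = (P + n) * np.exp(-(1.0 + xi_T) + (n / P) * np.log(1.0 + P / n)) - n

# Every split of K_tot between the two users gives the same objective value.
for frac in (0.0, 0.25, 0.5, 1.0):
    print(f"K1 fraction {frac:4.2f}: "
          f"{mac_objective(frac * K_tot, (1 - frac) * K_tot):.6f}")
```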

4.5 MAC Poisson Channel Capacity and Rate Regions

We dedicate this section to analyzing the result of Theorem 2. We first introduce the two-user MAC Poisson channel rate regions, and we then relate the MAC capacity to the SISO capacity and to the bounds found mainly in [9]. The rate regions for the two-user MAC Poisson channel are given by,

$$ R1 \le I(S_{1} ;N_{1} |S_{2} ) $$
(4.20)
$$ R2 \le I(S_{2} ;N_{1} |S_{1} ) $$
(4.21)
$$ R1 + R2 \le I(S_{1} ,S_{2} ;N_{1} ) $$
(4.22)

The mutual information that defines the sum of the rates, \( I(S_{1} ,S_{2} ;N_{1} ) \), is defined in [9, Eq. 3.21] under the condition that the average inputs of the two users are equal, in particular when both inputs are equiprobable. Here, we manipulate this result into a sum-rate upper bound for two users with different average input powers, as follows,

$$ \begin{aligned} I(S_{1} ,S_{2} ;N_{1} ) = & \left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)\left( {P + n} \right)\log \left( {P + n} \right) + \left( {1 - \left( {\frac{{K_{1} }}{P} + \frac{{K_{2} }}{P}} \right)} \right)n\log (n) \\ & - \left( {K_{1} + K_{2} + n} \right)\log \left( {K_{1} + K_{2} + n} \right) - 2\left( {\frac{{K_{1}^{2} }}{{P^{2} }} + \frac{{K_{2}^{2} }}{{P^{2} }}} \right)\left( {P + n} \right)\log \left( {P + n} \right) \\ & + \left( {\frac{{K_{1} K_{2} }}{{P^{2} }}} \right)\left( {2P + n} \right)\log \left( {2P + n} \right) + \left( {\frac{{K_{1} K_{2} }}{{P^{2} }}} \right)n\log (n) \end{aligned} $$
(4.23)

where \( I(S_{1} ,S_{2} ;N_{1} ) \) is maximized when \( \frac{{K_{1} }}{P} = \frac{{K_{2} }}{P} \). It is important to notice that the first, non-quadratic terms of \( I(S_{1} ,S_{2} ;N_{1} ) \) form the capacity of a SISO Poisson channel with input \( \lambda_{1} (t) + \lambda_{2} (t) \). Therefore, we can see through Theorem 2 that the capacity is approximately defined by the first term of \( I(S_{1} ,S_{2} ;N_{1} ) \),

$$ I(S_{1} ,S_{2} ;N_{1} ) = C_{{\text{SISO}}} (\lambda_{1} + \lambda_{2} ) + \beta $$
(4.24)

where,

$$ \beta = - 2\left( {\frac{{K_{1}^{2} }}{{P^{2} }} + \frac{{K_{2}^{2} }}{{P^{2} }}} \right)\left( {P + n} \right)\log \left( {P + n} \right) + \left( {\frac{{K_{1} K_{2} }}{{P^{2} }}} \right)\left( {2P + n} \right)\log \left( {2P + n} \right) + \left( {\frac{{K_{1} K_{2} }}{{P^{2} }}} \right)n\log (n) $$
(4.25)
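To make the decomposition (4.24)–(4.25) concrete, the following sketch of ours, with illustrative values, evaluates the sum rate as the SISO capacity of the aggregate input plus the quadratic correction \( \beta \):

```python
import numpy as np

P, n = 5.0, 0.1
K1 = K2 = 1.0  # illustrative average powers

# C_SISO(lambda_1 + lambda_2): Eq. (4.4) at K = K1 + K2.
K = K1 + K2
C_siso = ((K / P) * (P + n) * np.log(P + n)
          + (1.0 - K / P) * n * np.log(n)
          - (K + n) * np.log(K + n))

# Quadratic correction beta from (4.25).
beta = (-2.0 * (K1**2 + K2**2) / P**2 * (P + n) * np.log(P + n)
        + (K1 * K2 / P**2) * (2.0 * P + n) * np.log(2.0 * P + n)
        + (K1 * K2 / P**2) * n * np.log(n))

I_sum = C_siso + beta  # sum rate (4.23) via the decomposition (4.24)
print(f"C_SISO = {C_siso:.4f}, beta = {beta:.4f}, I_sum = {I_sum:.4f} nats")
```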

Therefore, we can deduce that the rate region defined in [9] is an upper bound on the capacity, and thus we can write an empirical form for the k-user MAC Poisson capacity using the first, non-quadratic terms of the above equation, as follows,

$$ C_{k\text{-user MAC}} = C_{{\text{SISO}}} (\lambda_{1} + \cdots + \lambda_{k} ) $$
(4.26)

We can also verify Theorem 2 by comparing it to the results in [9] for different setups. For example, when \( K_{1} = K_{2} = K \), the capacity becomes \( C = 2\frac{K}{P}(P + n)\log (P + n) + \left( {1 - \frac{2K}{P}} \right)n\log (n) - (2K + n)\log (2K + n) \).

When \( K_{1} = K_{2} = K = P \), the negative terms indicate a zero capacity, \( C = 0 \). When \( K_{1} = K_{2} = K \ne P \) and \( n = 0 \), the capacity becomes \( C = 2K\log \frac{P}{2K} \), and the sum rate becomes \( I(S_{1} ,S_{2} ;N_{1} ) = 2K\log \frac{P}{2K} + 2\frac{{K^{2} }}{P}\log 2 \). Therefore, \( I(S_{1} ,S_{2} ;N_{1} ) \) as given in [9, Eq. 3.21] exceeds the capacity by the term \( 2\frac{{K^{2} }}{P}\log 2 \), and via the constraints on the average power, \( 2\frac{{K^{2} }}{P}\log 2 \le 2\log 2 \); it follows that this upper bounds the capacity by a value always less than or equal to 1.4 nats/s for the two-user MAC. More generally, the empirical form differs from the upper bound by at most \( k\log k \), where \( k \) is the number of inputs/users of the MAC Poisson channel. We can also verify that the maximum capacity is achieved by orthogonalizing the inputs, such that the capacity approaches \( P/e \) nats/s for each user. Therefore, not orthogonalizing the inputs incurs a maximum power loss of around \( 0.5256P \) in the two-user MAC case. This explains the limitation on the number of users for the MAC Poisson channel.
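These special cases are easy to check numerically; the sketch below is ours and assumes a normalized peak power \( P = 1 \), under which the stated bound \( 2\frac{{K^{2} }}{P}\log 2 \le 2\log 2 \) holds for \( K \le P \):

```python
import numpy as np

P = 1.0                          # normalized peak power (assumption)
K = np.linspace(1e-6, P, 2001)   # average power per user

# n -> 0 capacity with K1 = K2 = K; negative values floor at zero capacity.
C = np.maximum(2.0 * K * np.log(P / (2.0 * K)), 0.0)
# Excess of the sum-rate bound of [9, Eq. 3.21] over the capacity.
gap = 2.0 * K**2 / P * np.log(2.0)

print(f"C at 2K = P : {C[np.argmin(np.abs(2*K - P))]:.4f}  (zero capacity)")
print(f"max C       : {C.max():.4f}  (= P/e = {P/np.e:.4f}, at K = P/(2e))")
print(f"max gap     : {gap.max():.4f}  (<= 2 log 2 = {2*np.log(2):.4f} ~ 1.4 nats)")
```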

Figure 4.4 shows the capacity of different Poisson channels under a total power constraint of \( P = 5 \) on the SISO channel and on each user's input of the parallel channel and the MAC channel, an equal average input power \( K_{1} = K_{2} = K \), and shot noise \( n = 0.1 \). When the average input power is around one quarter of the total power, \( K = P/4 \), the rate reaches the maximum achievable rate; this explains the power loss in the two-user MAC case discussed above. We notice that the maximum mutual information presented by Lapidoth et al. in [9, Eq. 3.21] upper bounds the rate region of all given channels; however, the maximum achievable rate is always \( C \le P/e \) nats/s. In particular, for the MAC channel the maximum achievable rate with total power \( P = 10 \) is 3.425 nats/s, consistent with \( C \le 10/e \approx 3.68 \) nats/s; i.e., the capacity of the k-user MAC is always \( C \le kP/e \). We can further see that in the low average power regime the upper bound and the empirical capacity of the MAC match, while they differ logarithmically in the high average power regime; this is due to the quadratic part, denoted by \( \beta \), that is missing from the empirical capacity formula. Notice also that the MAC channel under the given conditions upper bounds the parallel channel; in other words, the parallel channel defines a lower bound on the MAC when both inputs are active.
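The 3.425 nats/s value quoted above for the two-user MAC can be reproduced with a short sketch of ours, maximizing the equal-power capacity of Theorem 2 with peak power \( P = 10 \) and \( n = 0.1 \):

```python
import numpy as np

P, n = 10.0, 0.1                       # peak power and shot noise, as in Fig. 4.4
K = np.linspace(1e-6, P / 2, 200_000)  # equal average power K1 = K2 = K

# Two-user MAC capacity (4.14) with K1 = K2 = K.
C = ((2.0 * K / P) * (P + n) * np.log(P + n)
     + (1.0 - 2.0 * K / P) * n * np.log(n)
     - (2.0 * K + n) * np.log(2.0 * K + n))

print(f"max rate = {C.max():.3f} nats/s at K = {K[np.argmax(C)]:.3f}")
print(f"bound P/e = {P / np.e:.3f} nats/s")  # 3.425 < 3.679, as expected
```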

Fig. 4.4 Capacity of the Poisson channels versus the average power

Figure 4.5 shows the capacity of different Poisson channels versus the shot noise; naturally, the capacity decreases as the shot noise increases. However, it is of particular relevance to notice that in the low noise power regime, the Lapidoth upper bound on the MAC maximum achievable rate [9] cannot in fact be achieved, due to the existence of the quadratic terms, which raise it above the true limit \( C \le kP/e \); our empirical form of the MAC capacity, in contrast, is consistent with this relation and can be generalized to k users.

Fig. 4.5 Capacity of the Poisson channels versus the shot noise

4.6 Discussion

The solutions provided in this chapter show that the capacity of Poisson channels is a function of the average and peak powers of the input. As a natural consequence of the SISO Poisson channel expressions, the throughput of parallel Poisson channels is the sum of the capacities of their independent SISO channels; a proof is provided in [7]. For the MAC Poisson channel, the capacity expression derived here generalizes to a closed-form expression for the k-user MAC Poisson channel. The authors in [9] studied the capacity regions of the two-user MAC Poisson channel.

They also pointed out an interesting observation that we can emphasize and verify via Theorem 2: contrary to the Gaussian MAC, in the Poisson MAC the maximum throughput is bounded in the number of inputs; similar to the Gaussian MAC, the capacity is achieved by orthogonalizing the inputs, or by using a limited average input power for each user, equal to one quarter of the total power in the two-user case. In fact, for the Poisson MAC, when equal input powers up to half the total power are used for each user, the capacity decays to zero, while when the inputs differ, i.e., are orthogonal, the capacity is again maximized. In addition, the two main factors in the MAC capacity are orthogonalization and the maximum power; increasing the average power of one or both inputs above a certain limit will not add positively to the capacity, see [7]. We can also see that the maximum power is a function of the average power, and both can be optimized jointly to maximize the capacity.

Moreover, it can be deduced from the mathematical formulas that the power allocation is a decreasing function of the dark current for all Poisson channels. This means that the power allocation for Poisson channels follows, in some sense, a waterfilling-like interpretation analogous to the Gaussian setup, where less power is allotted to the noisier channels [7, 15]. However, it is well known that the optimal power allocation is an increasing function of the maximum power.

4.6.1 Gaussian Channels Versus Poisson Channels

Here, we summarize some important points about the capacity of Poisson channels in comparison to Gaussian channels within the context of this work. Firstly, in comparison to the Gaussian capacity, the channel capacity of the Poisson channel is maximized by binary (on–off) inputs, while the Gaussian capacity is achieved by a Gaussian input distribution. Secondly, the maximum achievable rate of the Poisson channel is a function of its maximum and average powers, due to the nature of the Poisson process, which is a stochastic process with martingale characteristics, while in Gaussian channels the processes are random and modeled by the normal distribution. Thirdly, the optimal power allocation for Poisson channels is very similar across different models, depending on the defined power constraints, and in comparison to the Gaussian optimal power allocation it follows a waterfilling-like interpretation, in which more power is allocated to the stronger channels, i.e., the power allocation is inversely proportional to the channel noise. Notably, although the optimal input distribution for the Poisson channel is binary, the power allocation is still waterfilling-like, unlike Gaussian channels with arbitrary inputs, where a mercury-waterfilling interpretation compensates for the non-Gaussianness of the binary input [16]. Finally, it is worth emphasizing two more important differences, already shown in [10], which are straightforward to prove here. Unlike in Gaussian channels, in Poisson channels, due to the characteristics of the Poisson distribution, interference cancellation techniques cannot be implemented, since it is not possible to construct the law of the output with rate \( \lambda_{1} + n \) from the law of the output with rate \( \lambda_{1} + \lambda_{2} + n \) when \( \lambda_{2} \) is treated as an interferer to \( \lambda_{1} \). Besides, unlike Gaussian channels, Poisson channels are not scale-invariant: if the inputs are scaled by a factor \( a \ne 1 \), then \( p(N_{1} = a\lambda_{1} + n) \ne p(N_{1} = \lambda_{1} + n/a) \), and hence \( I(S_{1} ,S_{2} ;N_{1} = a\lambda_{1} + a\lambda_{2} + n) \ne I(S_{1} ,S_{2} ;N_{1} = \lambda_{1} + \lambda_{2} + n/a) \).
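The lack of scale invariance is easy to check for fixed intensities; the sketch below is ours, using plain Poisson counts, with illustrative values of \( a \), \( \lambda_{1} \), and \( n \):

```python
import numpy as np
from scipy.stats import poisson

a, lam1, n = 2.0, 3.0, 0.5
k = np.arange(20)

# Scaling the input by a is not equivalent to scaling the noise down by a:
p_scaled_input = poisson.pmf(k, a * lam1 + n)
p_scaled_noise = poisson.pmf(k, lam1 + n / a)
print(np.allclose(p_scaled_input, p_scaled_noise))  # False: not scale-invariant
```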

4.7 Conclusions

In this chapter, we showed via an information-theoretic approach that the capacity of optical Poisson channels is a function of the average and maximum powers of the inputs; the capacity expressions were derived, as well as the optimal power allocation, for the SISO and MAC channel models. We provided a closed-form expression for the k-user MAC Poisson channel with arbitrary average input powers. It was shown, through the limitation on the number of users implied by the capacity of the Poisson MAC, that interface solutions for aggregating multiple users/channels over a single Poisson channel are of great importance. A technology like orthogonal frequency division multiplexing (OFDM) for optical communications stands as one such interface solution; however, since it introduces attenuation via narrow filtering and similar effects, optimal power allocation, which can mitigate such effects, becomes important, and hence we built upon the optimal power allocation derivations.