1 Introduction

Data rates of fiber-optic communication systems have moved beyond 100 Gb/s thanks to the combined use of polarization-division multiplexing (PDM) and advanced modulation formats, assisted by advanced digital signal processing (DSP) techniques (Zhou et al. 2013). To accommodate the use of two polarization tributaries, polarization demultiplexing (PolDemux) plays a key role in digital coherent receivers, compensating for time-varying changes of the state of polarization (SOP) (Muga and Pinto 2014, 2015). The use of higher-order modulation formats, along with higher data rates, requires more complex and more computationally demanding DSP stages (Guiomar et al. 2015). Therefore, the development of low-complexity and format-transparent PolDemux algorithms becomes important to alleviate the computational and energy consumption requirements of new flexible transceivers (Zhuge et al. 2014), or to facilitate the introduction of new equalizing subsystems, e.g. digital nonlinear compensation (Amado et al. 2016; Guiomar et al. 2016).

The Stokes space representation of the light polarization can be employed for PolDemux of arbitrary complex-modulated signals in coherent receivers (Szafraniec et al. 2010), with some interesting advantages when compared with well-known methods based on the least-mean-square (LMS) and constant-modulus algorithms (CMA) (Savory 2010). Other applications of the Stokes space representation include monitoring and compensation of polarization-dependent losses (PDL) (Muga and Pinto 2013; Yu et al. 2014), cross-polarization modulation compensation (Serena et al. 2012), optical signal-to-noise ratio (OSNR) monitoring (Szafraniec et al. 2010), advanced cost-effective Stokes vector receivers (Shieh et al. 2016), and modulation format recognition (Boada et al. 2015). Among the principal advantages of the Stokes space based PolDemux (SSPolDemux) are a higher convergence rate (Muga et al. 2013), improved robustness against phase noise (Visintin et al. 2014), and transparency to higher-level M-ary signals (Ziaie et al. 2015, 2017). Polarization mode dispersion (PMD) is the principal fiber impairment affecting the performance of SSPolDemux, and two regimes are usually considered when analyzing this effect. The first one occurs when the channel differential group delay (DGD) induces negligible changes in the relative position between two points in the Stokes space, ensuring that this PolDemux technique remains accurate (Szafraniec et al. 2010). The second regime arises when the signal samples experience significantly different rotations. Due to the limitations imposed by PMD, SSPolDemux finds its most evident and promising application scenario in coherent access/metro networks (Shahpari et al. 2016), where dual-polarization techniques double the spectral density (Ziaie et al. 2017).

In this paper, we numerically assess the PMD tolerance of two SSPolDemux algorithms in optical transmission systems with coherent detection. The PMD tolerance is evaluated as a function of the ratio between the DGD and the symbol period, TSymb, which allows three distinct regimes to be defined: a first one (with DGD/TSymb smaller than 0.2) where the PMD effect can be neglected; a second one (with DGD/TSymb within the range [0.2–0.4]) where the PMD effect is strong, but the SSPolDemux can still be performed; and a third one (with DGD/TSymb larger than 0.4) where the SSPolDemux is no longer possible.

The remainder of this paper is organized as follows. In Sect. 2, the theory of SSPolDemux is introduced, and two alternative algorithms to compute the best fit plane are presented. The simulation description along with the numerical results are presented in Sect. 3. Finally, Sect. 4 discusses the presented results and summarizes the main conclusions of this paper.

2 Theory of Stokes space based PolDemux

The analysis of signal samples in the 3D Stokes space for SSPolDemux was proposed for the first time in Szafraniec et al. (2010). This technique was demonstrated for complex-modulated signals (PDM-Quadrature phase-shift keying (PDM-QPSK) and M-ary higher-order modulations), and it is compatible and easily integrated with other functionalities and sub-systems of the digital part of the receiver carried out in the Jones space. All possible combinations of symbols carried in the two orthogonal polarization tributaries result in a particular number of different SOPs. However, when such SOPs are represented in the Stokes space a specific shape is formed and a symmetry plane is always observable (Szafraniec et al. 2010). During propagation along the fiber, this plane undergoes random rotations, meaning that the signals propagated in each axis of the initial reference frame become a combination of the data initially coded in each orthogonal SOP. As a result of the polarization mixing, the normal, \(\hat{n}=(a,b,c)^T\), will point in a random direction over the Poincaré sphere, defining the new orientation of the orthogonal polarizations used at the coding stage. The inverse matrix of the fiber rotation \({\mathbf {F}}\) is given by

$$\begin{aligned} {\mathbf {F}}= \left[ \begin{array}{cc} \cos (p)\exp (i\, q/2) &{} \sin (p)\exp (-i\,q/2)\\ -\sin (p)\exp (i\,q/2) &{} \cos (p)\exp (-i\,q/2) \end{array} \right] , \end{aligned}$$
(1)

where \(p=1/2{\mathrm {arctan}}(a,(b^2+c^2)^{1/2})\), and \(q={\mathrm {arctan}}(b,c)\) (Szafraniec et al. 2010).
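As an illustration of (1), the following minimal Python sketch builds \({\mathbf {F}}\) from a given normal. It assumes that the two-argument arctangent \({\mathrm {arctan}}(x,y)\) denotes the angle of the point (x, y), i.e. `numpy.arctan2(y, x)`; this reading is an assumption, chosen here because it makes \({\mathbf {F}}\) equal to the identity when \(\hat{n}=(1,0,0)^T\). The function name is hypothetical.

```python
import numpy as np

def inverse_rotation_matrix(normal):
    """Build the 2x2 Jones matrix F of Eq. (1) from the normal (a, b, c).

    Assumption: arctan(x, y) in the text is the angle of the point (x, y),
    which corresponds to numpy.arctan2(y, x).
    """
    a, b, c = np.asarray(normal, dtype=float)
    p = 0.5 * np.arctan2(np.hypot(b, c), a)  # p = 1/2 arctan(a, sqrt(b^2 + c^2))
    q = np.arctan2(c, b)                     # q = arctan(b, c)
    return np.array([
        [np.cos(p) * np.exp(1j * q / 2), np.sin(p) * np.exp(-1j * q / 2)],
        [-np.sin(p) * np.exp(1j * q / 2), np.cos(p) * np.exp(-1j * q / 2)],
    ])
```

Under this convention the returned matrix is unitary for any normal, so applying it to the received Jones vector preserves the total signal power.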

Initial works on SSPolDemux employed simple well-known algorithms to compute the best fitting plane, since the plane properties were assumed to be fixed in time (Szafraniec et al. 2010; Muga and Pinto 2013). However, the system performance deteriorates when such PolDemux approaches are employed in time-varying SOP scenarios due to the requirement to have the best fitting plane updated. Kalman filtering (Muga and Pinto 2015) and the geometric approach (Muga and Pinto 2013) are two possibilities to ensure adaptive computation of the best fit plane and respective normal vector.
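The static plane-fitting routines used in the cited works are not reproduced here. As a hedged illustration only, one common choice for a fixed plane is a total least-squares fit via the singular value decomposition (SVD), sketched below in Python. The use of the SVD and the function name are assumptions made for this example; the plane is written as \(a s_1+b s_2+c s_3+d=0\).

```python
import numpy as np

def fit_plane_svd(stokes_points):
    """Total least-squares plane fit to an (N, 3) array of Stokes vectors.

    Returns the unit normal (a, b, c) and the offset d of the plane
    a*s1 + b*s2 + c*s3 + d = 0. The SVD-based fit is an assumption of
    this sketch, not necessarily the method of the cited works.
    """
    points = np.asarray(stokes_points, dtype=float)
    centroid = points.mean(axis=0)
    # The right singular vector of the smallest singular value is the
    # direction of least variance, i.e. the normal of the best fit plane.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d
```

The sign of the returned normal is arbitrary, so a practical receiver would need an additional rule to resolve it. Since the fit uses a whole block of samples, it has to be recomputed whenever the SOP changes.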

2.1 Extended Kalman filtering

The extended Kalman filter searches adaptively for the so-called state vector, \(\mathbf {\mathrm {x}}_k\), that contains four variables,

$$\begin{aligned} \mathbf {\mathrm {x}}_k=\left( a_k, b_k, c_k, d_k \right) ^T, \end{aligned}$$
(2)

with subscript k indicating variable estimation at discrete times (Muga and Pinto 2015), and T meaning transpose. It is worth noting that with this formulation the first three entries of the state vector track the orientation of the original reference frame used at the transmitter side to multiplex the two tributary signals, whereas the fourth entry represents the distance of the best fit plane to the origin, which is related to the PDL of the optical channel (Muga and Pinto 2013). Then, the measurement equations regarding computation of the best fit plane can be written as follows

$$\begin{aligned} \left[ \begin{array}{c} 0 \\ 1 \\ \end{array}\right] =\left[ \begin{array}{c} a_ks_{1,k}+b_ks_{2,k}+c_ks_{3,k}+d_k \\ a_k^2+b_k^2+c_k^2 \end{array}\right] , \end{aligned}$$
(3)

where \(s_{i,k}\), with \(i=1,2,3\), defines the Stokes vector, representing the SOP of the signal sample at a discrete instant of time related with k (Muga and Pinto 2015). Since the second row of (3) contains a nonlinear expression of the state variables \(a_k\), \(b_k\), \(c_k\), this physical system has to be treated as an extended Kalman filter (Haykin 1996). The estimation of \(\mathbf {\mathrm {x}}_k\) is based on the measurement \(\mathbf {\mathrm {z}}_k\), which is defined by the following nonlinear measurement equation,

$$\begin{aligned} \mathbf {\mathrm {z}}_k = h \left( \mathbf {\mathrm {{x}}}_k\right) + \mathbf {\xi }_k, \end{aligned}$$
(4)

where \({h}(\mathbf {\mathrm {x}}_k)\) represents a function of the state vector \(\mathbf {\mathrm {x}}_k\), given by the right hand side of (3), i.e.,

$$\begin{aligned} h=\left[ \begin{array}{c} a_ks_{1,k}+b_ks_{2,k}+c_ks_{3,k}+d_k \\ a_k^2+b_k^2+c_k^2 \end{array}\right] , \end{aligned}$$
(5)

and \(\mathbf {\xi }_k\) represents the measurement noise. It is worth noting that for linear process and measurement models (which is not the case under analysis) there is a matrix \({\mathbf {H}}\) that directly relates the state vector to the measurement vector \(\mathbf {\mathrm {z}}_k\), and that in such cases the term \(h\left( \mathbf {\mathrm {{x}}}_k\right)\) is replaced by \({\mathbf {H}}\mathbf {\mathrm {x}}_k\). The matrix \({\mathbf {H}}\) appears in the set of equations of the recursive Kalman algorithm, and therefore it needs to be calculated. For the nonlinear system considered here, the matrix \({\mathbf {H}}\) can be approximated by the Jacobian matrix of partial derivatives of the functional expressions, h, with respect to the state variables (Haykin 1996), i.e.,

$$\begin{aligned} {\mathbf {{H}}}_{ij}|_{k}\equiv \frac{\partial h_i}{\partial {\mathrm {x}}_j}, \end{aligned}$$
(6)

with \(j=1,2,3,4\) and \(i=1,2\). Substituting (5) into (6), we obtain

$$\begin{aligned} {\mathbf {{H}}}_{ij}|_{k}=\left[ \begin{array}{cccc} s_{1,k} &{}s_{2,k}&{} s_{3,k}&{} 1 \\ 2a_k&{}2b_k&{}2c_k &{}0 \end{array}\right] . \end{aligned}$$
(7)

Equations (2)–(7) completely define the extended Kalman filter and the recursive algorithm can now be applied (Haykin 1996).
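A single recursion of the filter defined by (2)–(7) can be sketched in Python as follows. The sketch assumes a random-walk process model (state transition equal to the identity) and user-chosen process and measurement noise covariances Q and R; neither is specified in the text above, and the function names are hypothetical. The measurement vector is the left-hand side of (3), \(\mathbf {\mathrm {z}}_k=(0,1)^T\).

```python
import numpy as np

def h_meas(x, s):
    """Measurement function h of Eq. (5) for state x=(a,b,c,d), Stokes vector s."""
    a, b, c, d = x
    return np.array([a * s[0] + b * s[1] + c * s[2] + d,
                     a * a + b * b + c * c])

def jacobian(x, s):
    """Jacobian H of Eq. (7), evaluated at the state x."""
    a, b, c, _ = x
    return np.array([[s[0], s[1], s[2], 1.0],
                     [2.0 * a, 2.0 * b, 2.0 * c, 0.0]])

def ekf_step(x, P, s, Q, R):
    """One predict/update cycle of the extended Kalman filter.

    Assumption: the state follows a random walk, so the prediction keeps
    x unchanged and only inflates the covariance P by Q.
    """
    x_pred = x
    P_pred = P + Q
    H = jacobian(x_pred, s)
    z = np.array([0.0, 1.0])               # left-hand side of Eq. (3)
    innovation = z - h_meas(x_pred, s)
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ innovation
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new
```

Calling `ekf_step` once per incoming Stokes vector updates the estimate of \((a_k,b_k,c_k,d_k)\); the first three entries can then be used as the normal in (1).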

2.2 Geometric approach

The geometric approach proposed in Muga and Pinto (2014) works as follows. An initial orientation of the normal, \(\hat{n} (0)\), along with the Stokes vector representing the SOP of the upcoming sample, \(\hat{s} (1)\), are used to adaptively compute the next normal \(\hat{n} (1)\). This vector is used to compute \({\mathbf {F}}\), according to (1), which is then used to calculate \([X_{out}(1), Y_{out}(1)]\). Then, the vector \(\hat{n}(i+1)\) is updated, sample by sample, according to the algorithm

$$\begin{aligned} \hat{n}(i+1) =\frac{\hat{n}(i)+\mathbf {\Gamma }(i+1)}{\Vert \hat{n}(i)+\mathbf {\Gamma }(i+1)\Vert }, \end{aligned}$$
(8)

where \(\mathbf {\Gamma }(i+1)\) is a vector lying in the plane of the 3D Stokes space defined by vectors \(\hat{n}(i)\) and \(\hat{s} (i+1)\), whose direction is perpendicular to \(\hat{s} (i+1)\), i.e. \(\mathbf {\Gamma }(i+1)\perp \hat{s} (i+1)\). This definition drives the orientation of \(\hat{n}(i+1)\) progressively toward orthogonality with the Stokes vector \(\hat{s} (i+1)\). Mathematically, the orientation of \(\mathbf {\Gamma }(i+1)\) is made perpendicular to \(\hat{s} (i+1)\) by making it parallel to the triple cross product \((\hat{s}\times \hat{n})\times \hat{s}\). Then, \(\mathbf {\Gamma }(i+1)\) can be written as

$$\begin{aligned} \mathbf {\Gamma }(i+1) = \eta A \left( [\hat{s}(i+1)\times \hat{n}(i)]\times \hat{s}(i+1)\right) , \end{aligned}$$
(9)

where \(\eta\) is the step-size parameter, and A is given by the dot product between \(\hat{s}\) and \(\hat{n}\).
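The update defined by (8) and (9) can be sketched in a few lines of Python. The sketch takes A as the magnitude of the dot product between \(\hat{s}\) and \(\hat{n}\); this sign handling is an assumption made here so that the step reduces \(|\hat{s}\cdot \hat{n}|\) for both signs of the dot product, and the function name is hypothetical.

```python
import numpy as np

def update_normal(n, s, eta):
    """One sample-by-sample update of the normal, following Eqs. (8)-(9).

    n   : current unit normal n(i)
    s   : unit Stokes vector of the new sample s(i+1)
    eta : step-size parameter
    Assumption: A is the magnitude of the dot product s.n.
    """
    n = np.asarray(n, dtype=float)
    s = np.asarray(s, dtype=float)
    A = abs(np.dot(s, n))
    # (s x n) x s is the component of n perpendicular to s (for unit s).
    gamma = eta * A * np.cross(np.cross(s, n), s)
    v = n + gamma
    return v / np.linalg.norm(v)
```

When the sample is already orthogonal to the current normal, A vanishes and the normal is left unchanged; otherwise the component of the normal perpendicular to the sample is enlarged before renormalization.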

Fig. 1

Schematic diagram of the system used in the numerical simulations. A polarization-multiplexing coherent transmitter (PolMux Co-Transmitter) is used to generate a PDM-QPSK signal that is launched into an optical recirculating loop. A polarization- and phase-diversity coherent receiver (Co-Receiver) collects the signal at each turn, which is then processed by the DSP subsystems

3 Assessment of PMD tolerance

3.1 Simulation configuration

To assess the PMD tolerance of SSPolDemux techniques, we conduct numerical simulations for a PDM-QPSK coherent system, as schematically shown in Fig. 1. At the dual-polarization coherent transmitter, a 112 Gbit/s PDM-QPSK signal, which comprises two 28 Gbaud QPSK signals in orthogonal polarizations, is generated and then launched into an optical recirculating loop composed of one 100 km span of standard single-mode fiber (SSMF). The fiber parameters are the following: attenuation of 0.23 dB/km; dispersion of 16\(\,\times\,10^{-6}\) s/m\(^2\), with zero dispersion slope; and polarization-mode dispersion coefficient, \(D_p\), taking values between 0 and 0.5 ps/km\(^{1/2}\). Fiber nonlinearities were neglected. After the fiber span, an EDFA with a 3 dB noise figure compensates for the total span loss of 28 dB. It is worth noting that a 3 dB noise figure coincides with the theoretical limit; however, using a noise figure close to typical values was not considered critical in this simulation, as the most important goal is to assess the limits of the SSPolDemux technique regarding PMD. An ideal coherent receiver collects the PDM-QPSK signal at each turn, and the signal is then processed in the different DSP subsystems. The DSP starts with downsampling to 2 samples per symbol (SpS). The linear propagation impairments (chromatic dispersion) are then fully compensated by a static equalizer working in the frequency domain, which is followed by an adaptive equalizer for PolDemux. In this work, we have employed an SSPolDemux equalizer (considering both approaches discussed in Sect. 2), whose performance is compared with the well-known 2\(\,\times\,\)2 constant modulus algorithm (CMA) (Savory 2010). After PolDemux, frequency and carrier phase estimation (CPE) is applied, adapting the algorithm for each technique.

Fig. 2

EVM of the different PolDemux techniques as a function of the number of turns in the optical loop, considering different values of samples per symbol

3.2 Numerical results

In order to accurately assess the impact of PMD on the performance of the SSPolDemux we have computed the error vector magnitude (EVM) between the processed signal and the respective ideal constellation for different propagation conditions. Total EVM was calculated as the average of the two polarization tributaries, each one calculated as follows

$$\begin{aligned} EV\!M_{\text {rms}}=\left[ \frac{\sum \nolimits ^{N}_{r=1}\left| {E}_{\mathrm {id},r}-{E}_{\mathrm {rec},r}\right| ^2}{\sum \nolimits ^{N}_{r=1}\left| {E}_{\mathrm {id},r}\right| ^2}\right] ^{1/2}, \end{aligned}$$
(10)

where \({E}_{\text {rec},r}\) is the \(r{\mathrm {th}}\) received symbol, \({E}_{\text {id},r}\) represents the ideal constellation point for the \(r{\mathrm {th}}\) symbol, and N stands for the number of symbols (Schmogrow et al. 2012).
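For illustration, (10) and the averaging over the two tributaries can be written in Python as below. This is a minimal sketch; the function names are hypothetical, and the symbols are assumed to be already aligned and normalized to the ideal constellation.

```python
import numpy as np

def evm_rms(ideal, received):
    """RMS error vector magnitude of Eq. (10) for one polarization tributary."""
    ideal = np.asarray(ideal, dtype=complex)
    received = np.asarray(received, dtype=complex)
    error_power = np.sum(np.abs(ideal - received) ** 2)
    reference_power = np.sum(np.abs(ideal) ** 2)
    return np.sqrt(error_power / reference_power)

def total_evm(ideal_x, received_x, ideal_y, received_y):
    """Total EVM as the average of the two polarization tributaries."""
    return 0.5 * (evm_rms(ideal_x, received_x) + evm_rms(ideal_y, received_y))
```

As a quick check, two ideal symbols 1 and -1 received as 1.1 and -0.9 give an error power of 0.02 over a reference power of 2, i.e. an EVM of 0.1.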

Previous works on SSPolDemux considered input signals with just one sample per symbol (SpS) (Muga and Pinto 2013, 2014, 2015), without further analysis of the impact of this parameter on the performance of the technique. We start our investigation by addressing this point in a scenario where the fiber PMD is set to zero (polarization changes along propagation were limited to static rotations). Figure 2 shows the computed EVM as a function of the number of loops, i.e., the propagation distance, considering the PolDemux of signals with 1 SpS and with 2 SpS. In both cases, results show a similar performance for the CMA and the SSPolDemux techniques (Kalman and Geometric). This shows that the SSPolDemux techniques can also benefit from a higher sampling rate (a gain of more than 5 dB was observed within the considered range). In the case of 2 SpS, CMA PolDemux with 3 taps shows the smoothest evolution with the number of loops, revealing that some residual impairments, other than polarization rotations, are also being equalized when using this filter.

Fig. 3

EVM of the different PolDemux techniques as a function of the ratio DGD/TSymb, considering different values of PMD

The performance of the PolDemux techniques in the presence of PMD was assessed considering three different values for this parameter and, consequently, three different values for the ratio DGD/TSymb at the end of each optical recirculating loop. At the end of N recirculations, the averaged DGD is given by

$$\begin{aligned} {\mathrm {DGD}} = D_p \sqrt{ L_{span}\times N}, \end{aligned}$$
(11)

where \(D_p\) is the PMD coefficient and \(L_{\mathrm {span}}\) is the fiber span length. Figure 3 depicts the system EVM as a function of the ratio DGD/TSymb, considering different PMD values (0.1, 0.3, and \(0.5\hbox { ps}/\hbox {km}^{1/2}\)) and different PolDemux techniques. The results show that for the lowest values of DGD/TSymb the SSPolDemux is obtained without any penalty (when compared with the CMA). For instance, the set of results corresponding to \(0.1\,\hbox { ps/km}^{1/2}\) shows that all three techniques have the same EVM for the different values of DGD/TSymb. However, when this ratio assumes larger values, the SSPolDemux techniques perform worse than the CMA PolDemux (the CMA results presented in Fig. 3 were obtained using 3 taps in the algorithm). Notice that the maximum EVM for the CMA PolDemux is approximately the same for the three different values of PMD, and equals the maximum value presented in Fig. 2 (zero PMD), which means that CMA fully compensates for this effect in the scenarios under analysis. Regarding the Stokes space based techniques, the increased penalties are associated with the rotations of the signal samples in the Stokes space: for larger values of DGD/TSymb the signal samples experience different rotations, degrading the lens-like object (Szafraniec et al. 2010). Since this lens-like object is required to find the symmetry plane whose normal is used to compute the channel inverse matrix (given by (1)), the SSPolDemux becomes critically inefficient for the largest values of DGD/TSymb.
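To make (11) concrete, the short Python sketch below evaluates the averaged DGD and the ratio DGD/TSymb for the simulation parameters given in Sect. 3.1 (100 km span, 28 Gbaud). The function names are hypothetical, and the choice of 20 recirculations is only an example.

```python
import math

def mean_dgd_ps(d_p, l_span_km, n_loops):
    """Averaged DGD in ps from Eq. (11); d_p is given in ps/km^(1/2)."""
    return d_p * math.sqrt(l_span_km * n_loops)

def dgd_over_tsymb(d_p, l_span_km, n_loops, baud_rate):
    """Ratio between the averaged DGD and the symbol period."""
    t_symb_ps = 1e12 / baud_rate  # symbol period in ps
    return mean_dgd_ps(d_p, l_span_km, n_loops) / t_symb_ps

# Example: ratios after 20 recirculations for the three PMD coefficients.
for d_p in (0.1, 0.3, 0.5):
    ratio = dgd_over_tsymb(d_p, 100.0, 20, 28e9)
    print(f"D_p = {d_p} ps/km^0.5 -> DGD/TSymb = {ratio:.2f}")
```

With these values the symbol period is about 35.7 ps, and the three ratios after 20 recirculations are approximately 0.13, 0.38 and 0.63.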

Fig. 4

Stokes parameters pdf for the CMA PolDemux: a after the first loop recirculation; b after the 20th recirculation

Fig. 5

Stokes parameters pdf for SSPolDemux: a after the first loop recirculation; b after the 20th recirculation

The discussed SSPolDemux inefficiency can be understood through the evolution of the probability density function (pdf) of each Stokes parameter. Figures 4 and 5 show the pdfs of the Stokes parameters for the CMA and SSPolDemux, respectively, after 1 recirculation (sub-figures a) and after 20 recirculations (sub-figures b). After 1 recirculation, we observe, in both cases, that the Stokes parameter pdfs present well-defined peaks, corresponding to the four different state-of-polarization (SOP) clouds observable on the Poincaré sphere (shown as insets in the sub-figures). Such SOPs lie in the \(s_2-s_3\) plane, giving rise to three peaks in the pdfs of the \(s_2\) and \(s_3\) parameters. Notice that the \(s_1\) pdf has only one peak centered around zero because all four SOP clouds have the same average \(s_1\) value. After 20 recirculations, despite the accumulated noise, the results for the two PolDemux techniques are substantially different, with three well-identified peaks in the \(s_2\) and \(s_3\) pdfs for the CMA (see Fig. 4b), and a single peak in the \(s_2\) and \(s_3\) pdfs for the SSPolDemux (see Fig. 5b). These results, along with the Poincaré sphere and in-phase and quadrature (IQ) constellation plots, shown as insets in Figs. 4 and 5, reveal that Stokes PolDemux fails due to the spread of points induced by the PMD. Due to its intrinsic geometric nature (the inverse matrix is computed from the best fit plane), this technique cannot compensate for PMD, and its performance is affected by the different rotations that the different signal samples undergo on the Poincaré sphere.
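The origin of the four SOP clouds and of the three peaks can be checked with a short, noise-free Python sketch that maps every PDM-QPSK symbol pair to its normalized Stokes vector. The Stokes definitions used below are the standard ones, with the sign of \(s_3\) taken as an assumed convention; the function name is hypothetical.

```python
import itertools
import numpy as np

def stokes_vector(x, y):
    """Normalized Stokes vector (s1, s2, s3) of the Jones vector (x, y).

    Assumption: s3 = 2*Im(x*conj(y)); other texts use the opposite sign.
    """
    s0 = abs(x) ** 2 + abs(y) ** 2
    s1 = abs(x) ** 2 - abs(y) ** 2
    s2 = 2.0 * (x * np.conj(y)).real
    s3 = 2.0 * (x * np.conj(y)).imag
    return np.array([s1, s2, s3]) / s0

# Ideal QPSK symbols with unit power in each polarization.
qpsk = [(re + 1j * im) / np.sqrt(2.0) for re in (1, -1) for im in (1, -1)]

# Distinct SOPs over all 16 symbol pairs (adding 0.0 removes negative zeros).
sops = {tuple(np.round(stokes_vector(x, y), 6) + 0.0)
        for x, y in itertools.product(qpsk, qpsk)}

for sop in sorted(sops):
    print(sop)
```

The 16 symbol pairs collapse onto four SOPs, all with \(s_1=0\), located at \((s_2,s_3)=(\pm 1,0)\) and \((0,\pm 1)\). Each of \(s_2\) and \(s_3\) therefore takes the values -1, 0 and 1, which is consistent with the three peaks described above.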

Fig. 6

EVM penalty of the SSPolDemux techniques as a function of the ratio DGD/TSymb. The penalty is computed taking the respective CMA values as the reference. Vertical dashed lines identify the regime boundaries of the Stokes PolDemux performance discussed in the text

4 Discussion and conclusions

The analysis of the results presented in the previous section allows us to identify three different regimes regarding the PMD tolerance of the SSPolDemux. Such regimes can be observed in Fig. 6, where the SSPolDemux EVM penalty is plotted as a function of the ratio DGD/TSymb. The SSPolDemux EVM penalty is defined as the difference between the SSPolDemux EVM and the CMA EVM. Notice that in order to use the CMA EVM as a reference value, we are assuming that CMA fully compensates for the PMD. From Fig. 6, we conclude that in the first regime, occurring for DGD/TSymb smaller than 0.2, the SSPolDemux is accomplished without significant penalties (smaller than 1 dB). In the second regime, occurring for values of DGD/TSymb within the range [0.2–0.4], the SSPolDemux is possible, at the expense of considerable penalties. Finally, in the third regime, occurring for values of DGD/TSymb larger than 0.4, the SSPolDemux cannot be performed.
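As a rough illustration of how these boundaries translate into link length, the Python sketch below classifies a given DGD/TSymb ratio and inverts (11) to obtain the fiber length at which a boundary is reached. The function names are hypothetical, and the boundary values 0.2 and 0.4 are simply those stated above.

```python
def pmd_regime(ratio):
    """Return 1, 2 or 3 for the three regimes of DGD/TSymb described in the text."""
    if ratio < 0.2:
        return 1  # PMD effect negligible
    if ratio <= 0.4:
        return 2  # SSPolDemux possible, with considerable penalty
    return 3      # SSPolDemux cannot be performed

def length_at_ratio_km(d_p, ratio, baud_rate):
    """Fiber length (km) at which the averaged DGD reaches ratio * TSymb.

    Obtained by inverting Eq. (11); d_p is given in ps/km^(1/2).
    """
    t_symb_ps = 1e12 / baud_rate
    return (ratio * t_symb_ps / d_p) ** 2
```

For example, at 28 Gbaud with \(D_p=0.5\hbox { ps}/\hbox {km}^{1/2}\), the boundaries 0.2 and 0.4 are reached after roughly 204 km and 816 km, respectively, i.e. after about 2 and 8 loops of the 100 km span.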

As mentioned in the first part of this paper, SSPolDemux is an alternative approach that can be employed in specific scenarios, with particular advantages when compared with other counterpart techniques. The transparency to higher-order modulation formats is certainly one of the most important advantages. In this context, although the results presented in this paper were obtained for a QPSK signal, the main conclusions can in principle be extended to higher-order modulation formats. Notice that the lens-like object resulting from the Stokes space representation is an intrinsic property of PolMux M-ary signals. Regarding the fiber nonlinearities, we should emphasize that coherent access/metro networks are the most evident and promising application scenario for SSPolDemux, which means that their impact on the SSPolDemux can be neglected. As a final remark, it is important to mention that CD was fully compensated before the PolDemux. In contrast to CMA PolDemux, SSPolDemux cannot compensate for any residual CD, which means that larger values of residual CD may affect the performance of the technique.