1 Introduction

A standard approach for measuring the energy of an X-ray photon is to apply a digital optimal filter [1]. Optimal filtering provides the most accurate estimate of photon energy only if the noise does not change during the pulse (stationary noise) and if the signals from different energies have the same pulse shape (linear response). In certain cases for our transition-edge sensors (TESs), these conditions are not met [2], and the energy resolution can be significantly degraded. We present an implementation of an approach discussed in previous work [3–5], based on principal component analysis (PCA). The goal is to improve energy resolution in the non-linear, non-stationary regime.
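For orientation, the conventional optimal filter referred to above can be sketched as a noise-weighted template fit in the frequency domain. This is a minimal illustration under stated assumptions, not the processing chain used for the data in this paper: the function name `optimal_filter_amplitude`, the two-exponential `template`, and the flat `noise_psd` are all hypothetical stand-ins.

```python
import numpy as np

def optimal_filter_amplitude(pulse, template, noise_psd):
    """Amplitude of `pulse` in units of `template`, weighting each frequency by 1/noise.

    Assumes stationary noise and a pulse shape that does not change with energy.
    `noise_psd` must have len(pulse)//2 + 1 entries (one per rfft bin).
    """
    V = np.fft.rfft(pulse)
    S = np.fft.rfft(template)
    # Drop the DC bin so a constant baseline offset does not bias the estimate.
    weights = 1.0 / noise_psd[1:]
    numerator = np.sum((np.conj(S[1:]) * V[1:]).real * weights)
    denominator = np.sum(np.abs(S[1:]) ** 2 * weights)
    return numerator / denominator

# Synthetic stand-in for a pulse template (arbitrary units, not real TES data).
t = np.arange(512)
template = np.where(t >= 100, np.exp(-(t - 100) / 60.0) - np.exp(-(t - 100) / 8.0), 0.0)
noise_psd = np.ones(512 // 2 + 1)   # assumed white noise

# A pulse 2.5 times the template, sitting on a constant baseline offset.
pulse = 2.5 * template + 3.0
amp = optimal_filter_amplitude(pulse, template, noise_psd)
```

Because a single template and a single noise spectrum are used for every record, this estimator has no way to respond to a pulse shape that changes with energy or to noise that changes during the pulse, which is the limitation the PCA approach is meant to address.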

For our type of detector, PCA uses the eigenvectors of a pulse-pulse covariance matrix, where the eigenvectors are the principal components of the information in the signal. It enables different parts of the signal within a pulse to be weighted differently at different times, according to the signal-to-noise ratio of the different components at those times. This provides a more robust estimate of the photon energy under non-linear, non-stationary conditions. For the case of stationary noise and a linear response with energy, it should be equivalent to the conventional optimal filter. The successful implementation of such an algorithm will enable the use of TES detectors with, for example, low heat capacity, with improved energy resolution over a large range of energies. This approach also extracts information about (and may enable correction for) event-to-event variations in the pulse shape that may arise from trigger jitter, position dependence of the X-ray absorption producing a rise-time variation, or gain drift due to changing environmental conditions such as temperature or magnetic field. An example of the pulse height changing with baseline level is shown in Fig. 1.

Fig. 1

This plot shows a correlation between optimally filtered pulse height and baseline level. The results are from a TES with X-rays originating from an \(^{55}\)Fe source. The line is a linear fit to the data. Such correlations suggest that other properties of the pulses are also changing (Color figure online)

2 Principal Component Analysis

For our experiment, we measure an ensemble of pulses from an \(^{55}\)Fe source. We select single-pulse records (not piled-up pulses) around the Mn K\(\alpha \) line for the following analysis, with no separate noise traces included. The fraction of piled-up pulses was small, less than 5 %. We create a variance–covariance matrix from the ensemble of pulses, essentially forming the matrix from the product of each point in a trace with every other point, summed over pulses. For a set of X-ray events, we define a matrix \(\mathbf{x}\) with elements \(x_{ij}\), where each element is the value of the jth pulse at the ith time sample. We then create a mean pulse \(\langle x_i \rangle = (1/M) \sum _{j}x_{ij}\), where M is the number of pulses. The mean pulse is used to create \(\mathbf{D}\), the matrix of the residuals of each pulse from the mean: \(D_{ij} = x_{ij} - \langle x_i \rangle \). We then calculate the covariance of the residual matrix, \(cov(\mathbf{D})=\mathbf{D}\mathbf{D}^\mathrm{{T}}\). An example of the covariance matrix is depicted in Fig. 2, and the eigenvalues of this matrix are shown in Fig. 3.
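The construction of the mean pulse, the residual matrix, and the covariance matrix described above can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `residual_covariance` and the synthetic two-exponential pulses are assumptions standing in for real TES records, and the array layout (rows are time samples, columns are pulses) simply follows the index convention of the text.

```python
import numpy as np

def residual_covariance(x):
    """Return the mean pulse, the residual matrix D, and D D^T.

    x has shape (n_samples, n_pulses): x[i, j] is pulse j at time sample i.
    """
    mean_pulse = x.mean(axis=1, keepdims=True)   # (1/M) * sum over pulses j
    D = x - mean_pulse                           # D_ij = x_ij - <x_i>
    cov = D @ D.T                                # unnormalized, (n_samples, n_samples)
    return mean_pulse[:, 0], D, cov

# Synthetic ensemble (hypothetical data): one pulse shape with a small
# amplitude spread plus white noise.
rng = np.random.default_rng(0)
n_samples, n_pulses = 256, 200
t = np.arange(n_samples)
shape = np.where(t >= 50, np.exp(-(t - 50) / 40.0) - np.exp(-(t - 50) / 5.0), 0.0)
amplitudes = 1.0 + 0.01 * rng.standard_normal(n_pulses)
x = shape[:, None] * amplitudes[None, :] + 0.001 * rng.standard_normal((n_samples, n_pulses))

mean_pulse, D, cov = residual_covariance(x)
```

The matrix is left unnormalized to match the definition in the text; dividing by the number of pulses would rescale the eigenvalues but leave the eigenvectors unchanged.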

Fig. 2

Top: Covariance matrix formed from the ensemble of pulses. The units of the color-coded legend are amperes\(^2\). Bottom: An example of a current pulse from a TES. The abscissa represents time (sampling rate is 1 MS/s). The vertical axis measures current through the TES (Color figure online)

Fig. 3

The spectrum of the largest 50 eigenvalues of our covariance matrix (Color figure online)

We now want to transform the data to a basis in which the covariance matrix is diagonal. We therefore determine the eigenvectors of the covariance matrix, \(cov(\mathbf{D})=\mathbf{Q}\Lambda \mathbf{Q}^{-1}\), where \(\Lambda \) is the diagonal matrix of eigenvalues sorted from largest to smallest, and \(\mathbf{Q}\) is the matrix whose columns are the corresponding eigenvectors. Because the covariance matrix is symmetric, \(\mathbf{Q}\) can be chosen orthogonal, so that \(\mathbf{Q}^{-1} = \mathbf{Q}^\mathrm{{T}}\). We then rotate each pulse into the basis of eigenvectors, \(\mathbf{R} = \mathbf{Q}^\mathrm{{T}}\mathbf{D}\). The fundamental task at hand is to determine which eigenvectors of the average covariance matrix are responsible for the properties of the pulse that carry the most photon energy information. We identify these as the eigenvectors with the largest eigenvalues. The logarithm of an eigenvalue is a measure of the information in the pulse carried by the corresponding eigenvector.
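The diagonalization and rotation steps can be sketched as below. This is a minimal illustration on synthetic data, not the authors' code: the function name `pca_rotate` and the test pulses are assumptions. It uses `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order, so they are re-sorted to match the largest-first convention of the text.

```python
import numpy as np

def pca_rotate(D):
    """Diagonalize D D^T and rotate the residuals into the eigenvector basis."""
    cov = D @ D.T
    eigvals, Q = np.linalg.eigh(cov)          # ascending eigenvalues, orthonormal columns
    order = np.argsort(eigvals)[::-1]         # re-sort from largest to smallest
    eigvals, Q = eigvals[order], Q[:, order]
    R = Q.T @ D                               # R[k, j]: projection of pulse j on eigenvector k
    return eigvals, Q, R

# Synthetic residual matrix (hypothetical data, rows = time samples, columns = pulses).
rng = np.random.default_rng(1)
n_samples, n_pulses = 64, 300
t = np.arange(n_samples)
shape = np.where(t >= 10, np.exp(-(t - 10) / 12.0) - np.exp(-(t - 10) / 2.0), 0.0)
x = shape[:, None] * (1.0 + 0.02 * rng.standard_normal(n_pulses))
x = x + 0.005 * rng.standard_normal((n_samples, n_pulses))
D = x - x.mean(axis=1, keepdims=True)

eigvals, Q, R = pca_rotate(D)
```

In the rotated basis the covariance \(\mathbf{R}\mathbf{R}^\mathrm{{T}}\) equals \(\Lambda \) up to round-off, and each row of `R` holds the distribution of one projection over all pulses, which is the kind of quantity histogrammed in the right column of Fig. 4.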

Fig. 4

Left column shows normalized eigenvectors plotted against time. The distribution in this panel helps us to identify what aspect of the pulse each vector is associated with. Right column shows the projection of all pulses onto the corresponding (left) eigenvector (Color figure online)

In Fig. 4, we show an example of this procedure. The third and fourth eigenvectors (left column) have a shape that is reminiscent of the pulse shape. The components of each pulse in our dataset, projected onto these eigenvectors (right column of Fig. 4), show a distribution similar to the pulse height distribution we expect from the Mn K\(\alpha \) 1 and 2 lines. Eigenvector 3 is negative at all times, whereas eigenvector 4 takes both positive and negative values. These features indicate that eigenvector 4 is likely more sensitive to the pulse height, and eigenvector 3 is more sensitive to the baseline, analogous to the correlation shown in Fig. 1. In the first and second eigenvectors, most of the information is coincident with the pulse arrival. We also find non-Gaussian distributions of the rotated projections, as is clearly evident in the right panels of vectors 1 and 2 in Fig. 4. Structures like this are indicative of arrival time variations. The fact that the leading eigenvector does not look like a pulse is not problematic; it simply implies that, for this dataset, most of the available information is related to arrival time. Our PCA method independently determines the corrections that we typically apply in our standard optimal filtering processing, but without a priori knowledge of these variations and without being limited to those assumptions.

Fig. 5

The projection of each pulse onto eigenvector 3 is plotted against the projection of each pulse onto eigenvector 4. The arrows indicate, as labeled, the constant energy and the perpendicular delta E vector. We also label the two Mn K\(\alpha \) lines (Color figure online)

In Fig. 3, we show the spectrum of eigenvalues. It is clear that the vast majority of the magnitude in the covariance matrix comes from the first three or four eigenvectors. We therefore designate these as our principal components. Thus, for this example, each X-ray can now be described by four numbers: its first four components in the eigenvector basis. With the knowledge of these principal components of the pulse, the next step is to understand how these components, or combinations of them, correlate with energy. This allows us to determine more accurately how the information from the covariance matrix can be used to improve the characterization of photon energy.
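One simple way to turn "the vast majority of the magnitude" into a number is to keep the leading eigenvalues until they account for a chosen fraction of the total. The sketch below is only a heuristic under assumed values: the function name `n_components_for_fraction`, the thresholds, and the example eigenvalue spectrum are all hypothetical, and this is not a prescription established in this work.

```python
import numpy as np

def n_components_for_fraction(eigvals, fraction=0.99):
    """Smallest number of leading eigenvalues whose sum reaches `fraction` of the total."""
    # Clip tiny negative values that can arise from round-off, then sort descending.
    vals = np.sort(np.clip(np.asarray(eigvals, dtype=float), 0.0, None))[::-1]
    cumulative = np.cumsum(vals) / vals.sum()
    # searchsorted returns the first index at which cumulative >= fraction.
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, vals.size)

# Made-up eigenvalue spectrum dominated by its first four entries.
example_eigvals = np.array([50.0, 30.0, 15.0, 4.0, 0.5, 0.3, 0.2])
k98 = n_components_for_fraction(example_eigvals, fraction=0.98)   # 4 components
k90 = n_components_for_fraction(example_eigvals, fraction=0.90)   # 3 components
```

A variance-fraction cut of this kind says nothing about which components carry energy information rather than, say, arrival-time information, so it can only bound the number of components to inspect.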

Figure 5 shows projection 3 versus projection 4; that is, we take the data from the right panel of eigenvector 3 (in Fig. 4) and plot it against the data from the right panel of eigenvector 4. From the correlations in this plot, we determine that the two over-densities of points trace diagonal lines that lie along the direction of constant energy. This indicates that they are consistent with the two Mn K\(\alpha \) lines. We can therefore optimize the energy resolution of our system by determining what combination of these eigenvectors gives the maximum separation in the Mahalanobis [6] coordinate space, a space that provides a relative measure of a data point’s distance from a common point. When the eigenvectors are the basis set, as is the case here, the Mahalanobis distance is simply the Euclidean distance. We illustrate this maximization of the separation with two arrows. The constant E vector represents the direction of constant energy, and the delta E vector represents the perpendicular direction. Since we know the energy difference between these two lines, we can use this information to calibrate our gain scale. When a broad spectrum of energies needs to be measured, including regions of continuum, it is necessary to calibrate the pulse response as a function of energy in separate measurements, with increments small enough that pulse-shape variations can be accurately approximated by interpolation. A study of the response of a detector as a function of energy, such as from the 3 eV separations that can occur when absorbing pulsed photons from a laser diode [7], could be ideal for this. This type of calibration allows us to know the direction of increasing energy and the surfaces of constant energy. In general, this analytic approach rigorously optimizes the energy resolution from the study of one projection if the variation in pulse data is purely due to energy deposited in the detector. When there are other contributions to the pulse variation, as in the dataset here, it is more complicated to determine the optimal combination of projections, and a rigorous generalized procedure will require further research.
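As a rough illustration of the geometry described above, the sketch below estimates a delta E direction in the plane of two projections and calibrates a gain from the known line separation. It is a simplified stand-in, not the procedure used for Fig. 5: it assumes the scatter along the constant-energy direction is much larger than the separation between the lines, takes the delta E axis as the direction of least variance in the plane, and splits the two lines with a basic one-dimensional two-means step. The function names, the synthetic data, and the separation of roughly 11.1 eV assumed for Mn K\(\alpha \)1 and K\(\alpha \)2 are all assumptions of this example.

```python
import numpy as np

def delta_e_axis(p_a, p_b):
    """Unit vector of least variance in the (p_a, p_b) plane.

    Assumes the points form parallel streaks whose length dominates the scatter,
    so the least-variance direction is perpendicular to the constant-energy lines.
    """
    pts = np.column_stack([p_a, p_b])
    pts = pts - pts.mean(axis=0)
    _, vecs = np.linalg.eigh(pts.T @ pts)   # eigenvalues in ascending order
    return vecs[:, 0]

def two_line_gain(u, known_separation_ev, n_iter=50):
    """Split 1-D values into two groups (two-means) and return eV per unit of u."""
    lo, hi = u.min(), u.max()
    for _ in range(n_iter):
        upper = np.abs(u - hi) < np.abs(u - lo)
        lo, hi = u[~upper].mean(), u[upper].mean()
    return known_separation_ev / (hi - lo)

# Synthetic projections (hypothetical): two parallel streaks, one unit apart.
rng = np.random.default_rng(2)
n = 2000
angle = 0.6
const_dir = np.array([np.cos(angle), np.sin(angle)])    # constant-energy direction
de_dir = np.array([-np.sin(angle), np.cos(angle)])      # perpendicular direction
along = 5.0 * rng.standard_normal(n)                    # spread along each streak
line = (rng.random(n) < 2.0 / 3.0).astype(float)        # which of the two lines
across = line * 1.0 + 0.1 * rng.standard_normal(n)
pts = along[:, None] * const_dir + across[:, None] * de_dir
p3, p4 = pts[:, 0], pts[:, 1]

axis = delta_e_axis(p3, p4)
u = p3 * axis[0] + p4 * axis[1]           # coordinate along the delta E direction
gain = two_line_gain(u, known_separation_ev=11.1)
```

On real data with additional sources of pulse variation, the least-variance direction need not coincide with the delta E direction, which is the complication noted at the end of the paragraph above.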

Fig. 6

(left) Al K\(\alpha \) spectrum determined using a standard optimal filter. (right) For the same device, Mn K\(\alpha \) spectrum processed using a standard optimal filter. The red lines indicate data: the center of each line indicates the number of counts, and its length indicates the statistical error. The light blue line represents the intrinsic line shape, and the dark blue line represents a fit to the data, which is the convolution of the intrinsic line shape with a Gaussian of the FWHM stated above each panel. Bottom panels show residuals from the fit (Color figure online)

Fig. 7

(left) Pulse shapes for Al K\(\alpha \) and Mn K\(\alpha \), illustrating a saturated response for the higher energy. (right) Mn K\(\alpha \) spectrum processed using the PCA technique, showing dramatic improvement in FWHM. This spectrum uses the same data as is shown in Fig. 6 (right) (Color figure online)

3 PCA Method Improves Measured Resolution

Here we compare the results of PCA analysis with basic optimal filtering on one particular dataset [2]. This device demonstrated a full-width at half maximum (FWHM) of 0.9 eV at 1.5 keV, using an optimal filter for the processing (Fig. 6 left). For the same device and processing technique, but constructing the filter from the 6 keV pulse shape and assuming a linear response, we achieve a FWHM of 3.2 eV at 6 keV (Fig. 6 right). Figure 7 left shows the two pulse shapes for the different energies, illustrating a saturated response for the higher energy. Figure 7 right uses the same 6 keV data but processed with PCA. In this example, the PCA method demonstrates a factor-of-two improvement over optimal filter techniques, yielding a FWHM of 1.6 eV. We note that this result is the best case that we have measured. The level of improvement observed will depend upon how strongly the assumptions of optimal filter techniques are violated. For example, reprocessing the 1.5 keV data with the PCA method did not improve the measured resolution. Because one does not know a priori whether the assumptions required for optimal filter methods are violated, methods such as ours that provide greater flexibility seem to be necessary to recover the best energy resolution.

4 Conclusion

We have demonstrated that using a PCA method to analyze data from X-ray pulses on TESs can significantly improve the energy resolution, and we have outlined how this procedure can be implemented. To make this procedure broadly useful, it will be necessary to develop prescriptions for selecting an appropriate number of principal components and for weighting the components to provide the best possible energy estimator. Future efforts will focus on automating these methods, to facilitate implementation in real instruments, and on determining how they would be implemented in a real-time processing situation. This general approach has great potential for yielding improved spectroscopy in the future.