1 Introduction

Photometric stereo is effective for the detailed recovery of three-dimensional (3D) surfaces. Classical photometric stereo methods, originally proposed by Woodham (1980) and Silver (1980), use images captured by a fixed camera under varying lighting directions, which are commonly obtained at different timestamps. Since conventional photometric stereo methods stack grayscale or RGB images with time-multiplexing, the target surface has to remain static over the multiple shots, and the spectral property of the material is omitted from the estimated reflectance.

Fig. 1

Our multispectral photometric stereo setup with 12 narrow-band spectral LEDs placed at different locations. Taking the spectral image observations as input, our method outputs a closed-form unique solution of both surface normal and spatially-varying spectral reflectance for heritage preservation

With multispectral photometric stereo (MPS) Kontsevich et al. (1994), detailed 3D shapes and the corresponding spectral reflectances can be jointly recovered from a one-shot multispectral image via spectral-multiplexing. It is useful not only for obtaining the object's shape for digital heritage preservation Miyazaki et al. (2010) but also for studies based on spectral analysis, such as artwork material identification Picollo et al. (2020) and revealing underdrawings of oil paintings Hain et al. (2003). However, unlike conventional time-multiplexing photometric stereo, MPS remains an ill-posed problem even with a Lambertian assumption. In this paper, we propose a method to make the problem tractable and show that a unique solution can be obtained even for scenes with spatially-varying reflectances. We also assess the proposed method's potential applicability to digital heritage preservation.

An input image for MPS encodes observations under different lighting directions in different spectral bands, conveying information about the surface normals and spectral reflectances. Figure 1 shows our MPS setup, which contains a camera and 12 narrow-band spectral light sources located at different positions. From the input spectral image observations, our goal is to estimate both the object shape and the spectral reflectance simultaneously. However, under the illumination of f spectral lights, there are \(f+2\) unknowns (f for the reflectance of the spectral bands, and 2 for the surface normal). Since only f observations are given for each scene point, MPS is inherently under-constrained.

To make the problem tractable, existing methods use additional priors, e.g., an initial shape Anderson et al. (2011a, 2011b), trained neural networks Ju et al. (2018, 2020a, 2020b), or local smoothness regularization Miyazaki et al. (2019). However, these priors are rather restrictive and may not always comply with the actual scene. Without such priors, existing methods Ozawa et al. (2018); Chakrabarti and Sunkavalli (2016); Silver (1980) provide a unique solution for MPS by assuming the surface spectral reflectance type (SRT) to be gray chromatic or monochromatic with uniform albedo (SRT I and II in Fig. 2). However, these spectral reflectance assumptions are also restrictive for real-world scenes. As shown in Ozawa et al. (2018), Chakrabarti and Sunkavalli (2016), incorrect surface normals are estimated at surface regions with roughly constant chromaticity but continuously changing albedos. Previous methods Ozawa et al. (2018); Chakrabarti and Sunkavalli (2016) also investigated MPS for spatially-varying reflectance (SRT IV) with a relaxation to piece-wise constant chromaticities and albedos. However, the spatial clustering of uniform spectral reflectance regions is not only cumbersome but also fragile to outliers, such as shadows and specular highlights.

Fig. 2

Visualization of four spectral reflectance types (SRT) categorized by the spatial distribution of the chromaticity \(C(\lambda )\) and the albedo \(\rho \). The color maps provide spatial distribution examples of chromaticities and albedos for each SRT in the RGB space. Solid and hollow dots show the spectral reflectances of two scene points at f wavelengths \(\mathbf {r} = [R(\lambda _1), \cdots , R(\lambda _f)]\). This paper presents unique and closed-form solutions for both SRT III and IV

In this paper, we make MPS work well under spatially-varying spectral reflectances. Given a multispectral image under calibrated lighting directions, we first provide a closed-form MPS solution for surfaces with uniform chromaticity but spatially-varying albedos (SRT III in Fig. 2), without relying on any additional priors. We further extend our method to deal with surfaces with spatially-varying chromaticities and albedos (SRT IV in Fig. 2) by additionally calibrating the light spectra and camera spectral sensitivity.

Specifically, for SRT III surfaces, we treat the estimation of spectral reflectance and surface normal as a bilinear optimization problem. We show that the problem can be turned into a homogeneous system of linear equations, where the surface normal and spectral reflectances are jointly estimated. Given observations of SRT IV surfaces with calibrated light spectra and camera spectral sensitivity, we show that closed-form solutions for both surface normal and spectral reflectance can be obtained in a per-pixel manner. We achieve this by expressing the spectral reflectance with linear bases, which are extracted from a material database of bidirectional reflectance distribution functions (BRDFs) Dupuy and Jakob (2018). Unlike previous methods that are restricted to 3 spectral channels Ozawa et al. (2018); Chakrabarti and Sunkavalli (2016); Anderson et al. (2011a, 2011b); Ju et al. (2018), our method allows the use of arbitrarily many spectral channels. As a side benefit of this property, we can also rely on off-the-shelf four- or more-source photometric stereo techniques to deal with outliers, such as shadows and specular highlights, making our methods for both SRT III and IV more robust than existing RGB-based MPS methods.

To summarize, the primary contributions of our work are as follows.

  • We show that MPS for monochromatic surfaces with spatially-varying albedos (SRT III) can be solved in a closed-form without introducing any external priors, and we derive the minimal conditions based on the number of spectral lights and scene points for the problem to have a unique solution.

  • We introduce a basis representation for the spectral reflectance and present a closed-form MPS solution for surfaces with spatially-varying chromaticities and albedos if the light spectra and camera spectral sensitivity are calibrated.

  • Our methods for both SRT III and SRT IV are robust to outliers, such as shadows and specular highlights, because they can take arbitrarily many spectral channels as input, which enables robust estimation.

The preliminary version of this work appeared in Guo et al. (2021) (denoted as “Ours\(_{\mathrm{III}}\)”), which solves MPS for surfaces with a uniform chromaticity but spatially-varying (SV) albedos (SRT III) without additional priors. However, this spectral reflectance type is still too limited to handle the general spectral reflectances found in real scenes. Therefore, this paper extends Guo et al. (2021) by providing a unique and closed-form MPS solution (denoted as “Ours\(_{\mathrm{IV}}\)”) for surfaces with more general SV-chromaticities and albedos (SRT IV). To demonstrate the effectiveness of our new approach, additional experiments on both synthetic and real data are also presented. Specifically, in Sec. 4, we present a new formulation that makes MPS under SRT IV well-posed and convex by introducing a linear basis representation of the inverse spectral reflectances. In Sec. 5, we update the experiments on synthetic data rendered with realistic reflectances to demonstrate the effectiveness of our methods on both SRT III and SRT IV surfaces. In Sec. 6, we evaluate our methods on real captured images of statues and reliefs. In this way, we explore the potential applicability of our MPS methods for both SRTs to heritage preservation.

Table 1 Comparison of MPS methods

2 Related Works

As described in previous works Hernández et al. (2010); Vogiatzis and Hernández (2012), the material spectral reflectance \( R(\lambda ): {\mathbb {R}}_{+} \rightarrow {\mathbb {R}}_{+}\) can be decomposed into two parts: chromaticity \(C(\lambda ) : {\mathbb {R}}_{+} \rightarrow {\mathbb {R}}_{+}\) and albedo \(\rho \in {\mathbb {R}}_{+}\), such that \(R(\lambda ) = C(\lambda )\rho \), where \(\lambda \) represents wavelength. As shown in Fig. 2, based on the spatial distribution of the chromaticity and albedo over a surface, we categorize 4 different surface spectral reflectance types (SRT) and order them from simple to complex. In this section, we introduce existing methods based on their assumptions on SRT and list their properties for comparison in Table 1.

SRT I If the surface has gray chromaticity, i.e., the chromaticity remains constant w.r.t. varying wavelength, MPS is identical to classical photometric stereo. Therefore, given 3 or more spectral bands, a closed-form solution for surface normal can be obtained without ambiguity Silver (1980).

SRT II For monochromatic surfaces with uniform albedo, i.e., all the scene points share the common chromaticity \({\tilde{C}}(\lambda )\) and albedo \({\tilde{\rho }}\), previous methods Drew and Kontsevich (1994); Kontsevich et al. (1994) show that the surface normal can be estimated from a single RGB image up to a rotation ambiguity. The correct rotation was approximated by imposing an additional integrability condition. Hernández et al. (2007) establish a one-to-one linear mapping between pixel measurements and surface normals to reconstruct the deformable cloth shape. This unknown linear mapping is calibrated via a planar board with a cloth sample fixed in the center. If the crosstalk between spectral channels is negligible, existing methods Chakrabarti and Sunkavalli (2016); Ozawa et al. (2018) provide a unique solution for surface normals. However, their methods are restricted to RGB 3-channel input and cannot be expanded to more channels (see the “Appendix”).

SRT III Few methods focus on monochromatic surfaces with spatially-varying albedos, which are commonly seen in natural objects (e.g., wood and rocks) and human skin. Vogiatzis and Hernández (2012) assume the spectral reflectance of the human face follows SRT III and obtain detailed reconstructions of faces in real-time. However, their surface normal estimation results rely on the accuracy of the initial geometry and the detection of equal-albedo pixels.

SRT IV If the chromaticity and albedo are both spatially-varying, MPS from a single multispectral image is ill-posed. Existing methods apply additional regularizations and provide numerical solutions for MPS. Chakrabarti and Sunkavalli (2016) and Ozawa et al. (2018) relax the spatially-varying spectral reflectance to be piece-wise constant. Since their methods are both based on 3-channel RGB inputs, they discretize the spectral reflectance in a 3D space to cluster pixels with equal chromaticities and albedos, turning the problem into a set of SRT II subproblems. The normal map is then estimated in each surface region that is predicted as having the same spectral reflectance. The method by Anderson et al. requires a coarse shape from a depth map Anderson et al. (2011a) or from stereo pairs Anderson et al. (2011b) and uses it to guide the chromaticity segmentation and the surface normal estimation. Similar to Chakrabarti and Sunkavalli (2016), Ozawa et al. (2018), the piece-wise constant spectral reflectance assumption restricts the flexibility of the target surface's reflectance. The normal estimation accuracy is also influenced by the errors introduced by the reflectance clustering step.

Some recent methods directly take an RGB image as input and apply deep neural networks to predict the surface normal Ju et al. (2018, 2020a); Antensteiner et al. (2019). However, the lighting directions are required to be consistent between the training and test procedures. Miyazaki et al. (2019) recover surface normals from a multispectral image with more than three channels. However, their recovered shape tends to be over-smoothed due to the spatial smoothness assumption on both surface normal and the reflectance. Fyffe et al. (2011) assume the spectral reflectance lies in a low-dimensional space and represent it with a statistical basis set. However, their spectral reflectance bases are scene-dependent and need to be calibrated with the known surface normal and reflectance pairs. Besides, the optimization of this method is non-convex and requires a good initialization.

Our method Taking a multispectral image with an arbitrary number of channels as input, we first formulate MPS for monochromatic surfaces with spatially-varying albedos (SRT III) as a well-posed problem and estimate the surface normal without introducing external priors Guo et al. (2021). We further show that MPS under SRT IV can be made tractable if the light spectra and camera spectral sensitivity are calibrated. Different from existing works Chakrabarti and Sunkavalli (2016); Ozawa et al. (2018); Anderson et al. (2011b), we avoid both the piece-wise uniform spectral reflectance restriction and the reflectance clustering steps by introducing a basis representation of the per-pixel spectral reflectance. Compared with Fyffe et al. (2011), our formulations for both SRT III and SRT IV are convex, and our extracted spectral reflectance bases are shown to be scene-independent in the real data experiments.

3 MPS for Surfaces with SRT III

Given a multispectral camera with a linear radiometric response and f (geometrically but not spectrally) calibrated spectral directional lights, we capture a multispectral image of p scene points on a Lambertian surface by turning on all the spectral lights. If the crosstalk between spectral bands is negligible, i.e., the observation under each spectral light is only observed in its corresponding camera channel, observations \(\mathbf {m}_i \in {\mathbb {R}}_+^{f}\) for the i-th pixel can be written as follows

$$\begin{aligned} \mathbf {m}_i = \mathrm{diag}(\mathbf {t}_i) \{\mathbf {L}\mathbf {n}_i\}_+, \end{aligned}$$
(1)

where \(\mathbf {n}_i \in S^2 \subset {\mathbb {R}}^3\) represents the unit surface normal vector and \(\mathbf {L} \in {\mathbb {R}}^{f \times 3}\) stacks all the light directions. We use \(\mathrm{diag}(\cdot )\) as a diagonalization operator and \(\{\cdot \}_+\) as a non-negative operator, which accounts for attached shadows. For simplicity, we omit the operator \(\{\cdot \}_+\) in the following explanation. Here, \(\mathbf {t}_i \in {\mathbb {R}}_{+}^{f}\) is related to the camera spectral sensitivity, the light source spectra, and the surface spectral reflectance at the f spectral bands. Its j-th element is given by

$$\begin{aligned} t_{ij}= \int _{\lambda \in \varOmega _j} E_j(\lambda )R_{i}(\lambda ) S_j(\lambda ) d\lambda , \end{aligned}$$
(2)

where \(\varOmega _j\) is the wavelength range of the j-th spectral band, \(E_j(\lambda ): {\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) denotes the spectrum of the j-th light, \(S_j(\lambda ): {\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) defines the camera spectral sensitivity of the j-th channel, and \(R_i(\lambda ): {\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) is the material spectral reflectance of the i-th scene point. The problem of general MPS is to estimate the \(f+2\) unknowns, including \(\mathbf {t}\) and the surface normal \(\mathbf {n}\), from the f-element measurement vector \(\mathbf {m}\), which is unfortunately ill-posed.
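
For illustration, the following is a minimal numerical sketch of the image formation in Eqs. (1) and (2), written in Python. The Gaussian light spectra and camera sensitivities, the reflectance curve, and the light layout are placeholder assumptions, not the measured quantities of our setup.

```python
import numpy as np

# Minimal sketch of the forward model in Eqs. (1)-(2). The Gaussian spectra,
# sensitivities, reflectance, and light layout are illustrative placeholders.
lam = np.linspace(400.0, 750.0, 351)           # wavelength grid [nm]
f = 12
centers = np.linspace(410.0, 740.0, f)         # narrow-band LED central wavelengths

def gauss(mu, sigma):
    return np.exp(-0.5 * ((lam - mu) / sigma) ** 2)

E = [gauss(mu, 10.0) for mu in centers]        # light spectra E_j(lambda)
S = [gauss(mu, 15.0) for mu in centers]        # camera sensitivities S_j(lambda)
R_i = 0.3 + 0.5 * gauss(560.0, 80.0)           # reflectance R_i(lambda) of one point

# Eq. (2): per-band factor t_ij as a numerical integral over each band.
t_i = np.array([np.trapz(E[j] * R_i * S[j], lam) for j in range(f)])

# Synthetic lights on the upper hemisphere and one unit surface normal.
rng = np.random.default_rng(0)
az = rng.uniform(0.0, 2.0 * np.pi, f)
el = rng.uniform(np.deg2rad(45.0), np.deg2rad(80.0), f)
L = np.stack([np.cos(el) * np.cos(az),
              np.cos(el) * np.sin(az),
              np.sin(el)], axis=1)
n_i = np.array([0.2, -0.1, 1.0])
n_i /= np.linalg.norm(n_i)

m_i = t_i * np.clip(L @ n_i, 0.0, None)        # Eq. (1): diag(t_i) {L n_i}_+
```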

We make MPS well-posed by assuming the surface follows SRT III: the material spectral reflectance can be decomposed into a uniform chromaticity \({\tilde{C}}(\lambda )\) and spatially-varying albedos \(\rho _i\), such that

$$\begin{aligned} R_{i}(\lambda ) = \rho _i {\tilde{C}}(\lambda ). \end{aligned}$$
(3)

Combining Eqs. (2) and (3), we rewrite the spectral image observations for a scene point on an SRT III surface as

$$\begin{aligned} \mathbf {m}_i = \mathrm{diag}(\mathbf {q})\rho _i\mathbf {L}\mathbf {n}_i, \end{aligned}$$
(4)

where \(\mathbf {q} \in {\mathbb {R}}^f_+\) is the uniform reflectance devoid of spatially-varying albedos, whose elements are

$$\begin{aligned} q_{j}= \int _{\lambda \in \varOmega _j} E_j(\lambda ){\tilde{C}}(\lambda ) S_j(\lambda ) d\lambda . \end{aligned}$$
(5)

With the uniform chromaticity \({\tilde{C}}(\lambda )\), \(\mathbf {q}\) remains constant over the surface since both the light spectra and the camera spectral sensitivity are independent of the scene points. For SRT III surfaces, we find that the minimal conditions to yield a unique MPS solution for the surface normal are as follows.

Theorem 1

Given f spectral observations under varying lighting directions of p scene points known to share the same chromaticity \({\tilde{C}}(\lambda )\), their surface normals can be uniquely determined if either one of the minimal conditions for the number of lightings and pixels is satisfied:

  • Minimal pixel condition (MPC): \(p=2, f \ge 5\),

  • Minimal lighting condition (MLC): \(f=4, p \ge 3\).

In other words, if two scene points share the same chromaticity but varying surface normals, their surface normals can be uniquely determined given 5 or more lighting directions. On the other hand, if we know 3 or more scene points sharing the same chromaticity and their surface normals are non-coplanar, we can recover their normal directions with 4 or more spectral light sources. In the following subsections, we present the unique solution for SRT III and provide the proof for minimal solvable conditions MPC and MLC.

3.1 Unique Solution for SRT III

For a surface with p scene points sharing the same chromaticity, representing all pixels and lighting directions in matrix form, we rewrite Eq. (4) as

$$\begin{aligned} \mathbf {M} = \mathbf {Q}\mathbf {L}\mathbf {N}^\top \mathbf {P}, \end{aligned}$$
(6)

where \(\mathbf {Q} = \mathrm{diag}(\mathbf {q})\) is an \(f \times f\) diagonal matrix, \(\mathbf {M} \in {\mathbb {R}}_{+}^{f \times p}\) records the image observations of the p scene points under the f lights, \(\mathbf {N} \in {\mathbb {R}}^{p \times 3}\) stacks all the surface normals in a row-wise manner, and \(\mathbf {P}\) is a \(p \times p\) diagonal matrix whose diagonal elements are the pixel-wise spatially-varying albedos.

The above spectral image formation model has a similar structure with semi-calibrated photometric stereo (SCPS) Cho et al. (2018). However, the task and physical image formation model between SCPS Cho et al. (2018) and our method for SRT III are different. SCPS Cho et al. (2018) denotes \(\mathbf {q}\) as light intensities and aims at solving conventional photometric stereo without calibrating the light intensity, whereas ours focuses on the use of relatively general reflectance assumption (SRT III) and multispectral image cues to formulate MPS as a well-posed problem without additional priors. The unknown \(\mathbf {q}\) in our method encodes the integral of the light spectra, camera spectral sensitivity, and the chromaticity shared by the scene points, as shown in Eq. (5), which is different from the light intensity notation in SCPS Cho et al. (2018).

Given image observations \(\mathbf {M}\) and the calibrated lighting directions \(\mathbf {L}\), we recover uniform reflectance devoid of albedos \(\mathbf {Q}\), surface normal \(\mathbf {N}\), and albedo \(\mathbf {P}\) by minimizing the following energy function:

$$\begin{aligned} \begin{aligned} \{\mathbf {Q}^*, \mathbf {N}^*, \mathbf {P}^*\}&= \mathop {\mathrm{argmin}}\limits _{\mathbf {Q}, \mathbf {N}, \mathbf {P}} \left\| \mathbf {M} - \mathbf {Q}\mathbf {L}\mathbf {N}^\top \mathbf {P} \right\| _F^2, \end{aligned} \end{aligned}$$
(7)

where \(\Vert \cdot \Vert _F\) denotes the Frobenius norm. We define \(\mathbf {B} = \mathbf {P}^\top \mathbf {N} \in {\mathbb {R}}^{p \times 3}\) as albedo-scaled surface normals. Here, \(\mathbf {Q}\) is invertible since its diagonal elements are non-zero. Then we rewrite Eq. (6) as

$$\begin{aligned} \mathbf {Q}^{-1}\mathbf {M} - \mathbf {L}\mathbf {B}^\top = \mathbf {0}. \end{aligned}$$
(8)

After vectorizing the unknown parameters \(\mathbf {Q}^{-1}\) and \(\mathbf {B}^{\top }\), we obtain

$$\begin{aligned} \begin{aligned}&(\mathbf {I}_p \otimes \mathbf {L}) \mathrm{vec}(\mathbf {B}^\top ) \\&\quad - [\mathrm{diag}(\mathbf {m}_1) \cdots \mathrm{diag}(\mathbf {m}_p)]^\top \mathbf {Q}^{-1}\mathbf {1}= \mathbf {0}, \end{aligned} \end{aligned}$$
(9)

where \(\mathrm{vec}(\cdot )\) and \(\otimes \) represent the vectorization and Kronecker product operators, respectively, \(\mathbf {I}_p \in {\mathbb {R}}^{p \times p}\) is an identity matrix, \(\mathbf {1}\) is an all-one f-dimensional vector, and \(\mathbf {m}_i\) is the i-th column vector of the image observations \(\mathbf {M}\), i.e., the measurements at the i-th pixel position.

By concatenating all unknowns of Eq. (9) into a vector, we obtain a homogeneous system of linear equations:

$$\begin{aligned} \underbrace{ \left[ -\mathbf {I}_p \otimes \mathbf {L} | [\mathrm{diag}(\mathbf {m}_1)| \cdots |\mathrm{diag}(\mathbf {m}_p)]^\top \right] }_{\mathbf {D}} \underbrace{ \left[ \begin{array}{l} \mathrm{vec}(\mathbf {B}^\top ) \\ \mathbf {Q}^{-1}\mathbf {1} \end{array} \right] }_{\mathbf {x}} = \mathbf {0}, \end{aligned}$$
(10)

where \(\mathbf {D} \in {\mathbb {R}}^{pf \times (3p + f)}\), and the unknown vector \(\mathbf {x}\) has dimension \(3p + f\). If \(\mathbf {D}\) has a one-dimensional right nullspace, the solution \(\mathbf {x}\) is obtained up to scale via a factorization of \(\mathbf {D}\) by singular value decomposition (SVD). Based on the prior knowledge that the surface normal has a unit norm, we normalize the albedo-scaled surface normals \(\mathbf {B}\) in \(\mathbf {x}\) to finally obtain a unique surface normal estimate.
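
For illustration, the following Python sketch builds \(\mathbf {D}\) of Eq. (10) from synthetic, noise-free SRT III data and recovers the surface normals from its one-dimensional right nullspace; the data generation and variable names are illustrative assumptions, not our released implementation.

```python
import numpy as np

# Sketch of the SRT III solver in Eq. (10) on synthetic, noise-free data:
# build D, take its 1-D right nullspace by SVD, and read off the albedo-scaled
# normals B and the channel factors Q^{-1}1.
rng = np.random.default_rng(1)
f, p = 12, 50                                    # satisfies Eq. (11): pf >= 3p + f - 1

az = rng.uniform(0.0, 2.0 * np.pi, f)
el = rng.uniform(np.deg2rad(45.0), np.deg2rad(80.0), f)
L = np.stack([np.cos(el) * np.cos(az),
              np.cos(el) * np.sin(az),
              np.sin(el)], axis=1)               # light directions, f x 3

n_az = rng.uniform(0.0, 2.0 * np.pi, p)
n_tilt = rng.uniform(0.0, np.deg2rad(30.0), p)   # normals near the camera axis (no shadows)
N = np.stack([np.sin(n_tilt) * np.cos(n_az),
              np.sin(n_tilt) * np.sin(n_az),
              np.cos(n_tilt)], axis=1)
rho = rng.uniform(0.3, 1.0, p)                   # spatially-varying albedos
q = rng.uniform(0.2, 1.0, f)                     # uniform reflectance factor, Eq. (5)
M = np.diag(q) @ L @ N.T @ np.diag(rho)          # observations, Eq. (6)

D = np.zeros((p * f, 3 * p + f))                 # Eq. (10)
for i in range(p):
    D[i * f:(i + 1) * f, 3 * i:3 * i + 3] = -L
    D[i * f:(i + 1) * f, 3 * p:] = np.diag(M[:, i])

_, _, Vt = np.linalg.svd(D)
x = Vt[-1]                                       # nullspace vector, up to scale and sign
if x[3 * p:].mean() < 0:                         # Q^{-1}1 must be positive
    x = -x

B = x[:3 * p].reshape(p, 3)                      # row i is the albedo-scaled normal b_i
N_est = B / np.linalg.norm(B, axis=1, keepdims=True)
err = np.rad2deg(np.arccos(np.clip(np.sum(N_est * N, axis=1), -1.0, 1.0)))
print("mean angular error [deg]:", err.mean())   # ~0 on noise-free data
```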

3.2 Minimal Conditions for a Unique Solution

As discussed before, to obtain a non-trivial solution of the homogeneous system in Eq. (10), the right nullspace of \(\mathbf {D}\) should be one-dimensional. Therefore, we have

$$\begin{aligned} pf \ge 3p+f-1. \end{aligned}$$
(11)

This solvable condition can be interpreted in another way. Given p pixels observed under f spectral bands, the total number of measurements is pf. Since we assume a monochromatic surface with spatially-varying albedos, we only need to know the uniform reflectance devoid of albedos \(\mathbf {q}\) for one reference pixel, which accounts for f unknowns. For the remaining \((p-1)\) pixels, we need to know their relative albedos, adding \((p-1)\) unknowns. Besides, the surface normal of each pixel has 2 degrees of freedom, giving 2p unknowns in total for the surface normals. The total number of unknowns is therefore \(f + (p-1) + 2p = 3p + f - 1\). Since the number of measurements must be no less than the number of unknowns, we obtain the minimal solvable condition of Eq. (11).

To further analyze the minimal requirements for the number of lighting directions and pixels, we rearrange Eq. (11) into \(pf - 3p - f + 3 \ge 2\) and factor the left-hand side, which gives

$$\begin{aligned} (f-3)(p-1) \ge 2. \end{aligned}$$
(12)

Therefore, the minimal requirements for the number of input lighting directions and pixels to obtain a unique solution for SRT III surfaces are

$$\begin{aligned} \left\{ \begin{array}{lr} p=2, f \ge 5, &{} \\ f=4, p \ge 3, &{} \end{array} \right. \end{aligned}$$
(13)

which correspond to MPC and MLC in Theorem 1.

4 MPS for Surfaces with SRT IV

As discussed in the previous sections, general MPS for a surface with spatially-varying reflectance (both chromaticities and albedos) is ill-posed. In this section, we show that MPS under SRT IV becomes tractable if the light sources' spectra E and the camera spectral sensitivity S are calibrated in the form of a vector of their products \(\mathbf {e} = [E_1(\lambda _1)S_1(\lambda _1), \cdots , E_f(\lambda _f)S_f(\lambda _f)]^\top \) for the f distinct spectral bands. Denoting the material reflectances at the corresponding spectral bands as \(\mathbf {r} = [R(\lambda _1), \cdots , R(\lambda _f)]^\top \), the image formation model for a pixel under f lights can be written as

$$\begin{aligned} \mathbf {m} = \mathrm{diag}(\mathbf {e})\mathrm{diag}(\mathbf {r})\mathbf {L}\mathbf {n}. \end{aligned}$$
(14)

Given the calibrated \(\mathbf {e}\), we compute the normalized image observations \(\hat{\mathbf {m}}\) for a pixel by \(\hat{\mathbf {m}} = \mathbf {m} \oslash \mathbf {e}\), where \(\oslash \) denotes element-wise division. Then MPS for the SRT IV surface can be formulated as a bilinear optimization of per-pixel surface normal \(\mathbf {n}\) and material spectral reflectance \(\mathbf {r}\):

$$\begin{aligned} \begin{aligned} \{\mathbf {n}^*, \mathbf {r}^*\}&= \mathop {\mathrm{argmin}}\limits _{\mathbf {n}, \mathbf {r}} \left\| \hat{\mathbf {m}} - \mathrm{diag}(\mathbf {r})\mathbf {L}\mathbf {n}\right\| _2^2. \end{aligned} \end{aligned}$$
(15)

The problem still has f constraints with \(f+2\) unknowns. We now show how it can be made tractable by introducing a basis representation of the material reflectance in the next subsection.

4.1 Unique Solution for SRT IV

To reduce the number of unknowns in Eq. (15) and make the problem well-posed, a more compact representation for the spectral reflectance \(\mathbf {r}\) is needed.

We assume the spectral reflectance \(\mathbf {r}\) is non-zero everywhere and define the inverse spectral reflectance as \(\hat{\mathbf {r}} = \mathbf {1} \oslash \mathbf {r}\). Then the normalized image observations for one pixel satisfy

$$\begin{aligned} \mathrm{diag}(\hat{\mathbf {m}})\hat{\mathbf {r}} - \mathbf {Ln} = \mathbf {0}. \end{aligned}$$
(16)

In this expression, the inverse spectral reflectance \(\hat{\mathbf {r}}\) lies in an f-dimensional space. We approximate it with \(k~(< f)\) independent linear basis vectors to reduce the number of unknowns, i.e.,

$$\begin{aligned} \hat{\mathbf {r}} = \mathbf {Bc}, \end{aligned}$$
(17)

where \(\mathbf {B} \in {\mathbb {R}}^{f \times k}\) is a basis matrix stacking the k basis vectors, and \(\mathbf {c} \in {\mathbb {R}}^{k}\) contains the unknown basis coefficients. Combining Eqs. (16) and (17), we formulate the bilinear optimization of Eq. (15) as a homogeneous linear system,

$$\begin{aligned} \underbrace{ \left[ -\mathbf {L} | \mathrm{diag}(\hat{\mathbf {m}})\mathbf {B} \right] }_{\mathbf {A}} \underbrace{ \left[ \begin{array}{l} \mathbf {n} \\ \mathbf {c} \end{array} \right] }_{\mathbf {y}} = \mathbf {0}, \end{aligned}$$
(18)

where \(\mathbf {A} \in {\mathbb {R}}^{f \times (3 + k)}\), and \(\mathbf {y}\) has dimension \(3 + k\). Similar to Ours\(_{\mathrm{III}}\) discussed in Sec. 3, if \(\mathbf {A}\) has a one-dimensional right nullspace, we can obtain a unique solution \(\mathbf {y}\) up to scale by SVD. The estimate of \(\mathbf {y}\) is chosen as the right-singular vector corresponding to the smallest singular value of \(\mathbf {A}\). By incorporating the unit-norm constraint on the surface normal, we can finally resolve the scale ambiguity and uniquely obtain the per-pixel surface normal and spectral reflectance.
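
A per-pixel Python sketch of this solution is given below, assuming the normalized observations \(\hat{\mathbf {m}}\), the light matrix \(\mathbf {L}\), and a precomputed basis matrix (denoted Bk) are available; the sign is fixed by the positivity of the inverse reflectance and the scale by the unit-norm constraint, as described above. The function name and interface are ours.

```python
import numpy as np

# Per-pixel sketch of Eq. (18): given the normalized observations m_hat, the
# light matrix L (f x 3), and an inverse-reflectance basis Bk (f x k), recover
# the normal and the spectral reflectance (up to the global calibration scale).
def solve_pixel_srt4(m_hat, L, Bk):
    A = np.concatenate([-L, np.diag(m_hat) @ Bk], axis=1)   # A in Eq. (18)
    _, _, Vt = np.linalg.svd(A)
    y = Vt[-1]                                              # nullspace, up to scale/sign
    n, c = y[:3], y[3:]
    if (Bk @ c).mean() < 0:                                 # inverse reflectance is positive
        n, c = -n, -c
    scale = np.linalg.norm(n)
    n = n / scale                                           # enforce the unit-norm normal
    r = 1.0 / (Bk @ (c / scale))                            # r = 1 / (B c), Eq. (17)
    return n, r
```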

4.2 Spectral Reflectance Basis Extraction

Previous methods conduct linear analysis on the MERL BRDF dataset Matusik (2003) and express reflectances by a small number of coefficients associated with basis vectors. However, their extracted bases are not suitable for MPS because the spectral information is omitted. In this paper, we provide spectral reflectance bases extracted from a spectral BRDF database.

Dupuy and Jakob (2018) provided a measured spectral BRDF dataset of 62 materials with 195 equi-spaced spectral bins covering the \(360 \sim 1000\) [nm] range. For each material, spectral responses for 8192 incident-outgoing direction samples are provided. Since we assume the Lambertian model, the spectral reflectances of the 8192 directional samples for one material are treated as those of 8192 independent Lambertian materials. By stacking the spectral responses of all materials at one wavelength as a row vector, we build a spectral material database \(\mathbf {G} \in {\mathbb {R}}^{195 \times 507904 (=62 \times 8192)}\).

With the wavelengths of f spectral lights calibrated, we obtain the corresponding spectral material database \(\tilde{\mathbf {G}} \in {\mathbb {R}}^{f \times 507904}\) by sampling the rows of \(\mathbf {G}\). To extract bases for the inverse spectral reflectance \(\hat{\mathbf {r}}\), we remove the materials with near-zero spectral responses at any of the f wavelengths in \(\tilde{\mathbf {G}}\) and conduct SVD on \(\hat{\mathbf {G}} = \mathbf {1} \oslash \tilde{\mathbf {G}}\) as

$$\begin{aligned} \hat{\mathbf {G}} = \mathbf {U} {\varvec{\Sigma }} \mathbf {V}^\top , \end{aligned}$$
(19)

where \(\mathbf {U}\) and \(\mathbf {V}\) contain the left and right orthogonal singular vectors, and \({\varvec{\Sigma }}\) is an \(f \times f\) diagonal matrix containing the singular values in descending order. The column vectors of \(\mathbf {U}\) provide orthogonal bases for the inverse spectral reflectance \(\hat{\mathbf {r}}\).

Determining the number of bases Following the Eckart-Young theorem Johnson (1963), we select the first k columns of \(\mathbf {U}\) as the basis matrix \(\mathbf {B} \in {\mathbb {R}}^{f \times k}\) to approximate the inverse reflectance \(\hat{\mathbf {r}}\). To obtain a non-trivial solution of Eq. (18), the number of independent basis vectors k should be selected so that \(\mathbf {A} \in {\mathbb {R}}^{f \times (k+3)}\) has a one-dimensional right nullspace. Therefore, the rank of \(\mathbf {A}\) should satisfy

$$\begin{aligned} \mathrm{rank}(\mathbf {A}) = k + 2 < f. \end{aligned}$$
(20)

We calculate the numerical rank of \(\mathbf {A}\) following the threshold strategy suggested in William et al. (2007) and iteratively increase the number of bases in \(\mathbf {B}\) from 1 to \(f-3\) until \(\mathbf {A}\) satisfies the rank requirement. Since our basis extraction is based on the measured spectral BRDF dataset Dupuy and Jakob (2018), which contains diverse real-world spectral reflectance candidates, the obtained basis \(\mathbf {B}\) is expected to fit diverse scenes, as we will demonstrate in the real data experiments.
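
The following Python sketch outlines the basis extraction and the iterative choice of k; here G_tilde stands for the database sampled at the calibrated wavelengths, and the relative tolerance is a simple placeholder for the numerical-rank threshold of William et al. (2007).

```python
import numpy as np

# Sketch of the basis extraction in this subsection. G_tilde (f x num_samples)
# is the spectral database sampled at the f calibrated wavelengths; the rank
# tolerance is a placeholder for the threshold of William et al. (2007).
def extract_inverse_reflectance_bases(G_tilde, eps=1e-3):
    valid = np.all(G_tilde > eps, axis=0)          # drop near-zero spectral responses
    G_inv = 1.0 / G_tilde[:, valid]                # inverse spectral reflectances
    U, _, _ = np.linalg.svd(G_inv, full_matrices=False)
    return U                                       # columns are orthogonal bases

def choose_num_bases(U, L, m_hat, rel_tol=1e-6):
    f = U.shape[0]
    for k in range(1, f - 2):                      # try k = 1, ..., f-3
        A = np.concatenate([-L, np.diag(m_hat) @ U[:, :k]], axis=1)
        s = np.linalg.svd(A, compute_uv=False)
        if np.sum(s > rel_tol * s[0]) == k + 2:    # rank(A) = k + 2 < f, Eq. (20)
            return k
    return f - 3                                   # fallback choice (ours)
```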

5 Experiments on Synthetic Data

We here introduce experimental results on synthetic datasets. We first describe the details of synthetic data creation and the baseline settings. Then we compare Ours\(_{\mathrm{III}}\) and Ours\(_{\mathrm{IV}}\) with the existing MPS methods. Please check the electronic version for the experimental results in color.

Fig. 3

Synthetic multispectral image rendering of a Bunny surface. The measured spectral BRDF “paper_yellow” roughly follows SRT III since its spectral response \(R(\lambda )\) under varying groups of surface normal and light directions can be represented by a common chromaticity \(C(\lambda )\) with varying scales (albedos) (Color figure online)

5.1 Experimental Settings

Synthetic dataset In our previous work Guo et al. (2021), we verified that Ours\(_{\mathrm{III}}\) can accurately recover the surface normal on synthetic surfaces rendered with ideal SRT III reflectances. This paper provides a more realistic synthetic dataset with measured spectral reflectances. Similar to the synthetic shape and lighting direction distribution in Guo et al. (2021), we choose Bunny as our target shape and regularly sample 24 synthetic light directions on a hemisphere with elevation angles larger than \(45^\circ \). The light spectra of the LEDs are narrow-band, with central wavelengths distributed evenly in the range of \(400 \sim 750\) [nm].

To render the reflectance with SRT III, we choose the measured spectral BRDF “paper_yellow” Dupuy and Jakob (2018), whose appearance is visualized under natural illumination in Fig. 3. As shown in the middle row of the figure, we plot part of the spectral reflectance curves \(R(\lambda )\) of the material under varying groups of surface normals and light directions. It is clear that most reflectance curves can be approximated by scaling the thick yellow curve labeled as chromaticity \(C(\lambda )\), except for a few curves. Therefore, surfaces rendered with “paper_yellow” roughly have a uniform chromaticity but spatially-varying albedos (SRT III). Following the above rendering setting, we generate a synthetic multispectral image with 24 channels. The observations under LEDs 1, 11, and 23 are visualized in the bottom row.

To render the reflectance with SRT IV, we select 4 different measured spectral BRDFs, as shown in Fig. 4. The material distribution labels at the top left indicate which BRDF is applied to each region of the Bunny surface. We render a synthetic multispectral image under the 24 lights and visualize it by concatenating the spectral channels illuminated by LEDs 1, 11, and 23, as shown at the bottom left of the figure.

Fig. 4

Synthetic rendering for the SRT IV surface. The spectral reflectance contains 4 materials as labeled by the material distribution mask. The material appearances are visualized under natural illumination Dupuy and Jakob (2018)

Baselines As baselines, we selected two state-of-the-art MPS methods: CS16 Chakrabarti and Sunkavalli (2016) and OS18 Ozawa et al. (2018), where we implemented OS18 Ozawa et al. (2018) and used the released code of CS16 Chakrabarti and Sunkavalli (2016) for evaluation. Since both methods take a 3-channel (i.e., RGB) image as input, we selected 3 out of the 24 spectral observations to mimic a 3-channel input image, as shown in Fig. 3. To verify the MLC, we tested our method for SRT III surfaces using the spectral channels recording the observations under LEDs 1, 11, 21, and 23, which cover the observations used in OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016) for comparison. The number of piece-wise constant chromaticities needs to be set manually in CS16 Chakrabarti and Sunkavalli (2016). To make a fair comparison, we set the number of chromaticities to 1 and evaluate their method, Ours\(_{\mathrm{III}}\), and the SRT II module of OS18 Ozawa et al. (2018) in the experiments on SRT II and III surfaces. When making comparisons on SRT IV surfaces, we use the default of 100 chromaticity clusters in CS16 Chakrabarti and Sunkavalli (2016) and compare it with Ours\(_{\mathrm{IV}}\) and the SRT IV module of OS18 Ozawa et al. (2018). Besides, in the synthetic experiments, we remove the materials used in the test data from the spectral reflectance database when extracting the bases for Ours\(_{\mathrm{IV}}\).

In the following, Ours\(_{\mathrm{III}}\) and Ours\(_{\mathrm{IV}}\) are given observations under all 24 lights by default. Ours\(_{\mathrm{III}}(f_4)\) denotes our method for SRT III surfaces under MLC.

Fig. 5

Surface normal estimation results for an SRT III surface shown in Fig. 3

5.2 Surface Normal Estimation Under SRT III

Using the ground-truth surface normal, we evaluate the surface normal estimation accuracy by the mean angular error (MAE) in degrees. Figure 5 shows the surface normal estimation results for a synthetic SRT III surface. Ours\(_{\mathrm{III}}\) achieves the smallest angular error among all methods. The estimation errors of OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016) are mainly caused by their SRT II assumption and by shadows. Also, the local polynomial shape regularization used in CS16 Chakrabarti and Sunkavalli (2016) introduces additional errors in regions with large surface normal variations. Ours\(_{\mathrm{III}}(f_4)\) under MLC is less accurate than Ours\(_{\mathrm{III}}\) due to the influence of shadows. However, compared with OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016), Ours\(_{\mathrm{III}}(f_4)\) achieves higher accuracy with only one additional spectral observation appended to the input. This result demonstrates the effectiveness of our method on SRT III surfaces. In this setting, Ours\(_{\mathrm{IV}}\) is less accurate than Ours\(_{\mathrm{III}}\), since its more flexible representation is unnecessary for this restricted setting.

Fig. 6

Surface normal estimation comparison on the SRT IV surface shown in Fig. 4

5.3 Surface Normal Estimation Under SRT IV

Figure 6 shows the surface normal estimation results for a surface with spatially-varying spectral reflectance (SRT IV). Ours\(_{\mathrm{IV}}\) can handle spatially-varying chromaticities and albedos, therefore producing more accurate surface normal recovery than Ours\(_{\mathrm{III}}\), which assumes uniform chromaticity. Compared to OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016), Ours\(_{\mathrm{IV}}\) obtains the smallest angular error since we neither assume piece-wise constant spectral reflectances nor require reflectance clustering. From the error map shown in Fig. 6, the error distribution of Ours\(_{\mathrm{IV}}\) is more uniform and less correlated with the material distribution than those of the other methods. This result shows the strength of Ours\(_{\mathrm{IV}}\) on surfaces with spatially-varying reflectances.

6 Real-World Experiment

To assess the effectiveness of the proposed methods, we built a multispectral photometric stereo setup to conduct experiments on real data. To verify the applicability of our methods to e-Heritage, we choose the reliefs and statues shown in Figs. 8 and 9, which have diverse spectral reflectances.

Fig. 7

Ground-truth surface normal, chromaticity, and albedo of three real objects: Head-relief, Love-relief and Buddha-relief, where the chromaticity is visualized by mapping 450 nm, 550 nm and 650 nm responses to BGR color channels, respectively. The spectral reflectances for the three reliefs can be categorized as SRT II to IV from top to bottom, as seen by their centralized albedo histograms and the distributions of the chromaticities projected to the 2D space via MDS Cox and Cox (2008) (Color figure online)

6.1 Hardware Setup

Figure 1 (left) shows our multispectral photometric stereo setup, the lighting direction distribution, and the light spectra. Our setup consists of 12 narrow-band spectral light sources and a monochromatic camera (FLIR Blackfly S). The light sources are fixed on a metal frame rig and distributed uniformly around the camera's optical axis to avoid biased light distributions. We calibrated the light directions with a monochromatic mirror ball following the method by Shi et al. (2019). The central wavelengths of our spectral light sources uniformly span the range of \(400 \sim 750\) [nm] and are measured by a Sekonic C-800 spectrometer. To verify our method without the influence of crosstalk across wavelength channels, we captured multiple images with the monochromatic camera by turning on each spectral light source one after another. Spectral observations under LEDs 2, 4, and 10, with central wavelengths of 450 nm, 550 nm, and 650 nm, are selected to mimic the RGB input for existing 3-channel MPS methods. We use 4 spectral observations under the illumination of LEDs 2, 4, 9, and 11 to verify the MLC of our method for SRT III surfaces (Ours\(_{\mathrm{III}}(f_4)\)).

To obtain the baseline surface normal (we call it the ground-truth (GT) surface normal hereafter), we additionally put an LED board that contains 256 white light sources sharing the same spectrum, in a similar manner to CS16 Chakrabarti and Sunkavalli (2016). The GT surface normal is estimated using a conventional Lambertian least-squares photometric stereo Woodham (1980), and we use it for quantitatively assessing the MPS results.

Spectral calibration For our method for SRT IV surfaces (Ours\(_{\mathrm{IV}}\)), light sources’ spectra \(E_1, \ldots , E_f\) and camera spectral sensitivity \(S_1, \ldots , S_f\) need to be calibrated in the form of a vector of their products \(\mathbf {e} = [E_1(\lambda _1)S_1(\lambda _1), \cdots , E_f(\lambda _f)S_f(\lambda _f)]^\top \). For the calibration, we use a MacBeth ColorChecker board McCamy et al. (1976) consisting of 24 patches of uniform spectral reflectances \(R_1, \ldots , R_{24}\). Based on the image formation model of Eq. (14), the ratio of the vector \(\mathbf {e}\)’s elements at neighboring spectral channels follows

$$\begin{aligned} \frac{e_{j+1}}{e_{j}} = \frac{m_{j+1}}{m_j} \frac{R(\lambda _j)}{R(\lambda _{j+1})} \frac{\mathbf {l}_j^\top \mathbf {n}}{\mathbf {l}_{j+1}^\top \mathbf {n}}. \end{aligned}$$
(21)

For a scene point on the ColorChecker board, the spectral reflectance ratio \(\frac{R(\lambda _j)}{R(\lambda _{j+1})}\) at different wavelengths is known from measured spectral reflectance curves Mohammadi et al. (2005). The surface normal \(\mathbf {n}\) of the ColorChecker board can be estimated from the detected image corners and the camera intrinsics Zhang (2000). With the calibrated lighting directions \(\mathbf {L}\) and the multispectral observations \(\mathbf {m}\), we estimate the elements of \(\mathbf {e}\) up to scale by solving the homogeneous system of equations derived from Eq. (21) using all the 24 monochromatic patches of the ColorChecker board. Since we can only recover \(\mathbf {e}\) up to scale, the spectral reflectance estimated by Ours\(_{\mathrm{IV}}\) naturally has a scale ambiguity, but this does not influence the recovery of surface normals.
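
A Python sketch of this calibration step is given below; it stacks the constraints of Eq. (21) over all patches and adjacent channel pairs and recovers \(\mathbf {e}\) from the nullspace. The array layout of the patch observations and reflectances is an assumption for illustration.

```python
import numpy as np

# Sketch of the spectral calibration: stack the ratio constraints of Eq. (21)
# over all ColorChecker patches and adjacent channel pairs, then recover e from
# the nullspace. Assumed layout:
#   M: (24, f) patch observations, R: (24, f) known patch reflectances,
#   L: (f, 3) calibrated light directions, n: (3,) board normal.
def calibrate_e(M, R, L, n):
    num_patches, f = M.shape
    shading = L @ n                                # l_j^T n for each channel's light
    rows = []
    for i in range(num_patches):
        for j in range(f - 1):
            # e_j * m_{j+1} R_j (l_j . n) - e_{j+1} * m_j R_{j+1} (l_{j+1} . n) = 0
            row = np.zeros(f)
            row[j] = M[i, j + 1] * R[i, j] * shading[j]
            row[j + 1] = -M[i, j] * R[i, j + 1] * shading[j + 1]
            rows.append(row)
    A = np.stack(rows)
    _, _, Vt = np.linalg.svd(A)
    e = Vt[-1]                                     # solution up to scale and sign
    return e if e.sum() > 0 else -e                # e has positive entries
```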

Fig. 8

Surface normal estimation results for real-world objects with SRT II (Head-relief) and SRT III (Love-relief and Moai statue)

6.2 Real Data Setup

Based on our hardware setup, we capture a variety of objects for the real-data experiments. Prior to the experiments, we examine the SRTs of the scenes by analyzing their spectral reflectance distributions, as shown in Fig. 7. With the calibrated \(\mathbf {e}\), known light directions \(\mathbf {L}\), and the ground-truth surface normal \(\mathbf {n}\), we compute the spectral reflectance \(\mathbf {r}\) based on the spectral image formation model in Eq. (14). The estimated reflectance \(\mathbf {r}\) is further decomposed into albedo and chromaticity by taking its norm as the albedo and its direction as the chromaticity, depicted in Fig. 7 as GT albedo and GT chromaticity, respectively. The chromaticity is visualized by mapping the responses at 450 nm, 550 nm, and 650 nm to the BGR color channels, respectively.

The last two columns of Fig. 7 show the histogram of centralized albedos obtained by subtracting the mean value, and a low-dimensional visualization of the chromaticity distributions via multidimensional scaling (MDS) Cox and Cox (2008), respectively. The Head-relief has a relatively uniform albedo compared to the Love-relief and Buddha-relief, since its standard deviation \(\sigma \) of albedos is smaller than those of the other two. This is also consistent with the image observations shown in the first column. On the other hand, the chromaticity distribution of the Buddha-relief is more diverse than those of the Head-relief and Love-relief, which indicates a spatially-varying chromaticity distribution in the Buddha-relief. As such, the spectral reflectances of the three real reliefs roughly follow SRT II, III, and IV, respectively.

We also observe that the piece-wise constant spectral reflectance assumption used in Ozawa et al. (2018), Chakrabarti and Sunkavalli (2016), Anderson et al. (2011b) is impractical for approximating general SRT IV surfaces. Although the Buddha-relief seems to contain only three piece-wise constant chromaticity regions in the image observation under natural illumination, it actually has diverse chromaticities, making the monochromatic region clustering Ozawa et al. (2018); Chakrabarti and Sunkavalli (2016); Anderson et al. (2011b) unstable.

Fig. 9

Surface normal estimation results for surfaces with spatially-varying chromaticities and albedos (SRT IV): Buddha-relief, Lion, and Puppy

Fig. 10

Robustness against specular highlights. The top two rows show 7 sampled spectral images and their shadings of a cow surface Shi et al. (2019) covered by the material “cc_green_malachite” Dupuy and Jakob (2018), where yellow and green boxes indicate regions with specular highlights and shadows. Surface normal estimates from existing methods and ours, the corresponding angular error distributions, and the mean angular error values are shown in the bottom two rows

Fig. 11

Shape estimation results of two shiny objects, where Dog is monochromatic and Shell has spatially-varying chromaticities. Even rows show estimated surface normals. Odd rows provide reconstructed surfaces integrated from the surface normal maps. Close-up views show the artifacts caused by the specular highlights

6.3 Surface Normal Estimation Results on Real Data

Surface normal estimation under SRT III As shown in Fig. 8, we compare our methods with the baselines on three objects: Head-relief, Love-relief, and Moai statue. The Head-relief scene follows SRT II, while Love-relief and Moai statue follow SRT III. Since both the existing methods Ozawa et al. (2018); Chakrabarti and Sunkavalli (2016) and our methods (Ours\(_{\mathrm{III}}\), Ours\(_{\mathrm{IV}}\)) can handle SRT II, the accuracy of the recovered surface normals is comparable.

We observe large normal estimation errors by CS16 Chakrabarti and Sunkavalli (2016) and OS18 Ozawa et al. (2018) on the Love-relief and the Moai statue, since the spatially-varying albedos violate the assumptions made in their methods. The error maps of CS16 Chakrabarti and Sunkavalli (2016) and OS18 Ozawa et al. (2018) on the Love-relief highlight the error regions due to the non-uniform albedo distribution. On the other hand, Ours\(_{\mathrm{III}}\) yields more accurate surface normal estimation results, which verifies our method's strength on SRT III surfaces. Under the minimal solvable lighting condition (MLC), the estimation errors of Ours\(_{\mathrm{III}}\)(\(f_4\)) increase compared to using all 12 lights (Ours\(_{\mathrm{III}}\)(\(f_{12}\))), which is mainly caused by the shadows in the concave regions.

Ours\(_{\mathrm{IV}}\) provides results comparable to Ours\(_{\mathrm{III}}\) on both SRT II and SRT III surfaces. However, Ours\(_{\mathrm{IV}}\) requires the spectral calibration of both the lights and the camera, as well as the spectral reflectance bases. Therefore, Ours\(_{\mathrm{III}}\) is preferred for monochromatic surfaces.

Surface normal estimation under SRT IV Figure 9 shows the surface normal estimation results for three SRT IV surfaces: Buddha-relief, Lion, and Puppy. CS16 Chakrabarti and Sunkavalli (2016) and OS18 Ozawa et al. (2018) assume the surface contains a limited number of regions with uniform spectral reflectances. However, based on the distribution of albedos and chromaticities shown in Fig. 7, such an assumption is invalid for the Buddha-relief. Also, it is difficult to infer the number of distinct albedos and chromaticities in the Lion and Puppy from the image observations. Therefore, both methods result in inaccurate surface normal estimates for these scenes.

Ours\(_{\mathrm{III}}\) cannot handle spatially-varying chromaticities and also produces large errors on these scenes. On the other hand, the proposed Ours\(_{\mathrm{IV}}\) achieves accurate results because it explicitly accounts for SRT IV surfaces. From the error maps, inaccurate surface normal estimates are mainly located in regions where shadows are observed, and the surface normal estimation accuracy of Ours\(_{\mathrm{IV}}\) is not influenced by the spatially-varying reflectances.

Fig. 12

Dynamic shape recovery of a deforming surface with SRT IV. The first row shows the image observations of a multispectral video frame at varying bands. The last three rows provide the estimated surface normals and integrated surface visualizations at varying viewpoints. Close-up views highlight surface shape details

7 Discussion

In this section, we discuss our method’s robustness against outliers and applicability to dynamic scene reconstruction.

7.1 Robustness Against Outliers

Although previous methods Chakrabarti and Sunkavalli (2016); Ozawa et al. (2018) provide a unique solution for SRT II without external priors, their input is restricted to a 3-channel RGB image and cannot take more bands (see “Appendix”). On the other hand, our methods for both SRT III and SRT IV surfaces can handle multispectral images with 4 or more spectral channels. This capability of taking many spectral channels allows us to use a robust estimation approach in MPS, in a similar spirit to four- or more-source photometric stereo methods Barsky and Petrou (2003); Wu et al. (2010); Shi et al. (2014), to make our methods robust against shadows and specular highlights. Intuitively, having more spectral channels allows us to discard those that are corrupted by outliers.

To demonstrate this capability, we adopt the per-pixel thresholding strategy used in Shi et al. (2019), Shi et al. (2014) to discard outliers from the input observations. Specifically, for each pixel, we sort the observations under varying lights by brightness and discard the darkest and brightest observations (bottom and top \(25\%\)), which likely correspond to shadows and specular highlights. The surface normal and the spectral reflectance are then estimated from the inlier image observations. In the following, we denote the robust versions of our SRT III method as “Ours\(_{\mathrm{III}}\)(r)” and our SRT IV method as “Ours\(_{\mathrm{IV}}\)(r).”
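
A per-pixel Python sketch of this robust strategy for Ours\(_{\mathrm{IV}}\)(r) is given below; the trim ratio and interface are illustrative, and the solve follows Eq. (18) on the inlier channels only.

```python
import numpy as np

# Sketch of the robust per-pixel estimation for Ours_IV(r): sort the spectral
# observations, drop the darkest and brightest 25% (likely shadows and
# specular highlights), and solve Eq. (18) on the remaining channels.
# m_hat is the e-normalized observation vector; Bk is the reflectance basis.
def solve_pixel_srt4_robust(m_hat, L, Bk, trim=0.25):
    f = m_hat.shape[0]
    keep = np.sort(np.argsort(m_hat)[int(trim * f):f - int(trim * f)])
    m_in, L_in, B_in = m_hat[keep], L[keep], Bk[keep]      # inlier channels only
    A = np.concatenate([-L_in, np.diag(m_in) @ B_in], axis=1)
    _, _, Vt = np.linalg.svd(A)
    y = Vt[-1]
    n, c = y[:3], y[3:]
    if (B_in @ c).mean() < 0:                              # inverse reflectance > 0
        n, c = -n, -c
    return n / np.linalg.norm(n)                           # unit surface normal
```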

In Fig. 10, we test the robust estimation methods in comparison to our non-robust versions and previous methods on a cow scene Shi et al. (2019) with its reflectance assigned by a measured spectral BRDF “cc_green_malachite” Dupuy and Jakob (2018). With a few spectral channels (3 spectral bands for CS16 Chakrabarti and Sunkavalli (2016) and OS18 Ozawa et al. (2018), 4 spectral bands for Ours\(_{\mathrm{III}}\)(\(f_4\)) and Ours\(_{\mathrm{IV}}\)(\(f_4\))) in the multispectral image input, recovered surface normals are less accurate based on the mean angular error values. From the error distributions, inaccurate surface normal estimates are closely related to the distributions of specular highlights and shadows, as indicated by the yellow and green boxes. By adding more spectral bands, the recovered surface normals from our methods (Ours\(_{\mathrm{III}}\)(\(f_{24}\)), Ours\(_{\mathrm{IV}}\)(\(f_{24}\))) are improved. The accuracy of our method for SRT IV is relatively better near the shadow areas even without the robust strategy, since attached shadows are inherently embedded in the spectral BRDF database Dupuy and Jakob (2018), from which we extract the BRDF bases. However, both Ours\(_{\mathrm{III}}\)(\(f_{24}\)) and Ours\(_{\mathrm{IV}}\)(\(f_{24}\)) still suffer from the influence of specular highlights. By further removing specularities and shadows as outliers using the robust estimation strategy, more accurate surface normals are estimated by Ours\(_{\mathrm{III}}\)(r) and Ours\(_{\mathrm{IV}}\)(r), illustrating the benefit brought by our method’s capability of taking arbitrarily many channels as input.

We further evaluate our robust estimation method on real objects with shiny surfaces: Dog and Shell shown in Fig. 11. Since the reflectances of the two objects significantly deviate from the Lambertian reflectance, we cannot trust the surface normal estimated from conventional least-squares photometric stereo Woodham (1980) as the ground truth. Therefore, instead of comparing the surface normal maps, we applied a surface normal integration method Xie et al. (2014) to reconstruct 3D shapes from estimated surface normals for a qualitative comparison.

As shown in Fig. 11, the recovered surface shapes from a few spectral image observations are heavily influenced by specular highlights. We also observe shape distortions in the middle region of Shell in the results of OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016). These are caused by the inaccurate chromaticity clustering for the spatially-varying reflectances. By adding more spectral bands under varying lighting directions as input (Ours\(_{\mathrm{III}}\)(\(f_{12}\)) and Ours\(_{\mathrm{IV}}\)(\(f_{12}\))), shape recovery becomes more plausible. However, artifacts caused by specularities still remain. By further discarding specular highlights as outliers, more convincing shape reconstruction results are obtained by the robust versions of our methods (Ours\(_{\mathrm{III}}\)(r) and Ours\(_{\mathrm{IV}}\)(r)).

7.2 Dynamic Shape Recovery

We further test the applicability of our method to dynamic scenes using an industrial multispectral camera (IMEC-SM-VIS), with which image observations at different spectral bands are obtained in a single shot. As shown in Fig. 12, we estimate the dynamic shape of a deformable SRT IV surface in motion and compare the result with OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016). We choose four pairs of spectral lights and camera channels having the strongest responses at 480 nm, 520 nm, 590 nm, and 635 nm to obtain the multispectral input. Three of the four channels, at 480 nm, 520 nm, and 635 nm, are used as the input for OS18 Ozawa et al. (2018) and CS16 Chakrabarti and Sunkavalli (2016). The recovered shapes (cushion) of OS18 Ozawa et al. (2018), CS16 Chakrabarti and Sunkavalli (2016), and Ours\(_{\mathrm{III}}\) are relatively flat due to the influence of the spatially-varying reflectances, as shown in the side views of the integrated surfaces. Also, shape details are lost in CS16 Chakrabarti and Sunkavalli (2016) due to the polynomial local shape constraint, as highlighted in the close-up views. On the other hand, the surface normal estimates of Ours\(_{\mathrm{IV}}\) are unaffected by the spatially-varying spectral reflectance. As a result, Ours\(_{\mathrm{IV}}\) achieves more reasonable dynamic shape recovery on the deformable SRT IV surface.

The dynamic shape recovery from our method has the potential to capture 3D movements and gestures of the human body, which may benefit the preservation of intangible cultural heritage such as traditional dances.

8 Conclusion

In this paper, we show that MPS can be turned into a well-posed problem and provide unique solutions for surface normals under two general spectral reflectance types. Specifically, if the surface has uniform chromaticity but spatially-varying albedos (SRT III), we show that the surface normal can be uniquely determined from \(4+\) spectral observations without introducing external priors. By further calibrating the light spectra and the camera spectral sensitivity, we present a closed-form solution for surface normal and spectral reflectance for surfaces with spatially-varying chromaticities and albedos (SRT IV), using a low-rank basis representation of the spectral reflectance. Since our methods can take more than 4 spectral channels, they can rely on outlier rejection strategies in the MPS setting to effectively remove shadows and specular highlights. The experiments on real objects, including statues and reliefs, demonstrate the potential applicability of our method to e-Heritage.

8.1 Future Work

To obtain a surface shape from a single-shot image, we encode image observations under different illuminations in different spectral bands. Compared to the setting of conventional photometric stereo, this setting requires a negligible crosstalk effect Chakrabarti and Sunkavalli (2016); Ozawa et al. (2018), i.e., each spectral channel only records the image measurement under the corresponding spectral light. From a practical viewpoint, it is desirable to handle a non-negligible crosstalk effect, which would relax the hardware requirements of MPS. Our MPS methods are based on the Lambertian reflectance assumption and treat specular highlights as outliers. It would be interesting to explore MPS solutions under general non-Lambertian spectral reflectances. Due to the inaccessibility of actual heritage objects, we instead verified our method's applicability to e-Heritage using real-world objects (Buddha relief, lion and Moai statues) that have appearances similar to heritage objects. We are interested in applying our method to real heritage objects as soon as we have the chance in the future.