1 Introduction

Digital archiving is the activity of scanning cultural heritage items, storing them as digital data, and using those data to pass the items on to future generations and to analyze them [1, 2]. In recent years, it has become possible to acquire cultural heritage items with complex 3D shapes as point cloud data by scanning with a laser or with photography. To make use of point clouds in digital archives, the scanned data must be visualized accurately and in a way that lets the observer easily grasp the structure. See-through visualization is one visualization method that supports structural understanding. Its advantage is that internal structures can be visualized even for objects with complex shapes. A common way to visualize the internal structure of a building is to create a cross section using CAD or similar tools, but this approach cannot show the internal and external structures at the same time. See-through visualization, in contrast, shows both simultaneously. Stochastic point-based rendering (SPBR) [3, 4] has been developed as one such see-through visualization method. It is well suited to large-scale 3D point clouds because of its low computational cost. In addition, its stochastic algorithm makes it unnecessary to sort the points, and it suits digital archives because depth information is visualized accurately. Stereoscopic 3D images are also effective as a visualization method that supports structural understanding of digital data. In previous studies, content for digital archives using a 3D display and VR has been produced with the aim of developing new ways of representing 3D digital data [5]. However, all of these studies used opaque images and objects rather than see-through images.
In this research, we propose a visualization method that combines see-through visualization and stereoscopic vision for 3D point clouds of tangible cultural heritage items. With stereoscopic vision, the observer can recognize shapes more accurately because depth cues that are absent from a 2D image become available. However, although see-through visualization makes the internal structure visible, the visualization results become complicated because multiple objects appear to overlap. As a result, the effects of “occlusion” and “motion parallax”, which are depth cues obtained in stereoscopic vision, are lost, and the accuracy of depth perception is considered to decrease. “Motion parallax” is an important cue for recognizing the direction and distance of depth, and it arises when the observer’s head actively moves [6]. “Occlusion” is an important cue for recognizing the direction of depth, and it arises when an object in front hides an object behind it [6]. A previous study reported that depth is underestimated in see-through stereoscopic vision using SPBR [7]. In digital archives, the visualization results must be presented to the observer correctly, and misleading representation methods are unsuitable. Current visualization methods, in which depth is underestimated, therefore leave room for improvement. One seemingly easy remedy is to draw a scale beside the visualized object. However, a scale is ultimately ineffective, because the observer cannot tell which object the scale refers to when the front-back relationship of the see-through objects is unknown. In other words, to improve the accuracy of depth perception, it is necessary to increase the visibility of the see-through object itself.
In this research, we propose two visual guides to reduce the underestimation of depth caused by the loss of depth cues. The first extracts the high-curvature portions of a 3D structure using the eigenvalues obtained by principal component analysis and highlights only those portions. This method has been confirmed to suppress the loss of shape visibility that occurs as opacity decreases in 2D images [8]. In see-through stereoscopic vision, drawing as opaque only those parts that are particularly necessary for grasping the shape makes it possible to obtain the effects of “occlusion” and “motion parallax” without losing the see-through effect. The second converts the edges obtained by feature extraction into dashed lines. Dashed edges provide the effect of “texture gradient”, which is a depth cue. “Texture gradient” is known to arise from changes in a uniform pattern applied to an object [9]. In general, changes in the size, density, and aspect ratio of the texture are considered to produce a sense of depth. Because a 3D point cloud obtained by scanning has no connectivity information between points, it is difficult to identify line segments. We therefore propose a method that identifies line segments in a pseudo manner by focusing on the first principal component vector.

2 Conventional Methods Used in This Research

2.1 Feature Region Extraction Method

We describe the feature region extraction method used in this research. A feature region is a high-curvature part of a 3D structure, such as a vertex or a ridgeline (the boundary between two surfaces). The method extracts only the parts of the 3D structure that have high curvature. First, principal component analysis is performed on the coordinates of the points that lie inside a sphere centered on the point being processed, and a feature value is defined from a combination of the eigenvalues of the resulting covariance matrix. Next, the feature value is calculated for every point, and only points whose feature value exceeds an arbitrarily set threshold are extracted. The change of curvature [10] is an effective feature value for extracting only vertices and ridgelines. The change of curvature \(C_{\lambda }\) is defined by formula (1):

$$\begin{aligned} C_{\lambda } = \frac{\lambda _{3}}{\lambda _{1} + \lambda _{2} + \lambda _{3}}. \end{aligned}$$
(1)

In formula (1), \(\lambda _{1}\), \(\lambda _{2}\) and \(\lambda _{3}\) are the eigenvalues of the covariance matrix, with \(\lambda _{1}> \lambda _{2}> \lambda _{3}\). Because the share of the minimum eigenvalue \(\lambda _{3}\) tends to be larger at the vertices and ridgelines of a 3D structure, \(C_{\lambda }\) is larger there than on flat surfaces. An extraction example is shown in Fig. 1. Figure 1(a) is a see-through visualization of the surface data of a cuboid, and Fig. 1(b) visualizes only the feature region of the data in Fig. 1(a). The radius of the search sphere for principal component analysis is 1/150 of the diagonal length of the bounding box, and the threshold for feature extraction is 0.03. Coloring the feature region red and rendering it opaque, as in Fig. 1(b), turns it into a “visual guide” that supports shape recognition. The result of fusing the surface data with the highlighted feature region is shown in Fig. 1(c). Compared with Fig. 1(a), the visibility of the vertices and ridgelines is enhanced by the change of color and the increase in opacity. This is expected to strengthen the effects of “motion parallax” and “occlusion” in stereoscopic vision.
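The extraction procedure described above can be sketched in Python with NumPy. This is a minimal brute-force illustration and not the authors' implementation: the function names `change_of_curvature` and `extract_feature_region` and the rule that skips points with fewer than three neighbours are assumptions made for this example, and a practical version would replace the O(n²) neighbour search with a spatial index.

```python
import numpy as np


def change_of_curvature(points, radius):
    """Return C_lambda = l3 / (l1 + l2 + l3) of formula (1) for every point."""
    points = np.asarray(points, dtype=float)
    features = np.zeros(len(points))
    for i, p in enumerate(points):
        # Neighbours inside the search sphere centred on the current point.
        dist = np.linalg.norm(points - p, axis=1)
        neighbours = points[dist <= radius]
        if len(neighbours) < 3:
            continue  # assumption: too few points for a meaningful covariance
        cov = np.cov(neighbours, rowvar=False)  # 3x3 covariance matrix
        eigvals = np.linalg.eigvalsh(cov)       # ascending order: l3, l2, l1
        total = eigvals.sum()
        if total > 0.0:
            features[i] = eigvals[0] / total
    return features


def extract_feature_region(points, radius, threshold):
    """Keep only the points whose feature value exceeds the threshold."""
    points = np.asarray(points, dtype=float)
    return points[change_of_curvature(points, radius) > threshold]
```

With the settings reported for Fig. 1, `radius` would be the bounding-box diagonal divided by 150 and `threshold` would be 0.03.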

Fig. 1. An example of feature region extraction using a cuboid (Color figure online)

2.2 Dashed Line Method

In this research, edges are used as a texture by changing the edges visualized as solid lines into dashed lines. Dashed lines make it possible to identify the front-back relationship of objects, and changing the “element size” and “density” of the texture along the viewing direction can provide a cue to the magnitude of depth. The procedure for creating the dashed lines is as follows.

First, feature regions (edges) are extracted by the method described in Sect. 2.1. The solid lines are then changed into dashed lines by “cutting out” parts of the extracted edges. To do so, the feature values assigned to the points are set to zero at regular intervals, in slabs parallel to a plane perpendicular to the axis. Points with a zero feature value are not extracted because their values are below the threshold. The size and density of the dashed lines can be varied along the gaze direction by changing a parameter called the “interval value”, which determines the interval at which the edges are cut out. The “interval value” is calculated from the diagonal length of the bounding box.
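The “cutting out” step can be illustrated with a short Python sketch. The text does not specify exactly how the interval value maps to dash and gap lengths, so the sketch assumes that one gap is the interval value times the bounding-box diagonal and that a dash is `dash_to_gap` times as long as a gap. The function names, the `dash_to_gap` parameter, and the choice of the z-axis as the default cutting axis are hypothetical.

```python
import math


def bounding_box_diagonal(points):
    """Diagonal length of the axis-aligned bounding box of 3D points."""
    mins = [min(p[k] for p in points) for k in range(3)]
    maxs = [max(p[k] for p in points) for k in range(3)]
    return math.dist(mins, maxs)


def cut_out_edges(points, features, interval_value, dash_to_gap=1.0, axis=2):
    """Zero the feature values of edge points that fall into the gaps.

    Assumption: one gap is `interval_value` times the bounding-box diagonal,
    and one dash is `dash_to_gap` times as long as one gap.
    """
    gap = interval_value * bounding_box_diagonal(points)
    dash = dash_to_gap * gap
    period = dash + gap
    origin = min(p[axis] for p in points)
    dashed = []
    for p, f in zip(points, features):
        # Offset of the point inside the current dash-plus-gap period.
        offset = (p[axis] - origin) % period
        dashed.append(f if offset < dash else 0.0)
    return dashed


def extract(points, features, threshold):
    """Keep only the points whose feature value exceeds the threshold."""
    return [p for p, f in zip(points, features) if f > threshold]
```

Because the zeroed points fall below any positive threshold, the ordinary threshold extraction of Sect. 2.1 then yields dashed edges without a separate line-tracing step.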

An example of visualization is shown in Fig. 2. The data are of “Hachiman-yama”, one of the “Yamahoko” floats used at the Gion Festival. Figure 2(a) and (b) show Hachiman-yama visualized with opaque point rendering and with SPBR, respectively. Figure 2(c) shows only the feature region extracted using the feature value linearity [10], with a threshold of 0.35. Figure 2(d) shows the result of uniformly changing the whole feature region into dashed lines with the dashed line method. The interval value is 1/150, and the ratio of the length of one line segment of a dashed line to the length of the gap between line segments is 1:1. Finally, Fig. 2(e) shows the result when the length and interval of the dashed lines are changed along the gaze direction. The interval value is 1/300, and the ratio of segment length to gap length changes progressively from 10:1 to 1:1, starting from the area closest to the viewpoint. Comparing Fig. 2(c) and (d) confirms that the edges become dashed lines while the edges necessary for shape recognition are retained. Moreover, Fig. 2(e) makes it easier than Fig. 2(c) and (d) to distinguish the edges in the front from those in the back. It can also be confirmed that the size and density of the dashed lines change along the gaze direction.
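Fig. 2(c) uses linearity rather than the change of curvature, but the text does not restate its definition. Assuming the definition commonly used for eigenvalue-based point cloud features, \(L_{\lambda } = (\lambda _{1} - \lambda _{2})/\lambda _{1}\), the two feature values can be compared with the following sketch. The function names and the example eigenvalues are hypothetical and are not taken from the Hachiman-yama data.

```python
def linearity(l1, l2, l3):
    """Assumed definition: (l1 - l2) / l1 for eigenvalues l1 >= l2 >= l3."""
    return (l1 - l2) / l1 if l1 > 0.0 else 0.0


def change_of_curvature_from_eigenvalues(l1, l2, l3):
    """Formula (1): l3 / (l1 + l2 + l3)."""
    total = l1 + l2 + l3
    return l3 / total if total > 0.0 else 0.0


# Hypothetical eigenvalue triples for two kinds of neighbourhood.
line_like = (1.0, 0.05, 0.01)    # points spread mainly along one direction
plane_like = (1.0, 0.9, 0.001)   # points spread over a flat patch

for name, eig in (("line-like", line_like), ("plane-like", plane_like)):
    print(name, linearity(*eig), change_of_curvature_from_eigenvalues(*eig))
```

Under this assumed definition, a line-like neighbourhood exceeds the threshold of 0.35 used for Fig. 2(c), while a plane-like neighbourhood does not.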

Fig. 2. Visualization results of Hachiman-yama

3 Evaluation Experiment on See-Through Stereoscopic Vision

3.1 Depth Cues in See-Through Stereoscopic Vision

In this section, we describe the experiments conducted to verify the effects of feature region highlighting and dashed lines in see-through stereoscopic vision. We used the MV-4200 multiview autostereoscopic display manufactured by TRIDELITY to present the stereoscopic images. This display uses a lenticular lens system and provides the observer with motion parallax (5 parallaxes) in the horizontal direction. It is designed to give the maximum stereoscopic effect when the distance between the display and the observer is 3.0 m.

The effects of “motion parallax” and “occlusion”, which are depth cues, are as described in Sect. 1. However, the effect of “occlusion” is lost in see-through visualization because uniformly reducing the opacity of the interior and exterior increases the number of objects that are visible at once. In addition, previous research shows that depth is underestimated in see-through stereoscopic vision using SPBR and reports that accuracy decreases as opacity decreases [7]. This research provides a “visual guide” that serves as a cue for shape recognition by rendering feature regions as opaque. The guide is intended to enhance the effects of “motion parallax” and “occlusion” and to improve the accuracy of depth perception without losing the advantage of see-through rendering. Furthermore, as described in Sect. 2.2, the effect of “texture gradient”, another depth cue, is incorporated by converting the extracted edges into dashed lines.

3.2 Conditions of the Feature Region Highlighting Experiment

As an initial experiment to verify the effect, three types of cuboids with different depths are used. Each cuboid has two square faces. Taking the side length of the square as 1.0, the lengths of the sides along the z-axis, which represent the depth of each cuboid, are 0.5, 1.0, and 2.0. Hereafter, the cuboids are called the “0.5\(\times \) cuboid”, “1.0\(\times \) cuboid”, and “2.0\(\times \) cuboid”. Figure 3 shows the polygon data of the three cuboids. Each polygon model is converted to point data by Poisson disc sampling [11], and SPBR is applied to produce a see-through image. The color of the point cloud is cyan.

The experimental conditions are as follows. Six types of stimulus images are used, combining the cuboid (3 types) with the presence or absence of feature region highlighting (2 types). Figure 4 shows the stimulus images for the 2.0\(\times \) cuboid as an example. Each subject performs 24 trials, in which the 6 stimulus images are combined with the presence or absence of motion parallax (2 types) and the presence or absence of binocular parallax (2 types). The conditions are presented in random order. When the image is switched, a black image is presented so that no afterimage remains on the retina. The subjects are twenty men and women in their 20s and 30s. Prior to the experiment, the Titmus stereo fly test [12] is administered to all subjects to confirm normal stereoscopic vision. The subject reports the perceived length of the side of the cuboid extending in the z-axis direction, taking the side length of the square as 1.0. For example, if the stimulus is a cube, the correct value is 1.0. In the “monocular” condition, the subject covers one eye with an occluder and observes with only the dominant eye. In the “no motion parallax” condition, the subject’s head rests on a chin rest and is kept fixed. The seat height is adjusted for each subject so that the height of the head matches the height of the display.
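The factorial design described above (3 cuboids, 2 highlighting conditions, 2 motion parallax conditions, 2 binocular parallax conditions) can be enumerated and randomized as in the following sketch. It only illustrates how the 24 trials arise; the dictionary keys, the function name `build_trial_order`, and the optional seed are assumptions and do not describe the software used in the experiment.

```python
import itertools
import random

CUBOID_DEPTHS = (0.5, 1.0, 2.0)   # depth relative to the square side of 1.0
HIGHLIGHTING = (False, True)      # feature region highlighting off / on
MOTION_PARALLAX = (False, True)   # head fixed on chin rest / free to move
BINOCULAR = (False, True)         # monocular (dominant eye) / binocular


def build_trial_order(seed=None):
    """Return all condition combinations in random order."""
    trials = [
        {"depth": d, "highlight": h, "motion_parallax": m, "binocular": b}
        for d, h, m, b in itertools.product(
            CUBOID_DEPTHS, HIGHLIGHTING, MOTION_PARALLAX, BINOCULAR
        )
    ]
    random.Random(seed).shuffle(trials)  # separate generator, reproducible with a seed
    return trials


trials = build_trial_order(seed=0)
print(len(trials))  # 3 * 2 * 2 * 2 = 24
```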

Fig. 3. Polygon data of three types of cuboids (Color figure online)

Fig. 4. Visual stimuli (cuboid with H:W:D = 1.0:1.0:2.0) used in the experiment

3.3 Experimental Result of Feature Region Highlighting

We use ANOVA and Tukey’s HSD test for the analysis. The experimental results are shown in Fig. 5. The left and right panels show the effect of edge highlighting and the interaction between edge highlighting and motion parallax, respectively. The vertical axis is the relative error of the perceived depth, defined by formula (2):

$$\begin{aligned} \varepsilon = \frac{\mathrm {Measured}\; \mathrm {value}-\mathrm {Correct}\; \mathrm {value}}{\mathrm {Correct}\; \mathrm {value}}. \end{aligned}$$
(2)

Error bars represent standard errors of the mean. As shown in the left panel of Fig. 5, edge highlighting improves the accuracy of depth perception for all cuboids. The highlighted edges extending along the gaze direction may have played an important role in this improvement as a linear perspective cue, because the visibility of these edges was enhanced more than that of the front and back squares. This result suggests that shape recognition can be enhanced by edge highlighting.
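Formula (2) and the error bars can be illustrated with a short Python sketch. The answers in the example are made-up values chosen only to show the calculation and are not data from the experiment; the function names are hypothetical.

```python
import math
import statistics


def depth_error(measured, correct):
    """Relative error of formula (2); negative values mean underestimation."""
    return (measured - correct) / correct


def mean_and_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical answers of four subjects for the 2.0x cuboid (correct value 2.0).
answers = [1.5, 1.2, 1.8, 1.6]
errors = [depth_error(a, 2.0) for a in answers]
mean, sem = mean_and_sem(errors)
print(errors, mean, sem)
```

An answer of 1.5 for the 2.0\(\times \) cuboid gives \(\varepsilon = (1.5 - 2.0)/2.0 = -0.25\), that is, the depth is underestimated by 25%.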

Next, we describe the interaction between edge highlighting and motion parallax shown in the right panel. For the 1.0\(\times \) and 2.0\(\times \) cuboids, the condition with both edge highlighting and motion parallax is more accurate than all other conditions. A likely reason is that the highlighted edges perpendicular to the direction of motion parallax amplified its effect. For the 0.5\(\times \) cuboid, the depth of the cuboid itself is small and the amount of parallax is correspondingly small, so the effect did not reach statistical significance. Nevertheless, the same panel indicates that accuracy tends to improve under the same condition even for the 0.5\(\times \) cuboid. These results confirm the effectiveness of edge highlighting in see-through stereoscopic vision. However, depth is underestimated in all experimental results, so there is still room for improvement.

Fig. 5. Experimental result of the edge highlighting experiment

3.4 Conditions of the Dashed Edge Experiment

In this experiment, three types of cuboids created by the same procedure as in Sect. 3.2 are used. The lengths of the sides in the z-axis direction are 0.6, 1.2, and 1.8 times the side length of the square.

The experimental conditions are as follows. Twelve types of stimulus images are used, combining the cuboid (3 types) with the feature region highlighting method (4 methods). The four methods are shown in Fig. 6, using the 1.8\(\times \) cuboid as an example. The feature region is extracted using the change of curvature with a threshold of 0.03. Figure 6(a) shows the conventional feature region highlighting, and Fig. 6(b) shows the edges extracted in Fig. 6(a) changed into dashed edges that are uniform in the simulated 3D scene rather than in the 2D image presented on the display. In Fig. 6(c) and (d), the length of the line segments and the length of the gaps are changed along the gaze direction. The ratio of the length of one line segment of the dashed line to the length of the gap between line segments is 1:1 over the whole region. In Fig. 6(c), the modulation of segment length in the 2D image due to perspective is exaggerated by a factor of two. In Fig. 6(d), the modulation of segment length due to perspective is reversed and exaggerated by a factor of two. The interval values in Fig. 6(b), (c), and (d) are all 1/150. Hereafter, the highlighting methods in Fig. 6(a), (b), (c), and (d) are called “solid line”, “dashed line”, “(texture gradient) enhanced dashed line”, and “(texture gradient) reverse enhanced dashed line”, respectively. Each subject performs a total of 48 trials, in which the 12 stimulus images are combined with the presence or absence of motion parallax (2 types) and the presence or absence of binocular parallax (2 types). The other experimental conditions are the same as those in Sect. 3.2.

Fig. 6. Visual stimuli (cuboid with H:W:D = 1.0:1.0:1.8) used in the dashed edges experiment

3.5 Experimental Result of Dashed Edge Experiment

Again, we use ANOVA and Tukey’s HSD test for the analysis. The experimental results are shown in Fig. 7. The vertical axis is the error of the perceived depth given by formula (2). In each panel, the horizontal axis is the type of cuboid, and the plotted values are the means for each condition. Error bars represent standard errors of the mean. For the 1.8\(\times \) cuboid, the “reverse enhanced dashed line” condition is more accurate than the other conditions. Because no interaction among “reverse enhanced dashed line”, “motion parallax”, and “binocular parallax” was confirmed, monocular static information likely influenced depth perception. From this result, we hypothesized that increasing the number of dashed-line elements near the viewpoint makes that side look longer, so that the depth appears greater. We test this hypothesis below.

Fig. 7. Experimental result of the dashed edge

Figure 8(a), (b), and (c) are stimulus images in which only the front and back edges of the bottom face of the cuboids in Fig. 6(b), (c), and (d) are drawn. The subject compares the length of the upper line with the length of the lower line and reports how many times as long as the lower line the upper line appears. The subjects are 10 men and women in their 20s. In this experiment, printouts of the stimulus images are presented 80 cm from the subject, parallel to the subject’s forehead.

No significant difference is found among the “dashed line”, “enhanced dashed line”, and “reverse enhanced dashed line”. However, the mean answers are 0.74, 0.76, and 0.71, and the ratio of the two dashed lines is the smallest for the “enhanced dashed line” and the largest for the “reverse enhanced dashed line”. This tendency suggests, in line with our hypothesis, that the observer tends to perceive a dashed line as longer when it has many elements.

Fig. 8. Auxiliary experimental result of the dashed edge

4 Conclusion

In this study, we proposed see-through stereoscopic vision as a visualization method that supports an accurate understanding of the structure of data used in digital archives. To address the problem that observers underestimate depth in see-through stereoscopic vision, we proposed a visual guide that highlights the feature region of the 3D structure. In addition, we proposed a method that converts the extracted edges into dashed lines. The visualization experiment on dashed lines confirmed that the proposed method generates the dashed lines correctly and that the positional relationship of the edges can be distinguished. The evaluation experiment on edge highlighting showed that edge highlighting reduces the underestimation of depth and that its effect is especially significant when motion parallax is available. However, depth remained underestimated even after the improvement. Finally, the evaluation experiment on dashed lines produced a result that differs from our expectation about the effect of “texture gradient”. Before the experiment, we expected that, among the four highlighting methods, the “dashed line” and “enhanced dashed line”, which incorporate the effect of “texture gradient”, would improve depth perception. Instead, the improvement was confirmed for the “reverse enhanced dashed line”, which does not incorporate the effect of “texture gradient”. An auxiliary experiment designed to test the hypothesis derived from these results yielded results that tend to support it. In future work, it will be necessary to conduct experiments with data whose shapes resemble tangible cultural heritage items or with data of actual tangible cultural heritage items.