1 Introduction

Rendering is a time-consuming process as it requires evaluating the high-dimensional integral of the shading function at numerous image pixels. To minimize shading expenses, a commonly employed technique is to exploit the spatial coherence of the rendered images through caching: evaluating the costly function at sparse locations known as caches, and reusing the results through interpolation or extrapolation at other surface points. Caching techniques can expedite computations of both direct and indirect illumination by storing various data such as (ir)radiance [1,2,3,4,5,6,7,8,9,10,11,12,13], visibility [14], or the importance sampling function [15, 16].

The distribution of caches has a significant impact on the final rendering quality. To accurately predict the target function, the density of caches should be proportional to the complexity of the rendered image. Regions of high complexity, such as small geometry features or shadow boundaries, require more caches to capture the rapidly changing signal. However, obtaining the optimal distribution before the actual rendering process is challenging. Therefore, some methods [1,2,3, 6, 7, 9, 12, 14] start with random or uniform caches across the image and generate new caches if reliable ones are not found in a shading point’s local neighborhood. This update-on-demand strategy, however, faces two major challenges. Firstly, the update criteria are application-dependent, and not all applications possess an accurate error estimation for cache interpolation. Secondly, because all the caches need to be stored in an acceleration structure such as an octree or a kd-tree for efficient queries, any rendering-time update to the structure requires a read-write lock, which greatly reduces parallelism [4].

An alternative strategy is to improve the cache placement prior to rendering. Most methods in this category focus on specific applications [4, 5, 8, 11, 16]. A more general method in this context is spatial-directional clustering (SD) [17], which was first proposed by Ou and Pellacini to cluster shading points for virtual point light (VPL) rendering and was later leveraged by various applications [16, 18,19,20]. The method first builds a 6D bounding volume hierarchy (BVH) to group shading points, where the 6D coordinates encompass the 3D position and surface normal at each shading point. The BVH is split recursively until the maximum extent of a node’s 6D bound falls below a predefined threshold or the number of shading points within the bound falls below a minimum cluster size. Lastly, one cache is generated by sampling a shading point within each bound (shading point cluster). However, this method has two major drawbacks. Firstly, controlling the splitting with a single threshold for both the spatial and angular properties is challenging. For example, a threshold value of one might be very small for the spatial difference in a large scene, but too large for the difference of surface normals. Secondly, selecting the minimum cluster size for preventing small clusters faces a dilemma. As demonstrated in Fig. 1, under different configurations, the method suffers from artifacts in distant and near regions, respectively.

Fig. 1

Equal-cache comparison (30K caches) on Staircase scene rendered with cache-based ambient occlusion [14] using various cache distributions: image-space uniform sampling (IM), spatial-directional clustering [17] with a minimum cluster size of 50 (SD-50) and 10 (SD-10), our view-guided clustering only (VGC-only), and our full method (VGC+DGS). The middle columns visualize each method’s corresponding cache distribution and a perceptual error computed by FLIP [21]. As shown in the right insets, image-space uniform sampling (IM) fails to capture geometric properties. Spatial-directional clustering (SD) encounters troubles in parameter selection: using a large cluster size blurs out the details of the distant Buddha, while using a small cluster size results in large errors in near regions. Our view-guided clustering (VGC) strikes a better balance between 3D geometric proximity and image-space importance, achieving better visual quality and smaller root-mean-squared error (RMSE) and perceptual error. Our full method (VGC+DGS), with discontinuity-guided sampling as the second pass, further enhances the results, especially in regions with shading changes

This paper proposes a new method for determining a good cache distribution in the early stage (i.e., prior to rendering) for various cache-based rendering techniques. The proposed algorithm comprises two stages. In the first stage, we introduce a new shading point clustering method called view-guided clustering (VGC) to generate a set of initial cache locations based on the geometric proximity of the shading points. Our method allows for more intuitive parameter control and can robustly adapt to the camera view, striking a better balance between 3D geometric proximity and image-space importance. As shown in Fig. 1, our VGC produces better visual quality and lower numerical errors than existing methods.

While the proposed view-guided clustering improves cache distributions by considering geometric proximity, it does not account for changes due to shading, as shown in the blue inset of Fig. 1. To address these shading discontinuities, we devise a discontinuity-guided sampling (DGS) algorithm as the second stage, building on the results of the first. First, we compute a discontinuity map to identify regions with potentially high shading and geometry discontinuities. Based on the discontinuity map, we then use importance sampling to deposit adaptive caches in challenging regions, such as shading edges. We redistribute a fraction of the cache budget from VGC to DGS. The final rendered results using our full method (VGC+DGS) exhibit improvement in shadow regions and surface details.

The proposed method provides a better cache distribution with a small computation overhead. Through comprehensive experiments, we demonstrate that our method enhances the efficiency of cache-based applications, including irradiance caching [1], importance caching [15], and ambient occlusion rendered with visibility caches [14]. The improved cache distribution can also benefit methods using an update-on-demand strategy by offering a better initial cache distribution.

2 Related work

This section briefly reviews the existing cache-based rendering techniques. For each cache-based algorithm, we first introduce its target and then discuss its cache-placement strategy. We focus on methods that exploit the coherence of the rendering functions between shading points. Other methods that cache light path contributions, such as VPL [22] or photon mapping [23], are less relevant to our work and are not included here.

Irradiance caching and radiance caching. In pioneering work on caching techniques, Ward et al.[1] proposed irradiance caching to accelerate the computation of indirect illumination. Their method exploits the low-frequency characteristic of indirect lighting by computing the irradiance values at a set of sparse locations sampled uniformly across the image. During rendering, new caches are inserted on demand if the potential error of interpolating existing caches is larger than a threshold. The interpolation criterion was later improved by follow-up works, including the use of translational gradients of irradiance [2] and second-order continuity based on the Hessian [6]. Tabellion and Lamorlette [3] accelerate the computation of irradiance caches with an approximate lighting model. Brouillat et al.[5] construct caches by utilizing photon maps. Gautron et al.[4] present a GPU-friendly reformulation.

The radiance caching proposed by Křivánek et al.[7] extended irradiance caching to handle glossy materials by storing directional radiance functions instead of irradiance. Several follow-up approaches improve the strategies for interpolation and extrapolation [8, 9, 12]. Radiance caching has also been reformulated using a probe representation [24] or pre-convolution [10, 11], and extended for rendering participating media [25, 26] and animation [27]. Recently, learning-based methods have also been explored to improve radiance caching for real-time rendering [13].

The Lumen system in Unreal Engine 5 (UE5) [28] combines screen-space radiance caches [7] and world-space radiance caches [29] to render global illumination. The two types of caches are combined according to the distance of light sources. To efficiently generate caches in each frame, the screen-space caches are placed regularly in image grids with an adaptive division to capture discontinuities, and the world-space radiance caches are placed uniformly in world space. In contrast to our work, which concentrates on improving the cache distribution for offline rendering, Lumen combines several techniques and relies heavily on temporal filtering to achieve real-time performance.

Recently, path tracing has become the most popular approach due to its simplicity, generality, and unbiased nature. However, it often produces results with noticeable noise and requires numerous samples to converge. ReSTIR [30] and its follow-up works [31] have greatly improved sampling efficiency by reusing spatial-temporal samples with the reservoir resampling algorithm. Because the ReSTIR algorithms focus on improving the sampling function at a surface location, they can be combined with cache-based rendering to compute cache values at shading points [13].

Visibility caching. Clarberg and Akenine-Möller [14] compute and store the visibility functions at sparse locations called visibility caches and use them to interpolate the visibility function at every shading point. They present two rendering applications based on visibility caches. First, they reformulate the rendering equation with the interpolated visibility function integrated into the sampling function as a control variate. Second, they demonstrate that visibility caches can be used to produce noise-free ambient occlusion.

Importance caching. Georgiev et al.[15] proposed importance caching to handle complex lighting and visibility in the context of VPL-based rendering. Prior to rendering, they first compute the contributions from all VPLs to a set of randomly sampled locations across the image. The cached VPL contributions are then built into several sampling distributions and combined using multiple importance sampling at each shading point during rendering. Yoshida et al.[16] later improved the idea with a better cache distribution.

To categorize the cache-placement strategies, most irradiance caching [1,2,3, 6], radiance caching [7, 9, 12], and visibility caching [14] methods start with a uniform distribution and insert new caches in the course of rendering based on error estimation. Among the methods that generate a cache distribution prior to rendering, importance caching [15] places caches randomly in image space. Scherzer et al.[10] adopt a fixed cache distribution based on blue noise sampling in world space. Brouillat et al.[5] generate irradiance caches on photon locations according to a photon map. Gautron et al.[4] and Křivánek et al.[8] refine uniform sampling by adding additional caches based on error estimation at radiance caches. Yoshida et al.[16] proposed a two-pass approach to improve the distribution of importance caches. We discuss the major differences between our method and their approaches in Sect. 3.

Fig. 2

Overview of the proposed method. Our method comprises two stages. In the first stage, we present a new view-guided clustering method (VGC) for clustering shading points. In each shading point cluster, a shading point is selected as the cache location, determining an initial cache distribution. To further identify regions with complex shading that are not captured in the first stage, the second stage (DGS) first utilizes the rendering results at the initial caches to generate a discontinuity map. Then additional caches are generated by importance sampling the map. The initial and adaptive caches constitute the final cache distribution for cache-based rendering. Here, we demonstrate an example of ambient occlusion using visibility caches [14]. We can integrate our two-stage sampling method into other cache-based techniques such as irradiance caching [1] or importance caching [15]

3 Algorithm

Figure 2 demonstrates the flowchart of our method. To exploit the spatial coherence of the shading function in the rendered images, we first obtain the set of shading points by computing the 3D intersections of the scene and the camera rays. For each shading point, we store its 3D position and surface normal. Because no shading information is available at the beginning, our first-stage algorithm, view-guided clustering (VGC), utilizes geometric proximity to compute an initial cache distribution. After computing the shading function at these cache locations, the second stage, discontinuity-guided sampling (DGS), generates additional caches by utilizing the shading information from the initial caches. The final cache distribution is then used for cache-based rendering.

3.1 First stage: view-guided clustering

The goal of the first stage is to generate a set of uniform caches based on the geometric proximity of shading points. To consider both spatial and angular properties, we first map the geometry data of a shading point onto a 5D coordinate, encompassing the 3D world-space position (xyz) and the 2D spherical angles \(\phi \) and \(\theta \) of the surface normal. We then build a bounding volume hierarchy (BVH) for all shading points in a top-down fashion. Starting from the root, we iteratively split a BVH node into two children if the difference in geometric properties within the node is larger than predefined thresholds. Unlike SD [17], which uses a single threshold to control spatial and angular splitting, we adopt separate thresholds for more intuitive control: the spatial threshold \(\sigma _\textrm{s}\) constrains the 3D extent of a node’s bounding box in meters, while the angular threshold \(\sigma _\textrm{a}\) constrains the maximum difference of surface normals in a node in degrees.

Given the 5D bounding volume of all shading points, our method first splits the BVH according to the spatial property (i.e., 3D world-space position). We replace a node with its two children if its longest world-space axis (either in x, y, or z direction) is larger than the spatial threshold \(\sigma _\textrm{s}\). Next, for all leaf nodes generated by spatial splitting, we recursively replace them with their two children if the difference of surface normals in a node is larger than the angular threshold \(\sigma _\textrm{a}\). Once the angular splitting process is complete, all the leaf nodes form the final shading point clusters.
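The two-phase splitting above can be sketched as follows. This is a simplified illustration under our own assumptions, not the paper's implementation: we use median splits with fixed thresholds, approximate the maximum normal difference in a node by twice the largest angle to the mean normal, and the names `vgc_split` and `normal_spread_deg` are ours.

```python
import numpy as np

def normal_spread_deg(normals):
    # Proxy for the maximum difference of surface normals in a node:
    # twice the largest angle to the mean normal (a conservative bound).
    m = normals.mean(axis=0)
    m /= np.linalg.norm(m) + 1e-12
    c = np.clip((normals @ m).min(), -1.0, 1.0)
    return 2.0 * np.degrees(np.arccos(c))

def vgc_split(points, normals, sigma_s, sigma_a):
    """Top-down clustering with separate spatial (meters) and angular
    (degrees) thresholds. A node first satisfies the spatial constraint,
    then the angular one, mirroring the two splitting phases."""
    ext = points.max(axis=0) - points.min(axis=0)
    axis = int(np.argmax(ext))
    if ext[axis] > sigma_s:  # spatial phase: split along the longest axis
        left = points[:, axis] <= np.median(points[:, axis])
        if 0 < left.sum() < len(points):
            return (vgc_split(points[left], normals[left], sigma_s, sigma_a)
                    + vgc_split(points[~left], normals[~left], sigma_s, sigma_a))
    if len(points) > 1 and normal_spread_deg(normals) > sigma_a:  # angular phase
        m = normals.mean(axis=0)
        m /= np.linalg.norm(m) + 1e-12
        proj = normals @ m
        left = proj <= np.median(proj)
        if 0 < left.sum() < len(points):
            return (vgc_split(points[left], normals[left], sigma_s, sigma_a)
                    + vgc_split(points[~left], normals[~left], sigma_s, sigma_a))
    return [(points, normals)]  # leaf: one shading point cluster
```

Every leaf returned by the sketch satisfies both constraints whenever a valid split exists, so the resulting clusters are directly usable as shading point clusters.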

For cache-based rendering, most of the computation time is spent on evaluating cache data. Consequently, making each cache affect as many shading points as possible improves rendering efficiency. In clustering-based methods such as SD [17] and our VGC, this can be achieved by avoiding small clusters that project to only a few pixels in the final rendered image. Unlike SD, which simply limits the cluster size with a pre-defined threshold, we propose a more robust approach that adjusts the two thresholds at each node’s splitting:

$$\begin{aligned} \begin{aligned} {\sigma }'_\textrm{s} = \sigma _\textrm{s} + (\alpha - 1) D^2 \sigma _\textrm{s}, \\ {\sigma }'_\textrm{a} = \sigma _\textrm{a} + (\alpha - 1) D^2 \sigma _\textrm{a}, \end{aligned} \end{aligned}$$
(1)

where \(D = (d - d_{{\min }})/(d_{{\max }} - d_{{\min }})\) with d, \(d_{{\min }}\), and \(d_{{\max }}\) being the depth of the node and the minimum and maximum depths of all shading points, respectively. \(\alpha \) is a parameter that controls the maximum scaling of the thresholds. We set \(\alpha \) to \(1 + \Gamma (d_{{\max }} - d_{{\min }})\), where \(\Gamma \) controls the decay of importance with respect to depth. We use \(\Gamma = 0.2\) for all scenes in this paper. Equation 1 uses the original thresholds \(\sigma _\textrm{s}\) and \(\sigma _\textrm{a}\) for shading points at depth \(d_{{\min }}\) and the scaled thresholds \(\alpha \cdot \sigma _\textrm{s}\) and \(\alpha \cdot \sigma _\textrm{a}\) for those at depth \(d_{{\max }}\). For shading points with depths in between, the thresholds are scaled in proportion to the squared normalized depth because the projected area of a cluster is inversely proportional to the square of its distance to the camera.
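In code, the depth-adaptive thresholds amount to interpolating between \(\sigma \) at depth \(d_{{\min }}\) and \(\alpha \cdot \sigma \) at depth \(d_{{\max }}\) with the squared normalized depth. A minimal sketch (the function name is ours):

```python
def scaled_thresholds(d, d_min, d_max, sigma_s, sigma_a, gamma=0.2):
    """Depth-adaptive split thresholds (Eq. 1): the original sigma at the
    nearest depth d_min, alpha * sigma at the farthest depth d_max,
    interpolated with the squared normalized depth D^2."""
    alpha = 1.0 + gamma * (d_max - d_min)
    D = (d - d_min) / (d_max - d_min)
    scale = 1.0 + (alpha - 1.0) * D * D
    return sigma_s * scale, sigma_a * scale
```

The quadratic ramp mirrors the inverse-square falloff of projected area: a node twice as far away tolerates clusters with roughly four times the relative threshold growth.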

Once the clustering is done, a cache is generated by selecting one member of each cluster. We first compute the average 3D coordinate \(p_\textrm{avg}\) and surface normal \(n_\textrm{avg}\) of all members in the cluster, adding a small jitter to avoid structural artifacts. Then the shading point closest to the average geometric properties is chosen as the cache location based on the distance metric:

$$\begin{aligned} \text {Dist}(s, avg) = \left\| p_\textrm{s} - p_\textrm{avg} \right\| + \lambda \sqrt{1 - n_\textrm{s} \cdot n_\textrm{avg}}, \end{aligned}$$
(2)

where \(p_\textrm{s}\) and \(n_\textrm{s}\) denote the 3D coordinate and surface normal of the shading point, and \(\lambda \) is a parameter that trades off the importance of the Euclidean and angular distances. We set \(\lambda \) to 0.5/L, where L is the diagonal of the scene’s bounding box [15].
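Cache selection within a cluster can be sketched as follows; the jitter magnitude and the name `select_cache` are our own illustrative choices, and `lam` corresponds to \(\lambda = 0.5/L\) above.

```python
import numpy as np

def select_cache(points, normals, lam, rng=None):
    """Return the index of the cluster member closest to the cluster's
    average position and normal under the mixed distance of Eq. 2.
    lam is typically 0.5 / L with L the scene bounding-box diagonal."""
    rng = rng or np.random.default_rng(0)
    p_avg = points.mean(axis=0)
    p_avg = p_avg + rng.normal(scale=1e-3, size=p_avg.shape)  # small jitter
    n_avg = normals.mean(axis=0)
    n_avg /= np.linalg.norm(n_avg) + 1e-12
    dist = (np.linalg.norm(points - p_avg, axis=1)
            + lam * np.sqrt(np.clip(1.0 - normals @ n_avg, 0.0, None)))
    return int(np.argmin(dist))
```

Snapping to an existing shading point (rather than using the average directly) guarantees the cache lies on a real surface location.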

In Fig. 1, we compare the cache distributions generated by our method and spatial-directional BVH (SD) [17]. We also show image-space uniform sampling (IM) for comparison. The images are ambient occlusion results rendered with the corresponding cache distribution and the VisibilityCache algorithm (presented in Sect. 4.3). IM disregards geometric properties and, as a result, produces significant errors on curved surfaces, such as the two Buddhas. For SD, we tested various constraints on the minimum size of a shading cluster. The results indicate that a large cluster size (50 for SD-50) fails to capture the high-frequency details of the distant Buddha. Conversely, a smaller cluster size (10 for SD-10) places too many caches on distant surfaces, leading to insufficient cache density in nearby regions. The red and green insets of SD-50 and the blue and yellow insets of SD-10 in Fig. 1 demonstrate such scenarios, respectively. Our view-guided clustering, VGC, strikes a better balance between 3D geometric proximity and image-space importance.

3.2 Second stage: discontinuity-guided sampling

The distribution produced in the first stage only considers geometric properties and thus cannot capture discontinuities due to shading. One example is the disappearing shadows below the front Buddha, as shown in Fig. 1. For this reason, we devise a second stage to detect regions with complex shading and redistribute a fraction of the samples to these regions.

To collect shading information, we first compute the target shading functions (ambient occlusion, irradiance values, importance functions, etc.) at the initial cache locations, depending on the caching-based application. For each cache, in addition to storing the function value, we also store the individual application-dependent gathering samples to provide more information on local shading behavior:

Irradiance caching. The irradiance value at a cache is computed by sampling the radiance from a set of directions using the equal-area spherical mapping [32]. At each cache, we store the accumulated irradiance value as well as the individual radiance value along each sampling direction.

Importance caching. We store the accumulated and individual contributions from all virtual lights at each cache as in the original method [15].

Ambient occlusion. The ambient occlusion value is computed by sampling the visibility function with a set of shadow rays using the equal-area spherical mapping [32]. At each cache, we store the ambient occlusion value as well as the individual visibility result in a binary bitmask.

It is worth noting that this process introduces only minimal memory overhead and no additional computational cost.
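For the ambient occlusion case, the per-direction visibility results are binary, so they can be packed into a bitmask, and the sum of absolute differences between two caches' gathering samples reduces to a popcount of their XOR. A sketch under these assumptions (the function names are ours):

```python
def pack_visibility(bits):
    """Pack per-direction visibility booleans into an integer bitmask."""
    mask = 0
    for i, visible in enumerate(bits):
        if visible:
            mask |= 1 << i
    return mask

def cache_difference(mask_a, mask_b):
    """Sum of absolute per-sample differences of two binary gathering
    samples: the Hamming distance, i.e. the popcount of the XOR."""
    return bin(mask_a ^ mask_b).count("1")
```

This is one reason the extra storage is cheap: a few hundred shadow-ray results fit in a handful of machine words per cache.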

After the shading function is computed at the sparse caches, we create a per-pixel discontinuity map based on the gathering samples to identify regions with large shading discontinuities. For the shading point corresponding to each pixel, we search its K nearest caches and compute the average difference \(\text {Diff}_\textrm{avg}\) and the maximal difference \(\text {Diff}_{\max }\) between these K caches. The difference between two caches is computed by summing the absolute difference of their corresponding gathering samples. \(\text {Diff}_\textrm{avg}\) and \(\text {Diff}_{\max }\) tend to be higher in regions with complex shading, such as shadow boundaries. In addition to shading discontinuities, we observed that the first stage may struggle to identify the discontinuities of the shading function at regions with small geometry features, especially when the local density of caches is insufficient. For example, VGC fails to render the ambient occlusion on the cabinets correctly in Bathroom shown in Fig. 6. To address this problem, we include a geometry term G to detect small geometry discontinuities at a shading point s:

$$\begin{aligned} G(s) = \sum _{q \in \Omega } (1 - n_\textrm{s} \cdot n_{q}), \end{aligned}$$
(3)

where q is a nearby shading point within a local window of s. n is the surface normal of the shading points. We use a \(5 \times 5\) block for the local window. Putting all terms together, the per-pixel discontinuity measurement D(s) of a shading point s is then defined as:

$$\begin{aligned} D(s) = \text {Diff}_\textrm{avg}(s) + \text {Diff}_{\max }(s) + G(s). \end{aligned}$$
(4)

\(\text {Diff}_\textrm{avg}(s)\), \(\text {Diff}_{\max }(s)\), and G(s) are normalized to the range \(\left[ 0, 1\right] \). To prevent creating caches too close to an existing one, we set the D(s) of a shading point to zero if it has already been chosen as a cache.
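Equations 3 and 4 can be sketched as follows; the wrap-around boundary handling via `np.roll` and the function names are our simplifications.

```python
import numpy as np

def geometry_term(normals):
    """Eq. 3: for each pixel, sum (1 - n_s . n_q) over a 5x5 window.
    np.roll wraps at image borders, a simplification of proper clamping."""
    G = np.zeros(normals.shape[:2])
    for dy in range(-2, 3):
        for dx in range(-2, 3):
            shifted = np.roll(np.roll(normals, dy, axis=0), dx, axis=1)
            G += 1.0 - np.sum(normals * shifted, axis=2)
    return G

def discontinuity_map(diff_avg, diff_max, G, is_cache):
    """Eq. 4: D = Diff_avg + Diff_max + G, each term normalized to [0, 1],
    zeroed at pixels already chosen as caches."""
    def norm01(x):
        lo, hi = x.min(), x.max()
        return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
    D = norm01(diff_avg) + norm01(diff_max) + norm01(G)
    D[is_cache] = 0.0
    return D
```

On a perfectly flat region the geometry term vanishes, so adaptive caches there are driven purely by the shading differences between nearby caches.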

Fig. 3

The effectiveness of Discontinuity-Guided Sampling (DGS). The left part of the image demonstrates the Statues scene rendered using ambient occlusion with 50K caches; the right part demonstrates the Robots scene rendered using importance caching with 25K caches. In both scenes, we demonstrate the discontinuity maps (the gray-scale insets), the adaptive caches generated by the DGS stage (blue and red dots on the images), and visual comparisons of VGC-only and VGC+DGS. For VGC+DGS, we shift \(15\%\) caches (7.5K for Statues and 3.75K for Robots) from the VGC stage to the DGS stage, making the total number of caches roughly the same as in VGC-only. The two discontinuity maps demonstrate that the regions with larger shading discontinuities (e.g. shadows or glossy surfaces) have larger values, thus receiving more adaptive caches. Insets showcase improvements in visual quality thanks to the DGS stage

To efficiently generate a set of caches based on the discontinuity map, we first importance sample a row of the map using its marginal density. After choosing a row, we importance sample a pixel within the row using conditional probability. The selected pixel becomes a new cache. Figure 3 demonstrates the discontinuity map, adaptive cache locations, and visual comparisons of using VGC-only and VGC+DGS on the Statues scene (left) and the Robots scene (right). In the left part of the image, the shadow regions on the objects’ surface and ground produce larger discontinuity values in the discontinuity map, thus receiving more adaptive caches. The insets showcase that the artifacts in shadow regions are reduced thanks to the increasing cache density. In the right part of the image, VGC produces large errors on the robots’ surface because its geometry-based clustering fails to capture the discontinuities of glossy shading. The discontinuity map captures the large shading discontinuities and places more adaptive caches on the robots’ surfaces. The insets showcase noise reduction by incorporating the DGS stage.
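The marginal/conditional sampling of the discontinuity map can be sketched as follows (the function name and parameters are illustrative):

```python
import numpy as np

def sample_discontinuity_map(D, n_samples, rng=None):
    """Draw pixel coordinates with probability proportional to D: first a
    row from the marginal density, then a column from the conditional
    density within that row."""
    rng = rng or np.random.default_rng(0)
    row_pdf = D.sum(axis=1) / D.sum()
    samples = []
    for _ in range(n_samples):
        r = rng.choice(D.shape[0], p=row_pdf)
        c = rng.choice(D.shape[1], p=D[r] / D[r].sum())
        samples.append((int(r), int(c)))
    return samples
```

Rows whose discontinuity values are all zero receive zero marginal probability and are never selected, so the conditional density is always well defined.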

Fig. 4

Equal-cache comparison (40K caches) of irradiance caching on CornellBox, CrytekSponza, and Conference with different cache distributions: image-space caching (IM), spatial-directional BVH (SD) [17], spatial-directional BVH with adaptive sampling (SD+AS) [16], and our method (VGC only and VGC+DGS). The full images on the left are rendered with our full method (VGC+DGS). For all cases, VGC and VGC+DGS outperform SD and SD+AS both visually and numerically. Although IM produces the lowest perceptual error in CrytekSponza, it blurs out the specular highlight of the iron chain and the surface details on the Buddhas

There are a few methods that also adopt a two-pass strategy to improve the cache distribution [8, 16]. They first compute a set of initial caches based on uniform sampling or shading point clustering [17]. Then they loop over the initial caches and examine how they differ from the neighboring caches. New caches are inserted if the estimated error is larger than a threshold. Because the refinements are performed per cache, we observed that their approaches might fail to capture the shading discontinuities when the initial caches are insufficient. Moreover, selecting the thresholds for inserting new caches is not intuitive because they are application-dependent and scene-dependent. For example, in our experiment the thresholds for SD+AS [16] to generate 2.5K adaptive caches for Statues, Robots, and the CrytekSponza scene in Fig. 4 are 6000, 93000, and 60, respectively. Compared to these methods, our view-guided clustering is more effective than the uniform sampling and the spatial-directional clustering they use in the first pass, and our importance sampling approach offers a unified manner to generate new caches at pixels with relatively large potential errors.

3.3 Run-time rendering

During rendering, we employ a strategy similar to the one used in VisibilityCache [14]. To find a small number M of caches for estimating the target function at a shading point s, we first perform a range search in the acceleration structure and locate the N spatially nearest caches in world space. After filtering out the ones whose surface normals differ from that of s by more than a threshold \(\theta _{\max }\), we compute the distance \(\text {Dist}(c^{i}, s)\) between each remaining cache \(c^{i}\) and the shading point using:

$$\begin{aligned} \text {Dist}(c^{i}, s) = \left\| p^{i}_\textrm{c} - p_\textrm{s} \right\| + \lambda \sqrt{1 - n^{i}_\textrm{c} \cdot n_\textrm{s}}, \end{aligned}$$
(5)

where \(p^{i}_\textrm{c}\), \(p_\textrm{s}\), \(n^{i}_\textrm{c}\), and \(n_\textrm{s}\) are the 3D coordinates and surface normals of the cache \(c^{i}\) and the shading point s, respectively. We decide the value of \(\lambda \) according to Eq. 2. Only the M caches with the smallest distances are used for the subsequent computation. For applications that directly interpolate cache values, such as irradiance caching (Sect. 4.1) and ambient occlusion (Sect. 4.3), the value at the shading point is computed as a weighted average of the M cache values, with the inverse of the distance defined in Eq. 5 as the weight. For importance caching (Sect. 4.2), we apply the multiple importance sampling presented in the original paper [15] using the M caches. We set N, M, and \(\theta _{\max }\) to 20, 6, and \(30^{\circ }\) for all applications and results in this paper. If no caches remain after filtering, we loosen the constraint and use all N caches as candidates for selecting the M among them.
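The lookup-and-interpolate procedure can be sketched as follows. We substitute a brute-force nearest-neighbor search for the acceleration structure, so the structure of the code, not its performance, is the point; the function name is ours.

```python
import numpy as np

def interpolate_at(p_s, n_s, cache_pos, cache_nrm, cache_val,
                   lam, N=20, M=6, theta_max_deg=30.0):
    """Estimate the cached function at shading point s: locate the N
    spatially nearest caches, drop those with dissimilar normals, keep
    the M closest under Eq. 5, and inverse-distance-weight their values."""
    near = np.argsort(np.sum((cache_pos - p_s) ** 2, axis=1))[:N]
    keep = near[cache_nrm[near] @ n_s >= np.cos(np.radians(theta_max_deg))]
    if len(keep) == 0:  # no similar caches: loosen the constraint
        keep = near
    dist = (np.linalg.norm(cache_pos[keep] - p_s, axis=1)
            + lam * np.sqrt(np.clip(1.0 - cache_nrm[keep] @ n_s, 0.0, None)))
    order = np.argsort(dist)[:M]
    w = 1.0 / (dist[order] + 1e-8)  # inverse-distance weights
    return float(np.sum(w * cache_val[keep][order]) / np.sum(w))
```

The small epsilon in the weights keeps the estimate finite when the shading point coincides with a cache, in which case that cache's value dominates.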

4 Experiments

We integrate our method into three cache-based techniques: irradiance caching [1], importance caching [15], and ambient occlusion using visibility caching [14], with implementations on top of the PBRT3 system [33]. We conduct equal-cache comparisons of various cache distributions, including image-space uniform sampling (IM), spatial-directional BVH (SD) [17], our view-guided clustering (VGC) only, and our full two-stage method (VGC+DGS). We also compare to the adaptive sampling approach proposed by Yoshida et al.[16] (SD+AS), although the method was initially designed for importance caching. All images in this paper are rendered at a resolution of 1920 \(\times \) 1080 with 4 sub-pixel samples for antialiasing, generated on a desktop with an Intel i7-10700 CPU at 2.90 GHz, 72GB of RAM, and using 16 threads. Reference images are rendered by computing the shading function at all pixels. We evaluate all methods using root-mean-square error (RMSE) and a perceptual metric, FLIP [21].

Table 1 Parameters used in the paper

Parameters. Table 1 lists the parameters and their values used in our method. Among them, N, M, and \(\theta _{{\max }}\) are used by all cache-based methods. Default values work well for these parameters, as well as for \(\Gamma \) and R in our method, and \(\sigma _\textrm{s}\) and \(\lambda \) can be derived automatically. The only two parameters to determine are the total number of caches \(N_\textrm{c}\) and the angular threshold \(\sigma _\textrm{a}\). We suggest setting \(\sigma _\textrm{a}\) according to the target total number of caches \(N_\textrm{c}\); we use \(15^{\circ }\) for 120K caches and \(25^{\circ }\) for 25K and 40K caches, respectively. Given \(\sigma _\textrm{a}\), our method then determines the spatial threshold \(\sigma _\textrm{s}\) to meet the total number of caches using a binary search process. Unless \(N_\textrm{c}\) or \(\sigma _\textrm{a}\) changes, the binary search only has to be performed once.
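The binary search for \(\sigma _\textrm{s}\) can be sketched as follows, assuming the cache count is a monotonically non-increasing function of the spatial threshold; the bracket, tolerance, and function name are our own choices.

```python
def find_sigma_s(count_caches, n_target, lo=1e-3, hi=100.0, tol=0.02, iters=30):
    """Binary-search the spatial threshold so the clustering yields roughly
    n_target caches; count_caches(sigma_s) must be non-increasing in sigma_s."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        n = count_caches(mid)
        if abs(n - n_target) <= tol * n_target:
            break
        if n > n_target:  # too many caches: enlarge clusters
            lo = mid
        else:             # too few caches: shrink clusters
            hi = mid
    return mid
```

Each iteration reruns the clustering with a candidate threshold, so in practice the loop is bounded by a fixed iteration count rather than run to exact convergence.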

For SD, we tune the threshold value for BVH splitting to meet the desired number of caches \(N_\textrm{c}\). We also tested various constraints for the minimum cluster size and selected the one that resulted in the lowest error. If the RMSE and FLIP values conflict, we choose the size that yields better visual quality.

For SD+AS and our VGC+DGS, we redistribute a ratio R of the cache budgets \(N_\textrm{c}\) from the first stages (SD and VGC, respectively) to the second stage. We use \(R=15\%\) for all scenes and applications in this paper.

4.1 Irradiance caching

We compute the irradiance values at the cache locations using 4,096 sampling rays generated by equal-area spherical mapping [32] prior to rendering. During rendering, we first estimate the irradiance value at a shading point with the procedure introduced in Sect. 3.3; the irradiance value is then multiplied by a directional reflectance term to support moderately glossy materials.

Figure 4 shows the equal-cache comparisons (40K caches) of the indirect lighting results rendered with various cache distributions. For all scenes, IM blurs out the geometry details on objects with complex shapes. It also produces large errors on objects with fine geometry, such as the iron chains in CrytekSponza, and on distant objects like the curtains in Conference. By considering the geometric properties of shading points, SD, SD+AS, and our methods (VGC and VGC+DGS) improve the shading on curved surfaces. In CornellBox, the differences between the results produced by SD, SD+AS, VGC, and VGC+DGS are subtle due to the simple layout; SD+AS, VGC, and VGC+DGS produce better results on the face of the Buddha model (red inset). The scene layout in CrytekSponza presents a difficult case for SD and SD+AS: they place too many samples on the curved surfaces, resulting in apparent artifacts for the shadows on the ground due to insufficient caches. Although IM produces the lowest perceptual error, it renders incorrect highlights on the iron chain and yields the highest RMSE. In Conference, SD and SD+AS smooth out the details of the distant bunny and produce artifacts on the curtains by using a larger cluster size to reduce the global error. On the other hand, VGC and VGC+DGS obtain lower errors while preserving the rendering quality of the distant bunnies and curtains. This demonstrates that our method strikes a better balance for allocating samples under various scene conditions. Our full method does not produce apparent improvements over VGC alone in irradiance caching; the reason is that the indirect illumination is relatively smooth, so the re-allocation of samples has little impact.

Fig. 5

Equal-cache comparison (25K caches) of importance caching on CornellBox, Sibenik, DiningRoom and Robots with different cache distributions: image-space caching (IM), spatial-directional BVH (SD) [17], spatial-directional BVH with adaptive sampling (SD+AS) [16], and our method (VGC only and VGC+DGS). The full images on the left are rendered with our full method VGC+DGS. The proposed method offers a more accurate sampling function estimation and outperforms other approaches both visually and numerically

4.2 Importance caching

Importance caching [15] computes global illumination based on the virtual point light formulation [22]. We convert direct and indirect illumination into approximately 8K virtual point lights. At each cache location, we compute the contributions from all virtual lights and build four importance sampling distributions, ranging from aggressive to conservative. During rendering, the sampling distributions stored at the nearest M caches are combined with a bilateral multiple-importance sampling scheme at each shading point to generate four light samples. Based on the findings of the ablation study discussed in Sect. 4.4, we exclude the geometry term in Eq. 4 for importance caching. Figure 5 demonstrates four example scenes rendered by importance caching with 25K caches under various cache distributions. Disregarding the geometric properties, IM produces the most noise on fine geometry among all methods. Similar to the case of CrytekSponza in irradiance caching, SD and SD+AS face difficulties in Sibenik because they fail to strike a balance between spatial and angular properties; as a result, they produce noticeable artifacts on the stairs due to poor cache density. Thanks to better cache locations, our method offers more accurate sampling functions and reduces noise and artifacts globally. In the Robots scene, IM produces more noise in distant regions due to a lower cache density, and SD and SD+AS suffer from stronger noise in shadows. VGC produces a larger error on the robot's glossy surfaces because of the complex shading on simple geometry. The DGS stage in our full method compensates for the insufficient cache density on the robot's surface and produces better results.
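To make the sampling machinery concrete, the following sketch builds a discrete distribution over VPL contributions at one cache and combines the distributions of nearby caches as a uniform mixture. This is a simplified stand-in: the actual method of [15] stores four distributions per cache and combines them with a bilateral multiple-importance sampling scheme, which we do not reproduce here.

```python
import bisect

def build_light_distribution(contributions):
    """Discrete PDF/CDF over VPLs from their estimated (non-negative)
    contributions at a cache location."""
    total = sum(contributions)
    pdf = [c / total for c in contributions]
    cdf, acc = [], 0.0
    for p in pdf:
        acc += p
        cdf.append(acc)
    return pdf, cdf

def sample_vpl(pdf, cdf, u):
    """Invert the CDF at u in [0, 1); returns (vpl_index, pdf_value)."""
    i = min(bisect.bisect_left(cdf, u), len(pdf) - 1)
    return i, pdf[i]

def mixture_pdf(cache_pdfs, i):
    """PDF of VPL i under a uniform mixture of the distributions stored
    at the M nearest caches -- a simplification of bilateral MIS."""
    return sum(p[i] for p in cache_pdfs) / len(cache_pdfs)
```

Sampling with the mixture PDF concentrates light samples on the VPLs that contributed most at the surrounding caches, which is what gives cached importance functions their variance reduction.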

Fig. 6

Equal-cache comparison of cache-based ambient occlusion on Statues, Bathroom, Conference, and Kitchen with different cache distributions: image-space uniform sampling (IM), spatial-directional BVH (SD) [17], spatial-directional BVH with adaptive sampling (SD+AS) [16], and our method (VGC only and VGC+DGS). Statues uses 50K caches, while the others use 120K. The full images on the left are rendered with our full method VGC+DGS. Our method produces the most visually pleasing results and the smallest RMSE and LIP errors in all cases. Our discontinuity-guided sampling improves the results in regions with shadows and small geometry details

4.3 Ambient occlusion

Visibility caches have been shown to produce noise-free ambient occlusion at a small fraction of the computation cost [14]. In our implementation, we trace 1024 shadow rays generated by equal-area spherical mapping [32] to compute the ambient occlusion value at each cache location. Figure 6 demonstrates equal-sample comparisons of various cache distributions. We use 50K caches for Statues and 120K caches for Bathroom, Conference, and Kitchen. Disregarding scene geometry, IM produces the largest error in all test cases. SD and SD+AS outperform IM by taking geometry into consideration. However, they tend to spend too much effort on curved surfaces, leading to insufficient cache density in the spatial domain. VGC shows a great improvement over SD in Bathroom and Conference. Nonetheless, it struggles to capture occlusion changes caused by fine geometry, resulting in larger errors on the cabinets in Kitchen. This issue is addressed by introducing DGS, which improves the image quality on geometry details such as the cabinets in Kitchen and Bathroom. In Statues, SD, SD+AS, and VGC all produce artifacts near the shadows on the ground because they cannot capture the shading change on simple geometry. Our full method reduces these shadow artifacts and produces the most visually pleasing result.
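At render time, the cached occlusion values must be reconstructed at every shading point. The sketch below shows one common weighting scheme, combining spatial distance with normal agreement; the exponential weight and the `sigma_d` parameter are illustrative assumptions in the spirit of irradiance-caching interpolation, not the exact reconstruction formula of [14].

```python
import math

def interpolate_ao(point, normal, caches, k=4, sigma_d=0.1):
    """Estimate ambient occlusion at a shading point from the k nearest
    visibility caches, weighting each cache by spatial proximity and by
    agreement of the surface normals. Each cache is a dict with keys
    "pos", "normal", and "ao"."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(caches, key=lambda c: dist2(point, c["pos"]))[:k]
    num = den = 0.0
    for c in nearest:
        n_dot = max(0.0, sum(a * b for a, b in zip(normal, c["normal"])))
        w = n_dot * math.exp(-dist2(point, c["pos"]) / (sigma_d ** 2))
        num += w * c["ao"]
        den += w
    return num / den if den > 0.0 else 0.0
```

The normal term suppresses caches lying across geometric discontinuities, which is why cache placement that respects surface orientation matters so much for the final quality.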

4.4 Ablation studies

Table 2 Comparisons of various components of the discontinuity-guided sampling for ambient occlusion (AO), irradiance caching (IrC), and importance caching (ImC). DGS (diff) and DGS (geom) denote the settings that construct the discontinuity map using, respectively, the difference between the application-dependent samples and the geometry term; DGS (all) uses both terms. We also compare against the configuration that combines our view-guided clustering with adaptive sampling [16] (VGC+AS)

Various components vs. accuracy. In Table 2, we present the performance of the various components used in discontinuity-guided sampling. The setting that constructs the discontinuity map using only the difference between the application-dependent samples is denoted by DGS (diff), while DGS (geom) uses the geometry term only. The full method that combines both terms is denoted by DGS (all). To compare the effectiveness of discontinuity-guided sampling with adaptive sampling [16], we also include a configuration that combines our view-guided clustering with adaptive sampling, denoted by VGC+AS. For irradiance caching and ambient occlusion, the results demonstrate that both the sample-difference term and the geometry term are crucial, depending on the scene: the geometry term plays an essential role in scenes containing many small geometry features, such as Bathroom, while the difference term has a more significant impact in scenes with many flat surfaces, such as Conference. Our full method effectively combines the strengths of both terms. We found that including the geometry term does not lead to better results for importance caching. One possible reason is that the cached values are used as sampling functions rather than being directly interpolated into pixel values, which diminishes the advantage of incorporating the geometry term. In all applications, view-guided clustering combined with discontinuity-guided sampling outperforms the combination with adaptive sampling (VGC+AS).
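As a rough illustration of the second stage, the sketch below builds a per-pixel discontinuity score from the two terms discussed above and importance-samples new cache locations from it. The `alpha` weight and the max-over-neighbors difference are hypothetical simplifications of the paper's heuristic combination, not its exact formulation.

```python
import random

def discontinuity_map(values, geometry, alpha=1.0):
    """Per-pixel discontinuity score combining the difference between
    neighbouring cached values (diff term) with a small-geometry
    indicator (geom term), weighted by the illustrative knob `alpha`."""
    h, w = len(values), len(values[0])
    dmap = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            diff = 0.0
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    diff = max(diff, abs(values[y][x] - values[ny][nx]))
            dmap[y][x] = diff + alpha * geometry[y][x]
    return dmap

def sample_new_caches(dmap, budget, rng=None):
    """Draw `budget` new cache pixels with probability proportional to
    the discontinuity score (zero-score pixels are never selected)."""
    rng = rng or random.Random()
    flat = [(y, x) for y, row in enumerate(dmap) for x in range(len(row))]
    weights = [dmap[y][x] for (y, x) in flat]
    return rng.choices(flat, weights=weights, k=budget)
```

Because the map is driven by the first-stage results, the adaptive samples naturally cluster around shadow boundaries and fine geometry, which matches the behaviour reported for DGS (all).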

Fig. 7

Relationship between the number of caches and the RMSE and LIP errors on the Staircase scene (Fig. 1) rendered with ambient occlusion. The plots show that our method is consistently more efficient than the other methods

Fig. 8

Relationship between parameter selection and the RMSE and LIP errors on the Sibenik scene rendered with importance caching. We adjust the minimum cluster size for SD and the angular threshold for VGC. The plots show that our method is less sensitive to the selection of parameters

Number of caches vs. accuracy. Figure 7 shows the relationship between the number of caches and two metrics (RMSE and LIP) on Staircase rendered with ambient occlusion. Our method VGC consistently produces better results than previous methods, from preview quality with 10K caches to high-quality rendering with 100K caches. Furthermore, the figure highlights that our two-stage method (VGC+DGS) further reduces numerical errors by utilizing partial rendering results.

Parameters vs. accuracy. We conducted an experiment to investigate how parameter choices affect the numerical errors of SD and our VGC using 25K importance caches on the Sibenik scene. For SD, we adjust the minimum cluster size, which controls the relative importance of world-space geometric proximity and image-space importance. For VGC, we adjust the angular threshold, which balances angular and spatial proximity: a smaller angular threshold allocates more caches on curved surfaces at the expense of sparser samples in the spatial extent. The results in Fig. 8 indicate that VGC consistently produces better results than SD over a wide range of parameters. Furthermore, VGC is less sensitive to parameter selection, as its results fluctuate less.

Sample budgets for VGC and DGS. To investigate the appropriate sample budgets for VGC and DGS, we render the ambient occlusion of the Staircase scene with 80K caches and compute the global illumination of the DiningRoom scene with 25K importance caches, moving various ratios of samples from VGC to DGS. The results are presented in Table 3. Although the discontinuity-guided sampling pass improves the results compared to using VGC only (0% in the table), it relies on the information from the first stage (i.e., VGC); an excessively high allocation of samples to DGS could therefore lead to sub-optimal results. Our experiments suggest that the best ratio for the sample budget is between 10% and 20%.

4.5 Performance profiling

Table 3 Comparisons of various fractions of the total cache budget moved from VGC to DGS. The last column (0%) means using VGC only. Based on the experiment results, we suggest using a ratio between 10% and 20%
Table 4 Per-step timing (in seconds) of various methods. The initial cache computation of our method is shorter than that of the other single-pass methods (IM and SD) because we redistribute 15% of the cache budget to the second (adaptive) stage. The adaptive cache generation time of our method includes building the discontinuity map and drawing adaptive samples via importance sampling

Table 4 shows the detailed timing of each step of the various methods on CrytekSponza (irradiance caching), Sibenik (importance caching), and Bathroom (ambient occlusion). Compared to SD, our method produces better results with negligible overhead in all applications. Compared to IM, the overhead of improving the cache distribution is slightly larger in importance caching and ambient occlusion than in irradiance caching because computing the cache data is less expensive there. However, considering the visual quality improvements over IM in Figs. 5 and 6, and the numerical improvements demonstrated in the ablation study on the number of caches vs. accuracy (Fig. 7), the overhead of improving the cache distribution is worthwhile.

4.6 Limitations and future work

Our discontinuity-guided sampling method is effective, but it combines cache differences and small geometry features using a heuristic approach. As a result, a large geometry term might lead to oversampling of regions where the differences between caches are already small. Although we can introduce a hyperparameter to emphasize the relative importance of the two terms, this control is not straightforward. Therefore, we are interested in investigating whether a data-driven approach utilizing the computation results obtained in the first stage can improve the performance.

Another interesting future direction is to extend our method to the temporal domain for rendering animations; we plan to design criteria for reusing caches across frames. Finally, we would also like to develop a GPU version of the proposed method for real-time rendering.

5 Conclusion

In this paper, we propose a new algorithm for improving the sample distribution of cache-based rendering. Our method comprises two stages. In the first stage, an initial cache distribution is determined by a new shading-point clustering algorithm called view-guided clustering, which strikes a more robust balance between geometric proximity and image-space importance and whose parameters are more intuitive to set than those of previous methods. The second stage further improves the cache distribution by utilizing the computation results at the initial caches, allocating additional caches in regions with complex shading. The result is an efficient algorithm for determining cache locations for cache-based techniques. Experiments show that the proposed method offers significant improvements over other strategies for generating caches.