1 Introduction

We are concerned with the tracking of regions defined by merge trees. In [12], we devised a method that tracks the superlevel or sublevel sets of a scalar field as defined by the subtrees of the merge tree. However, once these regions have been extracted in each time step, we disregard their origin and record tracking information such as overlap and histogram similarity in a directed acyclic graph (DAG). Its nodes are the regions (Figs. 1 and 2). Overlapping and similar regions in consecutive time steps are connected by an edge, weighted by the amount of overlap and similarity. We solve a shortest path problem to track a region over time. This global approach to tracking avoids the issues that arise from purely local decisions, as shown in Fig. 1.

Fig. 1

Tracking regions solely based on local decisions leads to broken tracks. In this simple example, a small fluctuation between time steps t_3 and t_5 causes the creation of a region C that has significant overlap and similarity with region A. Assigning the locally best match neglects that there can be more than one suitable track between two time steps (e.g., between t_5 and t_6), and causes tracks to break. A graph structure as illustrated in Fig. 3a and b helps circumvent this problem

Fig. 2

Illustration of an objective function that does not satisfy the condition expressed in Eq. (4). If f signifies the standard deviation of the weights along a path, f(CABD) < f(CFGD) but f(CABD ∪ DE) > f(CFGD ∪ DE)

Fig. 3

Several nodes in the graph can have the same shortest path. Hence, running Dijkstra's algorithm for every node independently would be expensive and redundant. (a) Dijkstra's algorithm finds the shortest path through the DAG, which represents the track of this region. Several regions may have the same source and sink and result in the same shortest path. In this example, the shortest path starting at source A_1 and ending at sink A_7 is common to all nodes with a bold outline. The path is shown as a blue band. (b) The shortest path through the nodes with a bold green outline has the same source-sink pair (A_1, A_7) as in (a). It is given by the green band. The shortest path through the nodes with a bold red outline is given by the magenta band

In [12], we present, among other things, a method for tracking a single region using the DAG. This is done by computing a shortest path from the given node backward to a reachable source and another one forward to a reachable sink, and combining the two. This, however, is not necessarily how one would define a shortest path via a node. In this paper, we define a shortest path via a node as the path with the smallest objective function value among all paths starting at a source, passing through the given node, and ending at a sink. An objective function can be any function that assigns a score to a path based on how well it represents the evolution of a particular feature along that path. Under this definition of a shortest path, the previous method of combining backward and forward shortest paths may not work.

In this work, we extend the previous work and present a non-trivial solution to tracking all regions from all time steps, i.e., a method for extracting all feature tracks. The trivial solution is to iterate over all nodes of the DAG and execute the single region tracking algorithm from [12]. However, we will show in this paper how this leads to very long running times. Our approach is up to two orders of magnitude faster. Our method employs a shortest path algorithm but differs considerably from the standard Dijkstra algorithm and from the Floyd-Warshall algorithm for all-pairs shortest paths. Since our DAG contains only temporal edges, i.e., edges between nodes of two successive time steps, it admits better runtime bounds than these standard algorithms.

2 Related Work

The sheer size of time-dependent data sets often necessitates a data reduction step before an efficient analysis can take place. It is therefore a common approach to extract and track features.

Many methods track topological structures. Tricoche et al. [23] track critical points and other topological structures in 2D flows by exploiting the linearity of the underlying triangle grid. Garth et al. [4] extend this to 3D flows. Theisel and Weinkauf [18, 26] develop feature flow fields as a general concept to track many different features independent of the underlying grid. Reininghaus et al. [11] extend this idea to the discrete setting.

In the area of time-dependent scalar fields, several methods exist to track and visualize topological changes over time. Samtaney et al. [15] provide one of the first algorithms to track regions in 3D data over time using overlap. Kettner et al. [7] present a geometric basis for the visualization of time-varying volume data of one or several variables. Szymczak [17] provides a method to query different attributes of contours as they merge and split over a certain time interval. Sohn and Bajaj [16] present a tracking graph of contour components of the contour tree and use it to detect significant topological and geometric evolutions. Bremer et al. [2] provide an interactive framework to visualize the temporal evolution of topological features.

Other methods for tracking the evolution of merge trees, such as the method by Oesterling [8], track changes to the hierarchy of the tree. This comes at the price of a very high computation time: the runtime complexity is polynomial in the data size, more precisely O(n^3) with n being the number of voxels. However, the method tracks the full merge tree instead of just critical points or super-arcs.

Vortex structures are another important class of features that can be tracked in time-dependent flows. Reinders et al. [10] track them by solving a correspondence problem between time steps based on the attributes of the vortices. Bauer and Peikert [1] and Theisel et al. [19] provide different methods for tracking vortices defined by swirling stream lines. This notion was extended later to include swirling path lines [25], swirling streak and time lines [27], swirling trajectories of inertial particles [5] and rotation invariant vortices [6].

Pattern matching was originally developed in the computer vision community and has inspired a number of visualization methods. Examples are pattern matching methods for vector fields based on moment invariants as proposed by Bujack et al. [3], or pattern matching for multi-fields based on the SIFT descriptor as proposed by Wang et al. [24].

A similar, though technically rather different, line of research is the analysis of structural similarity in scalar fields, which has gained popularity recently. Thomas and Natarajan detect symmetric structures in a scalar field using either the contour tree [20], the extremum graph [21], or by clustering contours [22]. Saikia et al. compare merge trees by means of their branch decompositions [13] or by means of histograms over parts of the merge tree [14]. Our method outputs a set of best tracks of topologically segmented structures in a spatio-temporal setting, and enables an all-to-all temporal pattern matching scheme using techniques like dynamic time warping.

3 Method

In the following, we will first briefly recapitulate the tracking method for single regions of [12], and then present our new and fast approach for tracking all regions.

3.1 Tracking Merge Tree Regions using a Directed Acyclic Graph

We are given a time-dependent scalar field. It may have any number of spatial dimensions; our implementation supports 2D and 3D. A merge tree is computed for each time step independently. After an optional simplification, all subtrees (as defined in [13]) are converted into a set of nodes to be used within the directed acyclic graph (DAG). They represent the components of the superlevel or sublevel sets of the scalar field and are connected regions in the domain.

All overlapping nodes from consecutive time steps are connected via edges in the DAG. Their weights represent local tracking information in the sense that a lower edge weight indicates a higher likelihood for the two connected regions to be part of the same track. We use a linear combination of a volume overlap distance and a histogram difference to compute these weights. The volume overlap distance d_o between two non-empty regions a and b is determined from the number of voxels they have in common and the total number of voxels covered by both regions:

d_o(a, b) = 1 − |a ∩ b| / |a ∪ b|     (1)

The chi-squared histogram distance (see, e.g., [9]) between two regions a and b is defined as

d_s(a, b) = (1/2) Σ_i (h_{a,i} − h_{b,i})^2 / (h_{a,i} + h_{b,i})     (2)

where h_{a,i} and h_{b,i} denote the bins of the histograms h_a and h_b, respectively. Here, the histograms record the vertices contained in a region, as described in [14].

Our combined distance measure for an edge is given by d = λ d_s + (1 − λ) d_o, where λ ∈ [0, 1] is a tunable parameter. The DAG can now either be used for the next step as is, or it can be further thresholded to weed out edges with extremely large weights (for instance, in Fig. 3a the edges between the green and pink nodes have been removed).
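To make the edge weight computation concrete, here is a minimal Python sketch. It assumes regions are given as sets of voxel indices and histograms as NumPy arrays; the function names and the exact normalizations follow our reading of Eqs. (1) and (2) and are illustrative rather than the implementation of [12].

```python
import numpy as np

def overlap_distance(voxels_a, voxels_b):
    """Volume overlap distance d_o (cf. Eq. (1)): one minus the fraction of
    shared voxels relative to all voxels covered by either region."""
    common = len(voxels_a & voxels_b)
    total = len(voxels_a | voxels_b)
    return 1.0 - common / total          # 0 for identical regions, 1 for disjoint ones

def chi_squared_distance(h_a, h_b):
    """Chi-squared histogram distance d_s (cf. Eq. (2))."""
    denom = h_a + h_b
    mask = denom > 0                     # skip empty bins to avoid division by zero
    return 0.5 * float(np.sum((h_a[mask] - h_b[mask]) ** 2 / denom[mask]))

def edge_weight(voxels_a, voxels_b, h_a, h_b, lam=0.5):
    """Combined edge weight d = lam * d_s + (1 - lam) * d_o with tunable lam in [0, 1]."""
    return lam * chi_squared_distance(h_a, h_b) \
        + (1.0 - lam) * overlap_distance(voxels_a, voxels_b)
```

The default lam = 0.5 is an arbitrary illustration; the paper only states that λ is tunable.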

We track a region by solving a shortest path problem with Dijkstra's algorithm on the DAG. The method in [12] does this for one region at a time. From the selected region, a shortest path is found to a source in an earlier time step, and another shortest path is found to a sink in a later time step. Combining these paths yields the track of the region. We define a path as a sequence of successive directed edges in the graph. Since there is at most one directed edge between any two nodes in the graph, a path can equivalently be described by the sequence of nodes connecting these edges. Source and sink refer in this context to nodes that have no incoming or outgoing edges, respectively. We discuss this in more detail in the next section.

3.2 Objective Function and Its Validity with Dijkstra’s Algorithm

The classic Dijkstra algorithm finds the shortest path by summing up the edge weights along the path. Applying this directly to our setting would yield unsuitable tracks: instead of following a long path with many likely edges, the tracking would rather choose an unlikely edge to an immediate sink.

Hence, we use a measure assessing the average edge weight along a path. The goal is to find the path through a given node that has the smallest normalized squared sum of edge weights d_i:

f(P) = (1 / |P|) Σ_i d_i^2     (3)

where |P| denotes the number of edges of the path P.

The purpose of this section is to demonstrate that Dijkstra's algorithm can be used to optimize this objective. To do so, let us define an objective function f that assigns a non-negative score to a path and satisfies the following condition:

Condition 1

Consider two paths P_1 and P_2 with f(P_1) ≤ f(P_2). We require the objective function to maintain this relationship after adding an edge e:

f(P_1 ∪ e) ≤ f(P_2 ∪ e)     (4)

Dijkstra’s algorithm can only be used to optimize an objective if this condition is fulfilled, since the condition allows a solution to be built incrementally, which is the essential cornerstone of Dijkstra’s algorithm.

The objective function used in the classic Dijkstra shortest path algorithm is the sum of the weights of all edges in a path P, i.e., f(P) = Σ_i d_i. This function trivially satisfies the above condition. An objective function that does not satisfy it is the standard deviation of the weights, as shown in Fig. 2. Thus, not all objective functions that determine the quality of a path can be used with Dijkstra’s algorithm.

Regarding the objective function (3), we note that it can be optimized with Dijkstra’s algorithm if |P_1| = |P_2| holds, i.e., the two paths compared in Condition 1 are of equal length. This keeps the denominator of (3) equal, and the numerators are just sums of values consistent with Condition 1. The condition always holds in our setting, since edges connect only consecutive time steps and we start Dijkstra’s algorithm at a particular source, which keeps all considered paths at equal length. Hence, Dijkstra’s algorithm can be used to solve (3).
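To make Condition 1 tangible for the objective of Eq. (3), the following minimal Python sketch (our own illustration, not part of the paper's implementation) evaluates the objective and brute-force checks the condition for equal-length paths; the standard-deviation objective from Fig. 2 would fail an analogous check.

```python
from itertools import product

def objective(weights):
    """Normalized squared sum of edge weights along a path, cf. Eq. (3)."""
    return sum(d * d for d in weights) / len(weights)

def condition_holds(p1, p2, e):
    """Condition 1 / Eq. (4): the ordering of two paths must survive appending edge e."""
    if objective(p1) <= objective(p2):
        return objective(p1 + [e]) <= objective(p2 + [e])
    return True  # the condition only constrains the ordered case

# Brute-force check over a small grid of equal-length paths and extension edges.
grid = [0.1, 0.5, 1.0, 2.0]
print(all(condition_holds(list(p1), list(p2), e)
          for p1 in product(grid, repeat=3)
          for p2 in product(grid, repeat=3)
          for e in grid))   # True: for equal-length paths, Eq. (3) is Dijkstra-compatible
```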

3.3 Algorithm for Finding All Paths

Tracking a single node in the DAG is done by finding the shortest path P through that node from any source to any sink of the DAG. The shortest paths through other nodes may coincide with P. This is illustrated in Fig. 3. Hence, to find the shortest paths through all nodes, running a naïve Dijkstra search for every node independently would be expensive and redundant.

Instead, we run Dijkstra’s algorithm for every source and sink (in a joint fashion in two passes, see below), record the gathered information at every node, and stitch this information together to obtain the shortest path for every node.

To facilitate this, we define a function to incrementally compute the objective function in (3). We denote this new function by the symbol ⊕ and call it the incremental path operator. The incremental path operator takes as input the objective value for a path P and a connecting node n, and computes the objective value for the extended path P ∪ n. If the weight of the connecting edge between P and n is given by d, ⊕ is defined as follows:

f(P) ⊕ n = f(P ∪ n) = ( |P| · f(P) + d^2 ) / ( |P| + 1 )     (5)
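Under the same assumptions as above, a minimal sketch of the incremental path operator: a partial path is represented by its current objective value and its edge count, so Eq. (5) becomes a running update of the normalized squared sum.

```python
def extend(path_value, path_len, d):
    """Incremental path operator (cf. Eq. (5)): append one edge of weight d to a
    path that has path_len edges and objective value path_value."""
    new_len = path_len + 1
    new_value = (path_value * path_len + d * d) / new_len
    return new_value, new_len

# Extending the empty path edge by edge reproduces the objective of Eq. (3).
value, length = 0.0, 0
for d in (0.2, 0.4, 0.1):
    value, length = extend(value, length, d)
print(value)   # equals (0.2**2 + 0.4**2 + 0.1**2) / 3
```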

Furthermore, all nodes are topologically sorted. That is, for a node n_{p,i} at timestep t = p and another node n_{q,j} at timestep t = q, node n_{p,i} occurs before node n_{q,j} in the sorted order if p < q.

Our algorithm works as follows. We make two passes through this list of sorted nodes: one in the sorted order (from past time steps to future time steps) and one in the reverse order. During the first pass, at every node, the best path from every reachable source to that node is recorded. This is done by checking all incoming edges to that node and incrementally computing, over all incoming edges, the best path from each reachable source. This is possible because all nodes connected to the incoming edges have already been processed earlier (they live at the previous time step). Consider a node n_i with some incoming edges as illustrated in Fig. 4. The best score from any given source s to n_i is calculated using:

f(P_{s → n_i}) = min_{n_j} ( f(P_{s → n_j}) ⊕ n_i ),     (6)

where the minimum is taken over all incoming neighbors n_j of n_i that are reachable from source s, and P_{s → n_j} denotes the best path from s to n_j.
Fig. 4

Illustration of Algorithm 1. For every node n_i in the DAG, the lowest cost (and the corresponding best neighbor) to every reachable source is computed iteratively and stored in an associative map. In this figure, for example, the best path from n_i to source s_1 is via its neighbor n_{i−1,1}. Similarly, the lowest costs to all reachable sinks are stored in a second map. After these values are computed, the best source-sink pair (s, k) with the lowest cost is determined using Eq. (8), and the best path from s to k passing through n_i is traced out. All nodes lying on this path that have the same best source-sink pair (s, k) need not be processed, as the best path through any such node is this path itself. The final output is the set of all paths passing through every node in the DAG

Algorithm 1 shows the pseudo-code for the first pass described above. The second pass is equivalent to the first, but operates on the DAG with its edges reversed. We now record the best path to every reachable sink. This is done by checking the outgoing edges and the sinks they lead to. The best score from n_i to any given sink k is calculated using:

f(P_{n_i → k}) = min_{n_j} ( f(P_{n_j → k}) ⊕ n_i ),     (7)

where the minimum is taken over all outgoing neighbors n_j of n_i from which the sink k is reachable.
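The sketch below outlines the first pass in Python under our own data-structure assumptions: nodes are processed in time order, and each node keeps an associative map from every reachable source to the best objective value, the edge count, and the predecessor on that best route. The second pass is identical on the DAG with edges reversed. None of these names come from the paper's implementation.

```python
def forward_pass(nodes, incoming, weight):
    """Sketch of Algorithm 1 (first pass).

    nodes    -- node ids, topologically sorted by time step
    incoming -- dict: node -> list of predecessors (nodes of the previous time step)
    weight   -- dict: (predecessor, node) -> edge weight d
    Returns a dict: node -> {source: (objective value, edge count, predecessor)}.
    """
    best = {}
    for n in nodes:
        best[n] = {}
        if not incoming.get(n):                  # no incoming edges: n is itself a source
            best[n][n] = (0.0, 0, None)
            continue
        for pred in incoming[n]:
            d = weight[(pred, n)]
            for src, (val, length, _) in best[pred].items():
                cand = (val * length + d * d) / (length + 1)    # apply Eq. (5)
                if src not in best[n] or cand < best[n][src][0]:
                    best[n][src] = (cand, length + 1, pred)     # minimum of Eq. (6)
    return best
```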

Let S_i be the set of all sources from which node n_i can be reached, and K_i the set of all sinks reachable from n_i. After the two passes are complete, the combined best path for every node is calculated by choosing the paths from the source-sink pair which minimizes the objective function on the combined path as follows:

(s*, k*) = argmin_{s ∈ S_i, k ∈ K_i} ( |P_{s → n_i}| · f(P_{s → n_i}) + |P_{n_i → k}| · f(P_{n_i → k}) ) / ( |P_{s → n_i}| + |P_{n_i → k}| )     (8)

Algorithm 2 shows the pseudo-code to obtain all best paths. It can be observed that, if for any given node n_i the best source-sink pair is (s_i, k_i) and the extracted best path is P_i, then all nodes lying on this path that have the same best source-sink pair (s_i, k_i) will trace out the exact same path. Hence, while determining unique paths in our solution, we can avoid tracing paths from all such nodes. See Fig. 4 for an illustration.
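A sketch of the combination step, again under the assumptions above: for every node we select the source-sink pair minimizing the objective of the concatenated path as in Eq. (8), trace that path once, and mark all nodes on it so they are skipped if they later select the same pair.

```python
def best_paths(nodes, fwd, bwd):
    """Sketch of Algorithm 2: shortest paths via every node, each traced only once.

    fwd, bwd -- results of forward_pass on the DAG and on the reversed DAG:
                node -> {source or sink: (objective value, edge count, next hop)}
    Returns a list of (objective value, node sequence) pairs.
    """
    paths, traced = [], set()
    for n in nodes:
        best_pair, best_val = None, float("inf")
        for s, (fv, fl, _) in fwd[n].items():          # choose (source, sink) via Eq. (8)
            for k, (bv, bl, _) in bwd[n].items():
                if fl + bl == 0:
                    continue                            # isolated node, no real path
                val = (fv * fl + bv * bl) / (fl + bl)
                if val < best_val:
                    best_pair, best_val = (s, k), val
        if best_pair is None or (n, best_pair) in traced:
            continue                                    # n already lies on a traced path with this pair
        s, k = best_pair
        back, m = [], n                                 # walk predecessors back to the source
        while m is not None:
            back.append(m)
            m = fwd[m][s][2]
        ahead, m = [], bwd[n][k][2]                     # walk successors forward to the sink
        while m is not None:
            ahead.append(m)
            m = bwd[m][k][2]
        path = list(reversed(back)) + ahead
        paths.append((best_val, path))
        traced.update((q, best_pair) for q in path)     # these nodes need not trace this path again
    return paths
```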

After all nodes have been examined, we are left with the set of best paths passing through every single node in the DAG. An illustration of the output is shown in Fig. 5.

Fig. 5

The shortest paths through all nodes in the DAG combined represent our track graph structure. Our algorithm avoids naïvely computing the three shortest paths (given by the blue, green, and magenta bands) for every single node, and instead traces each shortest path only once. Nodes that lie on a shortest path and have the same source-sink pair trivially trace the same path

Algorithm 1: Algorithm to find the associations for the best routes to any node from all reachable sources. The best routes to all reachable sinks are determined by running the same algorithm with the nodes sorted in reverse order

Algorithm 2: Algorithm to find shortest paths via every node

3.4 Complexity Analysis

Let us assume, without loss of generality, that the average number of features in every timestep is n. For t timesteps, we would then have a total of tn nodes in the entire DAG. The number of edges between every pair of successive timesteps is bounded by n^2, so the total number of edges is bounded by tn^2. The naïve version of the algorithm is a combination of two simple Dijkstra runs from a given node to all reachable sources and sinks. Since shortest paths in a DAG can be computed in O(V + E) time for V vertices and E edges by processing the nodes in topological order, the runtime in the naïve case is O(tn + tn^2) = O(tn^2) per node. Hence, running the naïve algorithm for all nodes takes O(t^2 n^3) in the worst case.

Now for our improved algorithm, assuming the number of sources/sinks is given by p, we can safely say that p ≪ tn. The runtime of Algorithm 1 is then O(tnp + tn^2 p) = O(tn^2 p). For Algorithm 2, it is O(tnp^2 + t^2 n). The total runtime of our algorithm is thus O(tn^2 p + tnp^2 + t^2 n), which in practice (as seen in Table 1) is far less than O(t^2 n^3).

Table 1 Computation runtimes and memory requirements of our algorithm versus the naïve one for several data sets

The memory footprint of the naïve version is bounded by that of a single Dijkstra run, i.e., O(N) for N nodes. Thus, in our scenario, it is O(tn), since the shortest path via every node is computed independently. For the improved algorithm, however, we need to store the mappings of shortest paths from all incoming/outgoing edges to all reachable sources/sinks, and hence the memory footprint is O(tnp).

3.5 Filtering Similar Paths for Visualization

For visualization purposes, we need to choose the candidate paths that best represent a feature track in a given spatio-temporal region.

In most cases, due to slight perturbations in the DAG, two unique paths differ only at very few node positions, with most of their nodes being identical. An example of this can be observed in Fig. 5, where the blue and green paths show in essence the same structure with only a slight perturbation.

We aim to show the path with the best objective score, while other similar paths falling within a specified threshold are filtered out. The similarity g between two paths P_1 and P_2 is estimated using

g(P_1, P_2) = m / min(|P_1|, |P_2|)     (9)

where m represents the number of matching edges, i.e., edges present in both paths. The function g estimates the fraction of edges that are identical in both paths. The filtering using g is applied as follows. All paths obtained by solving Eq. (8) for every node are sorted according to their objective function score given by Eq. (3). Paths are then processed in this sorted order, from lowest score to highest. If a path exceeds the similarity threshold with respect to any path encountered before, it is filtered out. All other paths are retained.

If the filter rate is set to 100%, we are left with the complete set of unique paths. In our experiments, a filter rate of 70% gives the best results.
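Finally, a sketch of the filtering step, assuming each path is given as a node sequence together with its objective value; the similarity measure follows our reading of Eq. (9), normalizing the number of shared edges by the length of the shorter path, and is not the paper's implementation.

```python
def filter_similar(scored_paths, threshold=0.7):
    """Keep only the best-scoring representative among similar paths.

    scored_paths -- list of (objective value, node sequence); lower values are better
    threshold    -- similarity above which a path is discarded (0.7 = 70% filter rate)
    """
    def edges(path):
        return set(zip(path, path[1:]))                 # consecutive node pairs

    def similarity(p, q):                               # cf. Eq. (9)
        e_p, e_q = edges(p), edges(q)
        if not e_p or not e_q:
            return 0.0
        return len(e_p & e_q) / min(len(e_p), len(e_q))

    kept = []
    for value, path in sorted(scored_paths, key=lambda x: x[0]):
        if all(similarity(path, q) <= threshold for _, q in kept):
            kept.append((value, path))                  # not too similar to any better path
    return kept
```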

4 Results

The timing and memory consumption for our method are given in Table 1. Regarding the computation times, note how our algorithm improves over the naïve version by up to two orders of magnitude. Regarding the memory consumption, the naïve method has lower memory usage as it only processes one node at a time, while our algorithm processes all nodes together. Hence, considering the number of nodes in each data set, our algorithm is quite efficient with regard to memory usage as well.

Figure 6 shows a rotating and translating benzene data set. Since the data is not truly time-dependent, but merely transformed rigidly, it serves as a test case to show that we capture all expected tracks and that our method is invariant under rotations and translations.

Fig. 6

Our method applied to the Benzene dataset. The paths indicate tracking of centers of mass of the regions signified by nodes in our DAG. (a) Paths of all lengths at filter rate 100%. (b) Paths of length 100 and above at filter rate 100%. (c) Paths of length 100 and above at filter rate 70%

Figure 7 shows the 2D time-dependent Streak Line Curvature dataset.

Fig. 7

(a) The 2D Streak Line Curvature dataset at filter rate 100% and showing paths of all lengths. (b) At 100% filter rate and full length paths only. (c) At 70% filter rate and full length paths only

Figure 8 shows the tracks for the smallest super/sub level set regions in a 2D Checkerboard dataset. The checkerboard pattern starts off smoothly and becomes increasingly noisy with time.

Fig. 8

Rotating 2D Checkerboard dataset. Tracks for the centers of mass of the smallest super/sub level sets are shown. (a) Filter rate 100% and paths of all lengths. (b) Filter rate 100% and long (100 length or more) paths only. (c) Filter rate 70% and long paths only. (d) Filter rate 70% with long paths obtained from the naive version of the algorithm

Figure 9 shows all the tracks in a flow around a 3D Square Cylinder. The location of the center of mass of a region is used to visualize the paths in all result images.

Fig. 9

Flow around a Square Cylinder dataset. (a) Paths of all lengths extracted at filter rate of 100%. (b) Paths of all lengths at filter rate 90%. (c) Paths of all lengths at filter rate 70%

5 Conclusion

We presented an extension of the method in [12], which extracts the best track through a chosen region at a given timestep of a time-dependent scalar field. These regions are based on topological segmentations of the spatial domain using merge trees and form the nodes of a directed acyclic graph (DAG) in the spatio-temporal domain. Using the method in [12] to extract the best tracks through all nodes naïvely results in tracing the same paths multiple times. The algorithm presented in this paper exploits the structure of the DAG to iteratively compute the best paths from all reachable sources and sinks to every node. This in turn allows us to compute the best paths through all nodes up to two orders of magnitude faster than the naïve approach. We also presented a filtering scheme that removes very similar paths when visualizing all paths together. Future work may include clustering these paths according to their similarity using temporal similarity estimation techniques such as dynamic time warping.