1 Introduction

Estimating dense optical flow is a fundamental problem in computer vision. Despite significant progress over the last decades, realistic scenes with displacements of several hundred pixels, strong changes in illumination and textureless or specular regions remain challenging to date [9, 17]. Traditionally, dense optical flow estimation has been formulated as a continuous optimization problem [20, 25], and many of today’s most successful methods leverage elaborate variants of the original formulation, allowing for more robust penalties or improving optimization. As continuous methods typically require linearizing the highly non-convex data term, they only permit the estimation of very small displacements up to a few pixels. Thus, in order to handle large displacements in real-world videos, a simple heuristic is often employed: Optical flow is estimated in a coarse-to-fine manner, thereby guaranteeing an upper bound to the maximal displacement at each level of the image pyramid. Unfortunately, this strategy is highly susceptible to local minima as small structures and textural details vanish at coarse image resolutions, leading to oversmoothing artifacts in the estimated flow field.

In contrast to optical flow, the most successful approaches to stereo matching typically rely on discrete inference in graphical models. While such models are loopy by nature and thus lead to NP-hard optimization problems, good approximate solutions can often be efficiently computed using graph cuts, belief propagation or mean field approximations. Importantly, no image pyramids are required as the full data cost volume is considered at the same time during inference. Unfortunately, the application of discrete methods to the problem of optical flow is not straightforward and hence there exists only relatively little work in this direction. The main reason for this is the huge size of the label space which needs to be considered for the 2D large-displacement optical flow problem as opposed to the 1D stereo problem.

Fig. 1. Strategies for Efficient Discrete Optical Flow. Left: We create a large set of diverse flow proposals per pixel (red node) by combining nearest neighbors in feature space from a set of grid cells (green nodes) with winner-takes-all solutions from neighboring pixels (blue nodes). The red square indicates the search region. Middle: We apply block coordinate descent, iteratively optimizing all image rows and columns (red) conditioned on neighboring blocks (white) via dynamic programming. Right: Taking advantage of robust penalties, we reduce pairwise computation costs by pre-computing the set of non-truncated (\(<\tau _\psi \)) neighboring flow proposals (black) for each flow vector (red) (Color figure online).

In this paper, we propose three different strategies to make discrete inference in a pairwise conditional random field (CRF) applicable to the estimation of optical flow, see Fig. 1 for an illustration. First, we restrict the label set by considering only the L most likely matches per pixel which we obtain via approximate nearest neighbor search in feature space subject to non-maxima suppression constraints. To validate this restriction, we experimentally show that the oracle solution of the restricted set outperforms all existing optical flow techniques by a significant margin. Second, our inference scheme takes advantage of efficient convergent block coordinate descent (BCD) and iteratively updates all image rows and columns conditioned on the remaining variables via dynamic programming. Third, we exploit the special form of the pairwise potentials used by our formulation to further decrease computational complexity, thereby making very large unordered label sets with hundreds of labels tractable. Upon convergence, we remove outliers (e.g., in occluded regions) using strategies borrowed from the stereo literature. Finally, we regress a real-valued dense flow field from our semi-dense integer flow estimate using variational techniques [32, 42]. We experimentally validate the proposed method on two challenging benchmarks: MPI Sintel [9] and KITTI [17]. Our experiments show that the proposed method attains state-of-the-art performance. Importantly, our results indicate that discrete optimization can be a powerful tool for solving optical flow problems, even when considering pairwise flow priors only. Our code and supplementary material are available from our project page: http://www.cvlibs.net/projects/discrete_flow.

2 Related Work

Global estimation of optical flow has traditionally been formulated as a continuous variational optimization problem with linearized data terms [4, 20] and many of the most successful works still follow the same paradigm to date [8, 12, 30, 34, 35, 37, 40, 43]. To estimate displacements larger than 1 pixel, as required by modern benchmarks [1, 9, 17] and for many applications, continuous methods typically rely on image pyramids [7]. Unfortunately, this heuristic is prone to local minima as texture details and fine structures vanish at small scales. As a consequence, methods leveraging sparse feature correspondences to guide the optimization process have been popularized [5, 6, 32, 38, 42], incorporating feature matches into initialization or as an additional data term. Our approach shares similarity with these methods in the sense that we also refine our integer-valued flow result to sub-pixel accuracy as proposed in [32, 42]. However, in contrast to the above-mentioned works, our integral matches are obtained via discrete optimization with optical flow priors. This allows our algorithm to establish denser correspondences than possible with independently matched sparse features and, in combination with sub-pixel refinement, leads to better results.

Alternatively, optical flow can be viewed as a discrete optimization problem. In the absence of regularization constraints, this corresponds to a complete search of the discretized 2D flow space which has to be carried out for each pixel [2, 47]. While local support windows [33] alleviate the effect of border bleeding, they generally cannot compete with methods that leverage global regularization. Incorporating smoothness constraints, however, is much more difficult for discrete optical flow than for the related stereo matching problem due to the extremely large label space of the discretized 2D flow field. To avoid this difficulty, a number of approaches formulate optical flow as a segmentation problem: Based on a small set of dense flow field proposals, the most likely flow field at each pixel subject to regularization constraints is approximated via quadratic pseudo-boolean optimization (QPBO) or belief propagation (BP) [11, 23, 44, 45, 48]. A reduced label set for all pixels is proposed by [28] restricting the method to scenes with little and non-complex motion. In contrast, here we pursue a more direct approach to discrete optical flow which does not rely on proposed flow fields or a global label set but allows for an arbitrary set of flow proposals per pixel. Compared to the approaches of Steinbrücker et al. [36] and Liu et al. [24], we neither require a complete search nor image pyramids [24]. Our approach is also related to particle-based methods such as particle BP [46] or PatchMatch BP [3, 21]. In contrast to those methods, our algorithm is not restricted to a relatively small set of particles or local resampling which can lead to local minima. Instead, we maintain diverse distributions over much larger state spaces throughout the whole inference process.

Very recently, object recognition and feature learning have been exploited as a powerful source of information for optical flow. Notable examples are the data-driven flow transfer method of Wei et al. [41] and FlowNets based on deep neural networks by Fischer et al. [15]. In this paper our focus lies on a more generic model without the need for large annotated training sets. However, these ideas could be easily incorporated into the proposed model, e.g., via additional unary terms, promising further gains in performance in the future.

3 Model

Let us assume two input images \(\mathbf {I}\) and \(\mathbf {I}'\) of size \(W\times H\). Our goal is to assign as many pixels as possible from the reference image \(\mathbf {I}\) to pixels in the target image \(\mathbf {I}'\). In other words, we aim for a semi-dense integer-valued flow field, establishing correspondences which are visible in both views. Flow vectors are selected from a diverse set of proposals (Sect. 3.1) based on a CRF model (Sect. 3.2) which is efficiently optimized (Sect. 3.3). Interpolation, extrapolation and sub-pixel refinement are then addressed in a subsequent post-processing step (Sect. 3.4).

3.1 Diverse Flow Proposals

Unfortunately, naïve pixel-wise discretization of all possible flow vectors would lead to an intractably large set of labels at each pixel. Consider for instance a flow range of \(\pm 250\) pixels in u- and v-direction. While this range is sufficiently large for the datasets tackled in this paper (see supplementary material), it leads to more than 250,000 labels per pixel (\((2\cdot 250+1)^2 = 251{,}001\)) for which even the data term alone would be challenging to calculate and to store. In this section, we therefore propose a mechanism which extracts a subset of L flow proposals (\(L=500\) in our experiments) while maintaining a high recall of ground truth optical flow.

Towards this goal, we partition the target image into cells of equal size as illustrated in Fig. 1 (left). For each cell, we apply the randomized k-d tree algorithm [29] on the feature descriptors of the pixels falling into that cell, yielding an efficient approximate search structure. Next, for each pixel in the reference image (red node in Fig. 1), we find all relevant cells in the target image according to the desired optical flow range (red shaded cells in Fig. 1) and concatenate all M flow vectors corresponding to the K nearest neighbor matches from the k-d tree of each cell (green nodes in Fig. 1). In contrast to a single search structure constructed for the entire target image, this strategy has two advantages: First, arbitrary optical flow search ranges can be implemented with little computational overhead by searching only in relevant cells. Second, the retrieved flow vectors respect an approximate non-maximal-suppression constraint, i.e., we obtain exactly K flow vectors per cell.

As neighboring pixels often exhibit similar optical flow, we additionally sample N random pixels (blue nodes in Fig. 1) from a local Gaussian distribution centered at the reference pixel and add the respective winner-takes-all solution (i.e., the flow vector corresponding to the best match at the sampled pixel) to the proposal set of the current pixel. In case it is already present in the proposal set, we proceed with the next best flow vector, ensuring that the final set of \(L=M+N\) flow vectors is unique. In our experiments we use \(M=300\) nearest neighbors and \(N=200\) proposals from neighboring pixels.
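As an illustration, the per-cell proposal search can be sketched in Python. This is not the authors' implementation: the descriptor layout, the coarse cell-overlap test and the use of SciPy's exact cKDTree (standing in for the randomized k-d trees of [29]) are assumptions of the sketch, and the winner-takes-all sampling of the N additional proposals is omitted.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_cell_trees(desc, cell):
    """Partition the target-image descriptors into square cells and build
    one search tree per cell (an exact cKDTree stands in for the
    randomized k-d trees used in the paper)."""
    H, W, D = desc.shape
    trees = {}
    for cy in range(0, H, cell):
        for cx in range(0, W, cell):
            ys, xs = np.mgrid[cy:min(cy + cell, H), cx:min(cx + cell, W)]
            pos = np.stack([xs.ravel(), ys.ravel()], axis=1)  # (x, y) per point
            feat = desc[ys.ravel(), xs.ravel()]
            trees[(cy // cell, cx // cell)] = (cKDTree(feat), pos)
    return trees

def proposals_for_pixel(p, d_p, trees, cell, search_range, K):
    """Collect the K nearest-neighbor flow vectors from every cell that
    (coarsely) overlaps the search window around reference pixel p."""
    x, y = p
    flows = []
    for (ry, rx), (tree, pos) in trees.items():
        cx, cy = rx * cell, ry * cell
        # coarse overlap test between the cell and the square search window
        if abs(cx + cell / 2 - x) > search_range + cell or \
           abs(cy + cell / 2 - y) > search_range + cell:
            continue
        _, idx = tree.query(d_p, k=min(K, tree.n))
        for i in np.atleast_1d(idx):
            qx, qy = pos[i]
            flows.append((qx - x, qy - y))  # flow = target pos - reference pos
    return flows
```

A flow proposal is simply the displacement from the reference pixel to a retrieved target pixel; to arrive at the final set of L unique proposals, duplicates would still have to be filtered as described above.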

3.2 Random Field Model

We associate a discrete label \(l_\mathbf {p}\in \{1,\dots ,L\}\) with each pixel \(\mathbf {p}=(x,y)\) in the reference image, corresponding to the (unique) integer-valued proposal flow vectors \(\mathbf {f}_\mathbf {p}(l_\mathbf {p})\in \mathbb {Z}^2\) described in Sect. 3.1. We consider optical flow estimation as MAP inference in a pairwise CRF. More specifically, we aim at minimizing

$$\begin{aligned} E(\mathbf {l}) = \lambda \sum _{\mathbf {p}\in \mathcal {P}} \underbrace{\varphi _\mathbf {p}(l_\mathbf {p})}_{\text {data}} \,+\, \sum _{\mathbf {p}\sim \mathbf {q}} \underbrace{\psi _{\mathbf {p},\mathbf {q}}(l_\mathbf {p},l_\mathbf {q})}_{\text {smoothness}} \end{aligned}$$
(1)

with respect to the set of image labels \(\mathbf {l}=\{l_\mathbf {p}|\mathbf {p}\in \mathcal {P}\}\) where \(\mathcal {P}\) denotes the set of pixels in image \(\mathbf {I}\). Here, we have dropped the dependency on the input images \(\mathbf {I}\) and \(\mathbf {I}'\) for clarity and \(\sim \) denotes all neighbors on a 4-connected image grid. The relative weight between the data term (measuring data fidelity) and the smoothness term (encouraging smooth flow fields) is defined by \(\lambda \).

The data term \(\varphi _\mathbf {p}(l_\mathbf {p})\) encodes the cost at pixel \(\mathbf {p}\) given label \(l_\mathbf {p}\). We model it as the truncated \(\ell _1\)-penalty evaluating the difference between feature descriptors

$$\begin{aligned} \varphi _\mathbf {p}(l_\mathbf {p}) = \min \left( {\Vert \mathbf {d}_\mathbf {p}- \mathbf {d}'_\mathbf {p}(l_\mathbf {p}) \Vert }_1,\tau _\varphi \right) \end{aligned}$$
(2)

where \(\mathbf {d}_\mathbf {p}\in \mathbb {R}^D\) denotes the feature descriptor at pixel \(\mathbf {p}\) in the reference image, \(\mathbf {d}'_\mathbf {p}(l_\mathbf {p})\in \mathbb {R}^D\) denotes the feature descriptor associated with pixel \(\mathbf {p}\) and label \(l_\mathbf {p}\) in the target image, and \(\tau _\varphi \) is the truncation threshold of the data term.

The second term in Eq. 1 encourages smooth flow fields and is modeled as the weighted truncated \(\ell _1\)-distance of neighboring optical flow vectors

$$\begin{aligned} \psi _{\mathbf {p},\mathbf {q}}(l_\mathbf {p},l_\mathbf {q}) = w_{\mathbf {p},\mathbf {q}} ~ \min \left( {\Vert \mathbf {f}_\mathbf {p}(l_\mathbf {p}) - \mathbf {f}_\mathbf {q}(l_\mathbf {q}) \Vert }_1,\tau _\psi \right) \end{aligned}$$
(3)

where \(\mathbf {f}_\mathbf {p}(l_\mathbf {p})\in \mathbb {R}^2\) denotes the flow vector at pixel \(\mathbf {p}\) associated with label \(l_\mathbf {p}\) and \(\tau _\psi \) is the truncation threshold of the smoothness term. We remark that a truncated penalty is not only robust against outliers and thus preferable from a statistical point of view [34], but it is also critical for tractable inference in our model as described in Sect. 3.3. Moreover, note that Eq. 3 is a generalization of the pairwise potential proposed for stereo matching by Hirschmüller [19] to arbitrary truncation thresholds \(\tau _\psi \). As we expect flow boundaries to coincide with image edges, we additionally weight the smoothness term by a weight factor \(w_{\mathbf {p},\mathbf {q}}=\exp \left( -\alpha \,\kappa _{\mathbf {p},\mathbf {q}}^2\right) \). Here, \(\kappa _{\mathbf {p},\mathbf {q}} \in \left[ 0,1\right] \) measures the strength of the edge between pixel \(\mathbf {p}\) and pixel \(\mathbf {q}\). We calculate \(\kappa _{\mathbf {p},\mathbf {q}}\) using structured edge detection [13].
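To make the indexing conventions concrete, the energy of Eqs. 1-3 can be written down directly. The sketch below is not the authors' implementation; the parameter values are illustrative placeholders, and descriptors and flows are assumed to be stored per pixel and label.

```python
import numpy as np

# illustrative parameter values, not the tuned values from the paper
TAU_PHI, TAU_PSI, LAM, ALPHA = 40.0, 15.0, 0.02, 5.0

def unary(d_ref, d_tgt):
    """Eq. 2: truncated L1 distance between feature descriptors."""
    return min(np.abs(d_ref - d_tgt).sum(), TAU_PHI)

def pairwise(f_p, f_q, kappa):
    """Eq. 3: edge-weighted truncated L1 distance between flow vectors."""
    w = np.exp(-ALPHA * kappa ** 2)
    return w * min(np.abs(np.asarray(f_p) - np.asarray(f_q)).sum(), TAU_PSI)

def energy(labels, flows, d_ref, d_tgt, kappa_h, kappa_v):
    """Eq. 1 on a 4-connected grid. labels[y, x] indexes into
    flows[y, x, l]; d_tgt[y, x, l] is the matched target descriptor."""
    H, W = labels.shape
    E = 0.0
    for y in range(H):
        for x in range(W):
            l = labels[y, x]
            E += LAM * unary(d_ref[y, x], d_tgt[y, x, l])
            if x + 1 < W:  # horizontal neighbor
                E += pairwise(flows[y, x, l], flows[y, x + 1, labels[y, x + 1]],
                              kappa_h[y, x])
            if y + 1 < H:  # vertical neighbor
                E += pairwise(flows[y, x, l], flows[y + 1, x, labels[y + 1, x]],
                              kappa_v[y, x])
    return E
```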

3.3 Inference

Despite the simplicity of the model described in Sect. 3.2 and the label set of size \(L=500\), inference using max-product loopy BP is still prohibitively slow. To enable efficient inference, we therefore exploit two additional strategies. First, instead of loopy belief propagation, we perform block coordinate descent (BCD). More specifically, we follow [10] and iteratively update alternating image rows and columns conditioned on the MAP solution of the remaining variables as illustrated in Fig. 1 (middle). Second, we exploit the truncated form of the pairwise potentials in Eq. 3. This is illustrated in Fig. 1 (right). In combination, both steps reduce computational complexity by about four orders of magnitude.

Without loss of generality, consider the optimization of image row y. The naïve dynamic programming algorithm recursively fills the cumulative cost matrix \(\mathbf {C}\) for each x from 1 to W using the following update equation:

$$\begin{aligned} \mathbf {C}(x,l) =\;&\lambda \, \varphi _{(x,y)}(l) + \psi _{(x,y),(x,y-1)}(l,l^*_{x,y-1}) + \psi _{(x,y),(x,y+1)}(l,l^*_{x,y+1}) \nonumber \\ &+ \min _{0\le k<L} \left( \psi _{(x,y),(x-1,y)}(l,k) + \mathbf {C}(x-1,k) \right) \end{aligned}$$
(4)

Here, \(l^*_\mathbf {p}\) denotes the assignment of the fixed variables, i.e., the variables outside row y. While the global problem is NP-hard and can be solved only approximately, each sub-problem corresponds to a chain MRF for which we obtain an optimal solution via dynamic programming (backtracking). This leads to the desirable property that this algorithm (unlike loopy BP) is guaranteed to converge.
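For a single row, the recursion of Eq. 4 is standard chain dynamic programming with backtracking. A minimal sketch, assuming the conditioning terms from the fixed rows above and below have already been folded into the unary costs, and using the naive \(O(L^2)\) transition:

```python
import numpy as np

def optimize_row(unary, pairwise):
    """Exact MAP for one chain of T pixels with L labels each.
    unary:    (T, L) array of lambda*phi costs (conditioning terms included)
    pairwise: (T-1, L, L) array, pairwise[t, k, l] = psi between
              label k at pixel t and label l at pixel t+1
    Returns the optimal label per pixel via backtracking."""
    T, L = unary.shape
    C = np.empty((T, L))
    back = np.zeros((T, L), dtype=int)
    C[0] = unary[0]
    for t in range(1, T):
        # naive O(L^2) transition: minimize over predecessor labels k
        trans = C[t - 1][:, None] + pairwise[t - 1]  # (L, L)
        back[t] = trans.argmin(axis=0)
        C[t] = unary[t] + trans.min(axis=0)
    labels = np.empty(T, dtype=int)
    labels[-1] = C[-1].argmin()
    for t in range(T - 2, -1, -1):  # backtracking pass
        labels[t] = back[t + 1, labels[t + 1]]
    return labels
```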

In case of ordered label sets (e.g., in stereo matching [14] or depth reconstruction [10]) the efficient distance transform [14] can be employed to lower the complexity of this algorithm from \(O(WL^2)\) to O(WL) by calculating the expression in the last row of Eq. 4 in linear time. Unfortunately, in our case the flow vectors involved in \(\psi _{(x,y),(x-1,y)}(l,k)\) are sparse and unordered, therefore prohibiting the application of this trick. However, we are still able to utilize the fact that our pairwise potentials \(\psi \) are truncated at \(\tau _\psi \): First, note that for practical truncation thresholds (i.e., \(\tau _\psi <15\)), the majority of the \(L^2\) pairwise terms evaluates to \(\tau _\psi \). We exploit this observation by partitioning the labels of each neighboring pixel pair \((\mathbf {p},\mathbf {q})\) into sets

$$\begin{aligned} \mathcal {K}_{\mathbf {p},\mathbf {q},l} = \{k\in \{1,\dots ,L\}\,|\,{\Vert \mathbf {f}_\mathbf {p}(l) - \mathbf {f}_\mathbf {q}(k) \Vert }_1<\tau _\psi \} \end{aligned}$$
(5)

which contain all labels k at pixel \(\mathbf {q}\) for which the flow \(\mathbf {f}_\mathbf {q}(k)\) is within \(\tau _\psi \) from the flow \(\mathbf {f}_\mathbf {p}(l)\) associated with label l at pixel \(\mathbf {p}\). Figure 1 (right) illustrates \(\mathcal {K}_{\mathbf {p},\mathbf {q},l}\) for a single flow vector at pixel \(\mathbf {p}\) (shown in red) and a set of flow vectors at pixel \(\mathbf {q}\) (black+gray). Given this definition, the last term in Eq. 4 can be written as

$$\begin{aligned} \min \left( \min _{k \in \mathcal {K}_{(x,y),(x-1,y),l}}~ \left( \psi _{(x,y),(x-1,y)}(l,k) + \mathbf {C}(x-1,k) \right) \!,c \right) \end{aligned}$$
(6)

where the constant c is given by

$$\begin{aligned} c = \min _{0\le k <L} \left( w_{(x,y),(x-1,y)} \tau _\psi + \mathbf {C}(x-1,k) \right) \end{aligned}$$
(7)

As c does not depend on l and can be calculated in O(L), Eq. 6 can be evaluated (for all l) in \(O(\sum _l|\mathcal {K}_{\mathbf {p},\mathbf {q},l}|)\) instead of \(O(L^2)\), where \(|\mathcal {K}_{\mathbf {p},\mathbf {q},l}| \ll L\) in practice due to the diversity of the proposal set as evidenced by our experimental evaluation in Sect. 4. It is important to note that the sets \(\mathcal {K}_{\mathbf {p},\mathbf {q},l}\) can be pre-computed and reused during all BCD iterations. In our implementation, we further accelerate this pre-computation step using hash maps for efficiently retrieving flow vectors.
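The construction of the sets \(\mathcal {K}_{\mathbf {p},\mathbf {q},l}\) and the reduced minimization of Eqs. 6 and 7 can be sketched as follows. The hash-map variant assumes integer flow vectors, so that the \(\ell _1\)-ball around each proposal can be enumerated exactly; this is an illustrative choice, not necessarily the authors' data structure.

```python
import numpy as np

def precompute_neighbour_sets(flows_p, flows_q, tau_psi):
    """Eq. 5: for every label l at p, the labels k at q whose flow lies
    within tau_psi (strict L1) of f_p(l), via a hash map over flows."""
    lookup = {}
    for k, f in enumerate(map(tuple, flows_q)):
        lookup.setdefault(f, []).append(k)
    # integer flows: ||.||_1 < tau_psi  <=>  ||.||_1 <= r
    r = int(np.ceil(tau_psi)) - 1
    sets = []
    for f in map(tuple, flows_p):
        K = []
        for du in range(-r, r + 1):  # enumerate the L1 ball around f
            for dv in range(-(r - abs(du)), r - abs(du) + 1):
                K.extend(lookup.get((f[0] + du, f[1] + dv), []))
        sets.append(K)
    return sets

def fast_transition(C_prev, flows_p, flows_q, sets, w, tau_psi):
    """Eqs. 6-7: minimum over predecessor labels, evaluating the pairwise
    term only for the non-truncated labels in `sets`."""
    c = w * tau_psi + C_prev.min()          # Eq. 7: truncated fallback
    out = np.full(len(flows_p), c)
    for l, K in enumerate(sets):
        for k in K:                         # Eq. 6: non-truncated pairs only
            psi = w * np.abs(flows_p[l] - flows_q[k]).sum()
            out[l] = min(out[l], psi + C_prev[k])
    return out
```

Since the truncated value is minimized once for all labels via the constant c, only the few non-truncated pairs contribute to the inner loop.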

3.4 Postprocessing

The inference algorithm described above solves the correspondence problem, i.e., it assigns each pixel in the reference image to a pixel in the target image. As we do not model occlusions explicitly, occluded pixels and pixels leaving the image domain are also assigned a label. We therefore remove outliers from our result, borrowing ideas from the stereo matching literature [19]. More specifically, we perform forward-backward consistency checking, i.e., we calculate the optical flow forwards and backwards in time and retain only flow vectors which are consistent. In addition, we remove small isolated segments, which often correspond to wrong optical flow estimates, using connected component labeling with a minimum segment size of 100 px and a flow consistency threshold of 10 px. Finally, in order to obtain sub-pixel flow values and to inter-/extrapolate into unmatched regions, we refine our results using the state-of-the-art flow refinement techniques of DeepFlow [42] and EpicFlow [32].
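A minimal sketch of the forward-backward consistency check, assuming integer flow fields; the round-trip threshold is an illustrative choice, and segment filtering and sub-pixel refinement are omitted.

```python
import numpy as np

def consistency_check(flow_fw, flow_bw, thresh=1.0):
    """Keep a forward flow vector only if the backward flow at the pixel
    it maps to points (approximately) back to the source pixel.
    flow_fw, flow_bw: (H, W, 2) integer flow fields storing (u, v).
    Returns a boolean validity mask."""
    H, W, _ = flow_fw.shape
    valid = np.zeros((H, W), dtype=bool)
    for y in range(H):
        for x in range(W):
            u, v = flow_fw[y, x]
            xt, yt = x + int(u), y + int(v)
            if 0 <= xt < W and 0 <= yt < H:  # mapped point inside the image
                ub, vb = flow_bw[yt, xt]
                # the round trip should return (close) to the start
                valid[y, x] = abs(u + ub) + abs(v + vb) <= thresh
    return valid
```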

Table 1. Pilot Study on MPI Sintel and KITTI Training Sets. See text.

4 Experimental Evaluation

We evaluate our approach on the challenging MPI Sintel [9] and KITTI [17] datasets. Our evaluation starts with a pilot study where we experimentally validate the claims from the introduction, followed by a quantitative and qualitative evaluation of our model. The parameters in our model are set via block coordinate descent on the respective training set (see supplementary material). To further increase efficiency we use a stride of 4 pixels. For subpixel interpolation of our results, we leverage the interpolation stages of DeepFlow (“Ours+DeepFlow”) and EpicFlow (“Ours+EpicFlow”). On average, our MATLAB/C++ implementation requires about 3 min for one 0.5 megapixel color image pair from the KITTI training set (10 % descriptor extraction, 40 % proposal generation, 35 % BCD, 15 % postprocessing and overhead).

Fig. 2. Frequency of Neighboring Flow Proposals w.r.t. Distance. This figure shows the frequency of neighboring flow proposal vectors with respect to their endpoint distance on MPI Sintel (a) and KITTI (b) in red. The blue plots depict the cumulative frequency corresponding to \(|\mathcal {K}_{\mathbf {p},\mathbf {q},l}|/L\) over the distance threshold \(\tau _\psi \), respectively (Color figure online).

Table 2. Evaluation on MPI Sintel Test Set. This table lists the top 10 out of 44 ranked methods on the MPI Sintel flow benchmark in terms of endpoint error (EPE) in all, matched (Noc) and unmatched (Occ) regions. The results on the clean images are shown in (a). The results on the final set are shown in (b).

4.1 Pilot Study

Table 1 shows results on the non-occluded parts of the MPI Sintel and KITTI training sets in terms of endpoint error (EPE) and outliers (threshold: 3 px). We show (from top-to-bottom) the results of DeepFlow [42], EpicFlow [32], our results in combination with the refinement stages of DeepFlow [42] and EpicFlow [32], as well as an “Oracle” which refers to the flow map obtained by selecting the flow with the smallest EPE at each pixel from our proposal set. Note that our approach uses only the refinement stages of [32, 42] and not the DeepMatches (DM) which form the foundation for [32, 42]. As evidenced by this experiment, our method obtains state-of-the-art performance, outperforming [32, 42] in most of the error metrics. Furthermore, our proposal set contains flow vectors close to the ground truth for most of the pixels while being diverse enough to avoid local minima and reduce the computational complexity of the pairwise terms. Figure 2 shows histograms of the frequency of neighboring flow proposals with respect to the distance of their endpoints for \(L=500\). While the pairwise term in Eq. 3 can take \(L^2=250,000\) states in total, only \(2\,\%\) of them fall below the truncation value of 15 px and need to be evaluated in Eq. 6. Thus, pre-calculating the partitions \(\mathcal {K}_{\mathbf {p},\mathbf {q},l}\) saves almost two orders of magnitude in computation time.

Table 3. Evaluation on KITTI Test Set. This table lists the top 15 out of 57 ranked methods on the KITTI optical flow benchmark in terms of outliers and endpoint error (EPE) in non-occluded and all regions. For comparability, only pure optical flow methods are shown, excluding motion stereo methods and techniques which use stereo information or more than two frames as input.

4.2 Quantitative Results

We also compare our method to the state-of-the-art on the challenging held-out test data of MPI Sintel [9] and KITTI [17]. As requested by the evaluation protocol, we only submitted our strongest entry (“Ours+EpicFlow”) for evaluation on the test set. Tables 2 and 3 compare our results to the state-of-the-art in two-frame optical flow. On MPI Sintel we obtain rank 1 on the “clean” set and rank 2 on the “final” set out of 44 ranked methods in total. On KITTI our method ranks 6th (out of 57 submitted results) for an outlier threshold of 3 pixels and 1st when using a threshold of 5 pixels. Note that we have excluded motion stereo methods and methods using additional information from Table 3 for a fair comparison. For full results, we refer to the online leaderboards.

4.3 Qualitative Results

Figure 3 depicts qualitative results of our method on a subset of MPI Sintel (top four images) and KITTI (bottom four images). As we require ground truth for visualizing the error maps we utilize the training sets for this purpose. We show the input image (left), our results in combination with EpicFlow postprocessing (middle) and the color-coded error map with respect to the ground truth flow field (right). At the top of each error map we specify the percentage of outliers and the endpoint error (EPE). The color coding visualizes outliers (\(>3\) px EPE) in red and inliers (\(<3\) px EPE) in blue on a logarithmic color scale. Pixels without ground truth value are shown in black. Note how our method is able to capture fine details in the optical flow. The last result of each dataset shows a failure case. For MPI Sintel, errors can be mostly attributed to homogeneous regions, occlusions and large changes in motion blur. On KITTI, we identified large perspective distortions, illumination artifacts and saturated or reflective regions as the primary sources of error.

Fig. 3. Qualitative Results on MPI Sintel and KITTI Training Sets. From left-to-right: Input image, “Ours+EpicFlow”, inliers (blue)/outliers (red) (Color figure online).

5 Conclusions

We presented a discrete solution to the estimation of optical flow by exploiting three strategies which limit computational and memory requirements. Our experiments show that discrete optimization can lead to state-of-the-art optical flow results using a relatively simple prior model. In the future, we plan to integrate richer optical flow priors into our approach. In particular, we aim at jointly reasoning about several image scales and at incorporating semantic information [18, 41] into optical flow estimation. Furthermore, we plan to extend our approach to the estimation of 3D scene flow [26, 27].