Improved Rooftop Detection in Aerial Images with Machine Learning

Maloof, M.A.; Langley, P.; Binford, T.O.; Nevatia, R.; Sage, S.

doi:10.1023/A:1025623527461

Published: October 2003

Volume 53, pages 157–191, (2003)
Journal: Machine Learning

M.A. Maloof¹, P. Langley², T.O. Binford³, R. Nevatia⁴ & S. Sage²

Abstract

In this paper, we examine the use of machine learning to improve a rooftop detection process, one step in a vision system that recognizes buildings in overhead imagery. We review the problem of analyzing aerial images and describe an existing system that detects buildings in such images. We briefly review four algorithms that we selected to improve rooftop detection. The data sets were highly skewed and the cost of mistakes differed between the classes, so we used ROC analysis to evaluate the methods under varying error costs. We report three experiments designed to illuminate facets of applying machine learning to the image analysis task. One investigated learning with all available images to determine the best performing method. Another focused on within-image learning, in which we derived training and testing data from the same image. A final experiment addressed between-image learning, in which training and testing sets came from different images. Results suggest that useful generalization occurred when training and testing on data derived from images differing in location and in aspect. They demonstrate that under most conditions, naive Bayes exceeded the accuracy of other methods and a handcrafted classifier, the solution currently used in the building detection system.
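The abstract's evaluation idea, ROC analysis under skewed classes and unequal error costs, can be illustrated with a small sketch. This is not the authors' code or data: the scores, labels, cost values, and the function names `roc_points`, `auc`, and `cheapest_point` are hypothetical, chosen only to show how one classifier's scores yield a curve, a summary area, and a cost-dependent operating point.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: List[float], labels: List[int]) -> List[Point]:
    """Sweep a threshold from high to low and record one ROC point per distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    # Highest scores first: lowering the threshold admits examples in this order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so a tie produces a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def cheapest_point(points: List[Point], pos: int, neg: int,
                   cost_fn: float, cost_fp: float) -> Point:
    """Return the ROC point with the lowest total cost for the given error costs."""
    def total_cost(point: Point) -> float:
        fpr, tpr = point
        missed = (1.0 - tpr) * pos   # false negatives at this threshold
        false_alarms = fpr * neg     # false positives at this threshold
        return cost_fn * missed + cost_fp * false_alarms
    return min(points, key=total_cost)


# Hypothetical, skewed toy data: 2 rooftops (label 1) among 8 candidates.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

curve = roc_points(scores, labels)
print("AUC:", auc(curve))
print("missed rooftops cost 10x:", cheapest_point(curve, 2, 6, cost_fn=10, cost_fp=1))
print("false alarms cost 10x:", cheapest_point(curve, 2, 6, cost_fn=1, cost_fp=10))
```

Changing the two cost arguments moves the preferred operating point along the same curve, which is why a single accuracy figure is a poor summary when classes are skewed and error costs differ.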

Author information

Authors and Affiliations

Department of Computer Science, Georgetown University, Washington, DC, 20057, USA
M.A. Maloof
Institute for the Study of Learning and Expertise, 2164 Staunton Court, Palo Alto, CA, 94306, USA
P. Langley & S. Sage
Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
T.O. Binford
Institute for Robotics and Intelligent Systems, School of Engineering, University of Southern California, Los Angeles, CA, 90089, USA
R. Nevatia

About this article

Cite this article

Maloof, M., Langley, P., Binford, T. et al. Improved Rooftop Detection in Aerial Images with Machine Learning. Machine Learning 53, 157–191 (2003). https://doi.org/10.1023/A:1025623527461

Issue Date: October 2003
DOI: https://doi.org/10.1023/A:1025623527461
