Parallel Computations for Evolutionary Induction

Evolutionary Decision Trees in Large-Scale Data Mining

Part of the book series: Studies in Big Data (SBD, volume 59)

Abstract

Top-down decision tree inducers are very fast, and even if the resulting decision structures are merely good rather than close to optimal, this is often sufficient for practitioners interested in solving specific problems. Implementations of the most popular greedy algorithms are available in every commercial data mining system, and they can be applied very easily without any deep awareness of the parameter settings or running details. Moreover, knowledge of the existence of alternative induction methods is still limited to a narrow group of researchers working on this topic.


Notes

  1. Some results on the distributed global induction of decision trees within a multi-population island model can be found in [4].

  2. The size of the table is equal to the number of objects in the learning dataset.

  3. Symmetric multiprocessing (SMP): a multiprocessor machine with shared main memory, controlled by a single operating system instance that treats all processors (cores) equally.
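
     A minimal sketch of how such a machine can be exploited with OpenMP, assuming a hypothetical evaluate() fitness function and a shared learning dataset (illustrative only, not the implementation described in this book):

        #include <vector>
        #include <omp.h>

        // Hypothetical individual: an encoded decision tree (details omitted).
        struct Individual { int id; };

        // Stub fitness function evaluating one tree on the shared learning dataset.
        static double evaluate(const Individual&, const std::vector<double>& dataset) {
            return static_cast<double>(dataset.size());   // placeholder score
        }

        // On an SMP machine all cores see the same main memory, so the dataset
        // is shared; only the loop over individuals is divided among the threads.
        void evaluate_population(const std::vector<Individual>& pop,
                                 const std::vector<double>& dataset,
                                 std::vector<double>& fitness) {
            #pragma omp parallel for schedule(dynamic)
            for (int i = 0; i < static_cast<int>(pop.size()); ++i)
                fitness[i] = evaluate(pop[i], dataset);
        }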

  4. Avoiding conditional statements reduces warp divergence.
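
     A small illustrative CUDA fragment (hypothetical kernels, not the ones used in this book) showing how an if/else executed by threads of the same warp can be replaced by branch-free arithmetic:

        // Branching version: threads of one warp may take different paths,
        // and the two paths are then executed serially (warp divergence).
        __global__ void count_branch(const float* attr, float threshold,
                                     int* left, int* right, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                if (attr[i] < threshold) atomicAdd(left, 1);
                else                     atomicAdd(right, 1);
            }
        }

        // Branch-free version: every thread executes the same instructions;
        // the comparison result (0 or 1) selects the counter arithmetically.
        __global__ void count_no_branch(const float* attr, float threshold,
                                        int* counters /* [0]=left, [1]=right */, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                int goes_right = attr[i] >= threshold;   // 0 or 1, no if/else
                atomicAdd(&counters[goes_right], 1);
            }
        }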

  5. It may be possible to simplify the second stage by temporarily memorizing the locations of the training instances in the tree after the first stage. Indeed, this concept is implemented for the GPU-based model tree induction.
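
     A hedged sketch of this idea (the data layout and all names are assumptions, not taken from the book): the first stage records, for every training instance, the leaf it reaches, so the second stage can process instances leaf by leaf without traversing the tree again:

        // Stage 1: each thread routes one training instance down the tree and
        // memorizes the index of the leaf it ends up in.
        __global__ void assign_leaves(const float* attrs,      // n_obj x n_attr, row-major
                                      const int*   split_attr, // per internal node
                                      const float* split_thr,  // per internal node
                                      const int*   left_child, // -1 marks a leaf
                                      const int*   right_child,
                                      int* leaf_of_instance,
                                      int n_obj, int n_attr) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n_obj) return;
            int node = 0;                                      // start at the root
            while (left_child[node] != -1) {                   // until a leaf is reached
                float v = attrs[i * n_attr + split_attr[node]];
                node = (v < split_thr[node]) ? left_child[node] : right_child[node];
            }
            leaf_of_instance[i] = node;
        }
        // Stage 2 can then reuse leaf_of_instance (e.g., to fit a model in each
        // leaf) instead of repeating the tree traversal.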

  6. Radix sort is considered one of the fastest sorting algorithms on the GPU [16]. A state-of-the-art GPU-based implementation is available in the CUB library [17].
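
     A minimal usage sketch of CUB's device-wide radix sort; the two-call pattern follows the CUB documentation, while sorting attribute values together with object indices is only an assumed use case:

        #include <cub/cub.cuh>

        // d_keys_*: attribute values (device memory), d_vals_*: object indices.
        void sort_by_attribute(const float* d_keys_in, float* d_keys_out,
                               const int* d_vals_in, int* d_vals_out, int num_items) {
            void*  d_temp = nullptr;
            size_t temp_bytes = 0;
            // First call (d_temp == nullptr) only computes the size of the
            // temporary storage required by the sort.
            cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes,
                                            d_keys_in, d_keys_out,
                                            d_vals_in, d_vals_out, num_items);
            cudaMalloc(&d_temp, temp_bytes);
            // Second call performs the actual key-value radix sort on the GPU.
            cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes,
                                            d_keys_in, d_keys_out,
                                            d_vals_in, d_vals_out, num_items);
            cudaFree(d_temp);
        }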

  7. For linear algebra computations in the CUDA environment, the highly optimized cuBLAS [18] and cuSOLVER [19] libraries are applied.
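
     As an illustration only (an assumed use case, not code from the book), cuBLAS can, for example, form the Gram matrix X^T X needed when a linear model is fitted in a leaf; cuSOLVER could then factor and solve the resulting system:

        #include <cublas_v2.h>

        // Computes XtX = X^T * X for an n_obj x n_attr matrix X stored
        // column-major in device memory (the cuBLAS convention).
        void gram_matrix(cublasHandle_t handle, const double* d_X,
                         double* d_XtX, int n_obj, int n_attr) {
            const double alpha = 1.0, beta = 0.0;
            // C (n_attr x n_attr) = op(A) * op(B) with op(A) = X^T and op(B) = X.
            cublasDgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                        n_attr, n_attr, n_obj,
                        &alpha, d_X, n_obj,
                        d_X, n_obj,
                        &beta, d_XtX, n_attr);
        }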

  8. The number of GPUs that can be installed in a single workstation is usually explicitly limited.

References

  1. Alba E (2005) Parallel metaheuristics: a new class of algorithms. Wiley, New York

  2. Alba E, Tomassini M (2002) IEEE Trans Evol Comput 6(5):443–462

  3. Gong Y, Chen W, Zhan Z, Zhang J, Li Y, Zhang Q, Li J (2015) Appl Soft Comput 34:286–300

  4. Kretowski M (2008) Obliczenia ewolucyjne w eksploracji danych. Globalna indukcja drzew decyzyjnych [Evolutionary computations in data mining: global induction of decision trees]. Wydawnictwo Politechniki Bialostockiej

  5. Alba E, Luque G, Nesmachnow S (2013) Int T Oper Res 20:1–48

  6. Tsutsui S, Collet P (2013) Massively parallel evolutionary computation on GPGPUs. Springer, Berlin

  7. Kretowski M, Grzes M (2007) Int J Data Wareh Min 3(4):68–82

  8. Kalles D, Papagelis A (2010) Soft Comput 14(9):973–993

  9. Kretowski M, Popczynski P (2008) Global induction of decision trees: from parallel implementation to distributed evolution. In: Proceedings of ICAISC’08. Lecture notes in artificial intelligence, vol 5097, pp 426–437

  10. Czajkowski M, Jurczuk K, Kretowski M (2015) A parallel approach for evolutionary induced decision trees. MPI+OpenMP implementation. In: Proceedings of ICAISC’15. Lecture notes in artificial intelligence, vol 9119, pp 340–349

  11. Czajkowski M, Jurczuk K, Kretowski M (2016) Hybrid parallelization of evolutionary model tree induction. In: Proceedings of ICAISC’16. Lecture notes in artificial intelligence, vol 9692, pp 370–379

  12. Kretowski M (2008) A memetic algorithm for global induction of decision trees. In: Proceedings of SOFSEM’08. Lecture notes in computer science, vol 4910, pp 531–540

  13. Dua D, Karra Taniskidou E (2017) UCI machine learning repository. Irvine, CA: University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml

  14. Jurczuk K, Czajkowski M, Kretowski M (2017) Soft Comput 21:7363–7379

  15. Jurczuk K, Czajkowski M, Kretowski M (2017) GPU-accelerated evolutionary induction of regression trees. In: Proceedings of TPNC'17. Lecture notes in computer science, vol 10687, pp 87–99

  16. Singh D, Joshi I, Choudhary J (2018) Int J Parallel Program 46(6):1017–1034

  17. Merrill D (2018) CUB v1.8.0: a library of warp-wide, block-wide, and device-wide GPU parallel primitives. NVIDIA Research. http://nvlabs.github.io/cub/

  18. NVIDIA (2018) cuBLAS, NVIDIA developer zone, CUDA toolkit documentation. https://docs.nvidia.com/cuda/cublas/

  19. NVIDIA (2018) cuSOLVER, NVIDIA developer zone, CUDA toolkit documentation. https://docs.nvidia.com/cuda/cusolver/

  20. NVIDIA (2018) CUDA C programming guide. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf

  21. Jurczuk K, Reska D, Kretowski M (2018) What are the limits of evolutionary induction of decision trees? In: Proceedings of PPSN XV. Lecture notes in computer science, vol 11102, pp 461–473

  22. Czajkowski M, Kretowski M (2014) Inform Sci 288:153–173

  23. Reska D, Jurczuk K, Kretowski M (2018) Evolutionary induction of classification trees on Spark. In: Proceedings of ICAISC’18. Lecture notes in artificial intelligence, vol 10841, pp 514–523

  24. The Apache Software Foundation (2019) Apache Spark - lightning-fast cluster computing. http://spark.apache.org/

  25. Meng X et al (2016) J Mach Learn Res 17(1):1235–1241

Author information

Correspondence to Marek Kretowski.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kretowski, M. (2019). Parallel Computations for Evolutionary Induction. In: Evolutionary Decision Trees in Large-Scale Data Mining. Studies in Big Data, vol 59. Springer, Cham. https://doi.org/10.1007/978-3-030-21851-5_8
