Parallel Computations for Evolutionary Induction

Evolutionary Decision Trees in Large-Scale Data Mining

Part of the book series: Studies in Big Data (SBD, volume 59)

Abstract

Top-down decision tree inducers are very fast, and even if the resulting decision structures are merely good rather than close to optimal, this is often sufficient for practitioners interested in solving specific problems. Implementations of the most popular greedy algorithms are available in every commercial data mining system, and they can be applied very easily without any deep awareness of the parameter settings or running details. Moreover, knowledge of the existence of alternative induction methods is still limited to a narrow group of researchers working on this topic.


Notes

  1. Some results on the distributed global induction of decision trees within a multi-population island model can be found in [4].

  2. The size of the table is equal to the number of objects in the learning dataset.

  3. Symmetric multiprocessing (SMP): a multiprocessor machine with shared main memory, controlled by a single operating system instance that treats all processors (cores) equally.
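
     A minimal sketch of how such a machine can be exploited with OpenMP, assuming a hypothetical evaluate() fitness function and a shared learning dataset (illustrative only, not the implementation described in this book):

        #include <vector>
        #include <omp.h>

        // Hypothetical individual: an encoded decision tree (details omitted).
        struct Individual { int id; };

        // Stub fitness function evaluating one tree on the shared learning dataset.
        static double evaluate(const Individual&, const std::vector<double>& dataset) {
            return static_cast<double>(dataset.size());   // placeholder score
        }

        // On an SMP machine all cores see the same main memory, so the dataset
        // is shared; only the loop over individuals is divided among the threads.
        void evaluate_population(const std::vector<Individual>& pop,
                                 const std::vector<double>& dataset,
                                 std::vector<double>& fitness) {
            #pragma omp parallel for schedule(dynamic)
            for (int i = 0; i < static_cast<int>(pop.size()); ++i)
                fitness[i] = evaluate(pop[i], dataset);
        }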

  4. Avoiding conditional statements reduces warp divergence.
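
     A small illustrative CUDA fragment (hypothetical kernels, not the ones used in this book) showing how an if/else executed by threads of the same warp can be replaced by branch-free arithmetic:

        // Branching version: threads of one warp may take different paths,
        // and the two paths are then executed serially (warp divergence).
        __global__ void count_branch(const float* attr, float threshold,
                                     int* left, int* right, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                if (attr[i] < threshold) atomicAdd(left, 1);
                else                     atomicAdd(right, 1);
            }
        }

        // Branch-free version: every thread executes the same instructions;
        // the comparison result (0 or 1) selects the counter arithmetically.
        __global__ void count_no_branch(const float* attr, float threshold,
                                        int* counters /* [0]=left, [1]=right */, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {
                int goes_right = attr[i] >= threshold;   // 0 or 1, no if/else
                atomicAdd(&counters[goes_right], 1);
            }
        }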

  5. It may be possible to simplify the second stage by temporarily memorizing the locations of the training instances in the tree after the first stage. Indeed, this concept is implemented for the GPU-based model tree induction.
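
     A hedged sketch of this idea (the data layout and all names are assumptions, not taken from the book): the first stage records, for every training instance, the leaf it reaches, so the second stage can process instances leaf by leaf without traversing the tree again:

        // Stage 1: each thread routes one training instance down the tree and
        // memorizes the index of the leaf it ends up in.
        __global__ void assign_leaves(const float* attrs,      // n_obj x n_attr, row-major
                                      const int*   split_attr, // per internal node
                                      const float* split_thr,  // per internal node
                                      const int*   left_child, // -1 marks a leaf
                                      const int*   right_child,
                                      int* leaf_of_instance,
                                      int n_obj, int n_attr) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n_obj) return;
            int node = 0;                                      // start at the root
            while (left_child[node] != -1) {                   // until a leaf is reached
                float v = attrs[i * n_attr + split_attr[node]];
                node = (v < split_thr[node]) ? left_child[node] : right_child[node];
            }
            leaf_of_instance[i] = node;
        }
        // Stage 2 can then reuse leaf_of_instance (e.g., to fit a model in each
        // leaf) instead of repeating the tree traversal.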

  6. Radix sort is considered one of the fastest sorting algorithms on the GPU [16]. A state-of-the-art GPU-based implementation is available in the CUB library [17].
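
     A minimal usage sketch of CUB's device-wide radix sort; the two-call pattern follows the CUB documentation, while sorting attribute values together with object indices is only an assumed use case:

        #include <cub/cub.cuh>

        // d_keys_*: attribute values (device memory), d_vals_*: object indices.
        void sort_by_attribute(const float* d_keys_in, float* d_keys_out,
                               const int* d_vals_in, int* d_vals_out, int num_items) {
            void*  d_temp = nullptr;
            size_t temp_bytes = 0;
            // First call (d_temp == nullptr) only computes the size of the
            // temporary storage required by the sort.
            cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes,
                                            d_keys_in, d_keys_out,
                                            d_vals_in, d_vals_out, num_items);
            cudaMalloc(&d_temp, temp_bytes);
            // Second call performs the actual key-value radix sort on the GPU.
            cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes,
                                            d_keys_in, d_keys_out,
                                            d_vals_in, d_vals_out, num_items);
            cudaFree(d_temp);
        }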

  7. For linear algebra computations in the CUDA environment, the highly optimized cuBLAS [18] and cuSOLVER [19] libraries are applied.
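
     As an illustration only (an assumed use case, not code from the book), cuBLAS can, for example, form the Gram matrix X^T X needed when a linear model is fitted in a leaf; cuSOLVER could then factor and solve the resulting system:

        #include <cublas_v2.h>

        // Computes XtX = X^T * X for an n_obj x n_attr matrix X stored
        // column-major in device memory (the cuBLAS convention).
        void gram_matrix(cublasHandle_t handle, const double* d_X,
                         double* d_XtX, int n_obj, int n_attr) {
            const double alpha = 1.0, beta = 0.0;
            // C (n_attr x n_attr) = op(A) * op(B) with op(A) = X^T and op(B) = X.
            cublasDgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                        n_attr, n_attr, n_obj,
                        &alpha, d_X, n_obj,
                        d_X, n_obj,
                        &beta, d_XtX, n_attr);
        }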

  8. The number of GPUs that can be installed in a single workstation is usually explicitly limited.

References

  1. Alba E (2005) Parallel metaheuristics: a new class of algorithms. Wiley, New York

  2. Alba E, Tomassini M (2002) IEEE Trans Evol Comput 6(5):443–462

  3. Gong Y, Chen W, Zhan Z, Zhang J, Li Y, Zhang Q, Li J (2015) Appl Soft Comput 34:286–300

  4. Kretowski M (2008) Obliczenia ewolucyjne w eksploracji danych. Globalna indukcja drzew decyzyjnych [Evolutionary computations in data mining: global induction of decision trees]. Wydawnictwo Politechniki Bialostockiej

  5. Alba E, Luque G, Nesmachnow S (2013) Int T Oper Res 20:1–48

  6. Tsutsui S, Collet P (2013) Massively parallel evolutionary computation on GPGPUs. Springer, Berlin

  7. Kretowski M, Grzes M (2007) Int J Data Wareh Min 3(4):68–82

  8. Kalles D, Papagelis A (2010) Soft Comput 14(9):973–993

  9. Kretowski M, Popczynski P (2008) Global induction of decision trees: from parallel implementation to distributed evolution. In: Proceedings of ICAISC’08. Lecture notes in artificial intelligence, vol 5097, pp 426–437

  10. Czajkowski M, Jurczuk K, Kretowski M (2015) A parallel approach for evolutionary induced decision trees. MPI+OpenMP implementation. In: Proceedings of ICAISC’15. Lecture notes in artificial intelligence, vol 9119, pp 340–349

  11. Czajkowski M, Jurczuk K, Kretowski M (2016) Hybrid parallelization of evolutionary model tree induction. In: Proceedings of ICAISC’16. Lecture notes in artificial intelligence, vol 9692, pp 370–379

  12. Kretowski M (2008) A memetic algorithm for global induction of decision trees. In: Proceedings of SOFSEM’08. Lecture notes in computer science, vol 4910, pp 531–540

  13. Dua D, Karra Taniskidou E (2017) UCI machine learning repository. Irvine, CA: University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml

  14. Jurczuk K, Czajkowski M, Kretowski M (2017) Soft Comput 21:7363–7379

  15. Jurczuk K, Czajkowski M, Kretowski M (2017) GPU-accelerated evolutionary induction of regression trees. In: Proceedings of TPNC'17. Lecture notes in computer science, vol 10687, pp 87–99

  16. Singh D, Joshi I, Choudhary J (2018) Int J Parallel Program 46(6):1017–1034

  17. Merrill D (2018) CUB v1.8.0: a library of warp-wide, block-wide, and device-wide GPU parallel primitives. NVIDIA Research. http://nvlabs.github.io/cub/

  18. NVIDIA (2018) cuBLAS, NVIDIA developer zone, CUDA toolkit documentation. https://docs.nvidia.com/cuda/cublas/

  19. NVIDIA (2018) cuSOLVER, NVIDIA developer zone, CUDA toolkit documentation. https://docs.nvidia.com/cuda/cusolver/

  20. NVIDIA (2018) CUDA C programming guide. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf

  21. Jurczuk K, Reska D, Kretowski M (2018) What are the limits of evolutionary induction of decision trees? In: Proceedings of PPSN XV. Lecture notes in computer science, vol 11102, pp 461–473

  22. Czajkowski M, Kretowski M (2014) Inform Sci 288:153–173

  23. Reska D, Jurczuk K, Kretowski M (2018) Evolutionary induction of classification trees on Spark. In: Proceedings of ICAISC’18. Lecture notes in artificial intelligence, vol 10841, pp 514–523

  24. The Apache Software Foundation (2019) Apache Spark - lightning-fast cluster computing. http://spark.apache.org/

  25. Meng X et al (2016) J Mach Learn Res 17(1):1235–1241

Author information

Correspondence to Marek Kretowski.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kretowski, M. (2019). Parallel Computations for Evolutionary Induction. In: Evolutionary Decision Trees in Large-Scale Data Mining. Studies in Big Data, vol 59. Springer, Cham. https://doi.org/10.1007/978-3-030-21851-5_8
