Evaluating the Impact of Dataset Size on Univariate Prediction Techniques for Moroccan Agriculture

Ed-daoudi, Rachid; Alaoui, Altaf; Zerouaoui, Jad; Ettaki, Badia; Zerouaoui, Jamal

doi:10.1007/978-3-031-26254-8_57


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 635)

Included in the following conference series:

The International Conference on Artificial Intelligence and Smart Environment

Abstract

Learning models used for prediction are mostly developed without considering what dataset size produces high accuracy and good performance. The general belief is that a large dataset is needed to construct a predictive learning model. However, whether a dataset counts as large depends on the circumstances and context of the prediction, so what makes a dataset big or small remains debatable. This paper examines the ability of a predictive model to adapt to a particular size of training data. The study experiments with three different sizes of Moroccan agricultural data, using a variety of statistical and machine learning techniques to build predictive models, with a view to establishing whether the size of the data has any effect on the accuracy of a model. The output of each model is measured using the Mean Absolute Error (MAE) and R-squared, and the results are compared. Training the models on the three dataset partitions shows that the models trained with the smallest and the largest training sets appear to be less accurate, while the models trained with the medium-sized dataset deliver much better results.
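The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the least-squares linear trend stands in for the unspecified statistical and machine learning techniques, the series is synthetic rather than Moroccan agricultural data, and the names `fit_linear_trend` and `evaluate_training_sizes`, the training sizes, and the hold-out length are all assumptions made for illustration. Only the two metrics, MAE and R-squared, come from the abstract.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear_trend(series):
    """Least-squares line y = a + b*t over t = 0..n-1 (stand-in univariate model)."""
    n = len(series)
    t_bar = (n - 1) / 2
    y_bar = mean(series)
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


def evaluate_training_sizes(series, train_sizes, test_len):
    """Hold out the last test_len points, then train on the most recent
    `size` points before them and score the forecasts on the hold-out."""
    train_pool, test = series[:-test_len], series[-test_len:]
    results = {}
    for size in train_sizes:
        train = train_pool[-size:]
        intercept, slope = fit_linear_trend(train)
        # The hold-out points follow the training window at t = size, size+1, ...
        preds = [intercept + slope * (size + h) for h in range(test_len)]
        results[size] = (mean_absolute_error(test, preds), r_squared(test, preds))
    return results


# Synthetic series (hypothetical data): linear trend plus an alternating disturbance.
series = [10.0 + 0.5 * t + (1.5 if t % 2 else -1.5) for t in range(40)]
scores = evaluate_training_sizes(series, train_sizes=[5, 15, 30], test_len=5)
for size, (mae, r2) in scores.items():
    print(f"train size {size:>2}: MAE={mae:.3f}  R2={r2:.3f}")
```

The scores on this toy series say nothing about the paper's findings; the sketch only shows how one model family can be scored with the same two metrics across three training sizes while the hold-out set stays fixed, which is what makes the sizes comparable.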

Author information

Authors and Affiliations

Laboratory of Engineering Sciences and Modeling, Faculty of Sciences, Ibn Tofail University, Campus Universitaire, BP 133, Kenitra, Morocco
Rachid Ed-daoudi, Altaf Alaoui, Jad Zerouaoui, Badia Ettaki & Jamal Zerouaoui
LyRICA: Laboratory of Research in Computer Science, Data Sciences and Knowledge Engineering, School of Information Sciences Rabat, Rabat, Morocco
Badia Ettaki

Corresponding author

Correspondence to Rachid Ed-daoudi.

Editor information

Editors and Affiliations

Department of Computer Science, Moulay Ismail University, Faculty of Sciences and Techniques, Errachidia, Morocco
Yousef Farhaoui
ISEG, Universidade de Lisboa, Lisbon, Cávado, Portugal
Alvaro Rocha
University of Sfax, Sfax, Tunisia
Zouhaier Brahmia
School of Engineering and Technology (SET), Sharda University, Greater Noida, Uttar Pradesh, India
Bharat Bhushab

About this paper

Cite this paper

Ed-daoudi, R., Alaoui, A., Zerouaoui, J., Ettaki, B., Zerouaoui, J. (2023). Evaluating the Impact of Dataset Size on Univariate Prediction Techniques for Moroccan Agriculture. In: Farhaoui, Y., Rocha, A., Brahmia, Z., Bhushab, B. (eds) Artificial Intelligence and Smart Environment. ICAISE 2022. Lecture Notes in Networks and Systems, vol 635. Springer, Cham. https://doi.org/10.1007/978-3-031-26254-8_57

DOI: https://doi.org/10.1007/978-3-031-26254-8_57
Published: 08 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26253-1
Online ISBN: 978-3-031-26254-8
eBook Packages: Intelligent Technologies and Robotics (R0)
