Academic Analytics Implemented for Students Performance in Terms of Canonical Correlation Analysis and Chi-Square Analysis

Muley, Aniket; Bhalchandra, Parag; Joshi, Mahesh; Wasnik, Pawan

doi:10.1007/978-981-10-5508-9_26

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 625)

Abstract

In this study, we tested for significant associations between selected variables that are otherwise regarded as invisible and that have an indirect impact on student performance. We compiled our own dataset for the experiments. Our study makes these variables and their relationships visible. The results enable us to determine characteristics of the learning environment that are related to performance.

1 Introduction

Academic analytics is a branch of modern data analysis that uses statistical analysis and data mining methods to reveal and recognize hidden patterns in vast educational databases [1,2,3,4,5,6]. Such patterns shed light, with high accuracy, on educational aspects such as student behavior, prognostication, student-centric learning, remedial measures, and learning outcomes. This can help raise the standards of the Indian higher education system [6]. Owing to digitization and the effective use of computers, IT, and ICT, educational organizations, institutions, and universities have generated and stored large volumes of data in their databases [7–13]. This data can be a key source for future decision-making if it is processed through academic analytics. We took it as a challenge to uncover the business intelligence, patterns, correlations, and rules embedded in this data. Our work is interdisciplinary and was undertaken by three schools of our university, because performance analysis overlaps with educational pedagogy, statistics, and computer-enabled technologies. The academic analytics was implemented using SPSS software [14, 15].

A closed questionnaire with predefined answers, printed on a single-sided A4 sheet, was used for data gathering [16]. The performance-related economic, social, and emotional attributes in this questionnaire were selected with the help of the School of Educational Sciences and in line with the theory of Pritchard and Wilson [16, 17]. The questionnaire was revised a number of times to make it easier to understand and simpler to answer, and it was tested on a subset of students after every revision. The answers were entered into an Excel sheet using codes such as 0, 1, 2. Because the dataset carried personal information about students, confidentiality issues were properly addressed. The error rate during preprocessing was initially 38%; it finally fell to 5% after the students had been properly convinced to respond. The questionnaire is shown in Figs. 1 and 2.
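The coding step described above can be illustrated with a minimal Python sketch. This is not the authors' Excel procedure: the code book, the answer labels, and the sample responses are hypothetical, and the paper does not define how its error rate was computed, so the definition used here (share of questionnaires with at least one invalid or missing answer) is an assumption.

```python
# Hypothetical code book: each closed question maps its predefined
# answers to integer codes such as 0, 1, 2 (labels are assumptions).
CODEBOOK = {
    "GENDER": {"male": 0, "female": 1},
    "REGION": {"urban": 0, "rural": 1, "foreign": 2},
}


def encode_response(response):
    """Return (coded_row, n_errors) for one questionnaire.

    Answers that are missing or not in the code book are stored as None
    and counted as errors.
    """
    coded = {}
    errors = 0
    for question, codes in CODEBOOK.items():
        answer = str(response.get(question, "")).strip().lower()
        if answer in codes:
            coded[question] = codes[answer]
        else:
            coded[question] = None
            errors += 1
    return coded, errors


def error_rate(responses):
    """Share of questionnaires with at least one invalid or missing answer."""
    if not responses:
        return 0.0
    flawed = sum(1 for r in responses if encode_response(r)[1] > 0)
    return flawed / len(responses)


# Illustrative responses only; they are not taken from the study.
sample = [
    {"GENDER": "Male", "REGION": "Urban"},
    {"GENDER": "female", "REGION": "rural "},
    {"GENDER": "", "REGION": "Foreign"},
    {"GENDER": "Female", "REGION": "city"},
]
coded_rows = [encode_response(r)[0] for r in sample]
rate = error_rate(sample)
```

Keeping invalid answers as `None` rather than dropping the row makes it possible to track how the error rate changes between questionnaire revisions.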

2 Experimentations and Discussion

Our aim was to discover invisible attributes related to student performance. After discussions with educationalists, we understood that semester-end marks alone cannot be taken as the main indicator of a student's performance; performance is an indistinct term. To build the necessary background, we surveyed literature such as Shoukat Ali et al. [4], Graetz [18], Considine and Zappala [19], and Bratti and Staffolani [20]. This review helped us identify the personal, social, and economic attributes used in our study.

With these preliminary investigations and this understanding, we decided to identify the key variables that accelerate or downgrade educational performance at large. We expected that the economic and social conditions of students would be important variables in our dataset as far as performance was concerned. Establishing this required analyzing many variables and their interrelations. This is typical of questionnaires, which consist of many questions, each contributing one variable [7, 21–23]. Studying all variables and their interrelations can be complicated and may divert attention from the original research focus. Factor analysis was developed for such exploratory analysis [22]. We used SPSS version 22.0 to analyze the dataset. The snapshots given below provide evidence of the empirical analysis. Descriptive statistics were prepared in MS-Excel to represent our data in diagrammatic form; some of the interesting facts are shown in Figs. 3, 4, 5 and 6. Further, canonical correlation analysis and chi-square testing were carried out on the experimental dataset.

2.1 Program Code

SPSS version 22.0 was used to analyze the dataset [16].

* Frequency tables for every questionnaire variable.
FREQUENCIES VARIABLES=GENDER MARRIED AGE REGION UG FEDU FJOB
    FINCOM MEDU MJOB MINCOM FSIZE FRELATIONS FSUPPORT REASON TMODE
    TTIME STIME FAILURES TUTORIAL SCHOLERSHIP PJOB MM HARDSUB_UG
    STUDY_HOME SELFLIB SELFPC PLACELVING INTERNET F_T_STUDY
    F_T_FRIEND MOVIEPWEEK CAREERDREM PARALLELCOURSE OWN_NOTES
    FREE_T_ACC PER_SATISF MATERIAL HLT_STATUS
  /ORDER=ANALYSIS.

* Cross-tabulations of region and gender against selected variables, with chi-square tests.
CROSSTABS
  /TABLES=REGION GENDER BY FAILURES STIME SCHOLERSHIP PJOB
    SELFLIB SELFPC PLACELVING INTERNET F_T_STUDY F_T_FRIEND
    MOVIEPWEEK OWN_NOTES PER_SATISF MATERIAL
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT
  /COUNT ROUND CELL.
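For readers without SPSS, the counting performed by the FREQUENCIES and CROSSTABS commands above can be sketched in plain Python. This is an illustrative stand-in, not the authors' workflow: the helper names `frequencies` and `crosstab` are hypothetical, the variable names are borrowed from the syntax above, and the coded rows are invented for the example.

```python
from collections import Counter


def frequencies(rows, variable):
    """Count how often each coded value of one variable occurs."""
    return Counter(row[variable] for row in rows)


def crosstab(rows, row_var, col_var):
    """Build a contingency table of observed counts for two variables.

    Returns the sorted row levels, the sorted column levels, and the
    table as a list of lists (one inner list per row level).
    """
    counts = Counter((row[row_var], row[col_var]) for row in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Invented coded rows (0/1 codes), not data from the study.
rows = [
    {"REGION": 0, "SELFPC": 1},
    {"REGION": 0, "SELFPC": 1},
    {"REGION": 0, "SELFPC": 0},
    {"REGION": 1, "SELFPC": 0},
    {"REGION": 1, "SELFPC": 0},
    {"REGION": 1, "SELFPC": 1},
    {"REGION": 1, "SELFPC": 0},
]
region_freq = frequencies(rows, "REGION")
row_levels, col_levels, table = crosstab(rows, "REGION", "SELFPC")
```

A table of observed counts like this is the input on which a chi-square test of independence operates.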

Descriptive statistics were prepared in MS-Excel to represent our data in diagrammatic form. Figures 3, 4, 5 and 6 show the distribution of the data by region, parents' jobs, parents' education, and family size, respectively. The numbers of Indian students from urban and rural backgrounds were found to be approximately the same, in comparison with foreign students. Differences in student performance were observed according to the parents' jobs and educational backgrounds. The number of members in each student's family is represented in a bar plot.

2.2 Canonical Correlation Analysis

The core objective is to find the relationship between students' personal details and their family background. We formed two groups of variables for the analysis. The first group describes the student and contains three parameters: gender, age, and UG percentage. The second group describes the family background and contains father's education, father's job, father's income, mother's education, mother's job, mother's income, family size, and whether the student has a part-time job. Canonical correlation analysis was used to determine whether there is a significant association between these two sets of variables. Our observations gave significant outcomes.
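For reference, the standard textbook formulation of canonical correlation analysis is sketched below. The notation (x, y, a, b, and the covariance blocks Σ) is introduced here for illustration and does not appear in the paper; only the group sizes, 3 and 8, come from the description above.

```latex
\begin{aligned}
&\mathbf{x} \in \mathbb{R}^{3} \ \text{(gender, age, UG percentage)}, \qquad
 \mathbf{y} \in \mathbb{R}^{8} \ \text{(family background variables)},\\[4pt]
&\rho_{1} \;=\; \max_{\mathbf{a},\,\mathbf{b}}\;
 \frac{\mathbf{a}^{\top}\Sigma_{xy}\,\mathbf{b}}
      {\sqrt{\mathbf{a}^{\top}\Sigma_{xx}\,\mathbf{a}}\;
       \sqrt{\mathbf{b}^{\top}\Sigma_{yy}\,\mathbf{b}}},\\[4pt]
&\Sigma_{xx}^{-1}\,\Sigma_{xy}\,\Sigma_{yy}^{-1}\,\Sigma_{yx}\,\mathbf{a}
 \;=\; \rho^{2}\,\mathbf{a}.
\end{aligned}
```

The squared canonical correlations are the eigenvalues of the matrix in the last line, so with 3 and 8 variables there are at most min(3, 8) = 3 pairs of canonical variates to examine.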

2.3 Chi-Square Analysis

A sample of the analysis using chi-square tests is presented here. The remaining results were computed in the same way and are summarized in the conclusion. The figures and tables below illustrate the descriptive statistics; together they show the diversity of the students by course, gender, undergraduate background, father's occupation, and family size. We applied the chi-square test to assess significance for the above objectives, under the assumption that there would be significant differences among the variables under study.
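As a reminder of what the test computes, the standard Pearson chi-square statistic for an r × c contingency table is given below. The symbols are the usual textbook ones and are not taken from the paper.

```latex
\chi^{2} \;=\; \sum_{i=1}^{r}\sum_{j=1}^{c}
  \frac{\left(O_{ij}-E_{ij}\right)^{2}}{E_{ij}},
\qquad
E_{ij} \;=\; \frac{R_{i}\,C_{j}}{N},
\qquad
\mathrm{df} \;=\; (r-1)(c-1)
```

Here O_ij is the observed count in cell (i, j), R_i and C_j are the row and column totals, and N is the total number of respondents. Under the null hypothesis that the two variables are independent, the statistic approximately follows a chi-square distribution with the stated degrees of freedom.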

The following pairs of parameters showed significant differences in our study: scholarship holding by gender; career dreams by gender; UG percentage by gender; UG percentage by region; scholarships by age group; UG percentage by age group; students and their father's education; students and their father's job; gender and mother's education; age and family size; age and part-time job; region and father's education; region and family size; and place of living and own library. We carried out further analysis using chi-square tests in SPSS 22.0 [15] and found some significant results, which are presented in the tables.

In the region-wise study, we considered variables such as place of living, whether students own a PC, whether they use the internet, and how much free time they have for study. Surprisingly, there were significant differences with respect to the students' places of living. These differences arose from the students' awareness of internet use. Our students are from the computer science field, so they are expected to use the internet frequently, and our experimental analysis confirmed this. Regarding free time for study, we observed that students have a significant amount of time available. We had assumed that, during their studentship, students would have sufficient time for study rather than for other work. This holds particularly true because the Nanded region is neither a metro city nor an industrial hub.

In the gender-wise study, we examined how much scholarship students receive and observed a significant difference: male students receive more scholarships than female students. We also found a significant gender-wise difference in place of living. Most female students preferred to live at their own home or in hostels because of safety concerns. Tables 1, 2, 3, 4 and 5 show these results.
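A chi-square test of independence of the kind reported in the tables can be sketched in plain Python as follows. This is a minimal illustration, not the SPSS procedure the authors ran: the function names are hypothetical, the observed counts are invented, and the statistic is the plain Pearson chi-square without any continuity correction.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table
    of observed counts given as a list of equally long rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf(stat, df):
    """Upper-tail probability of the chi-square distribution for integer df.

    Uses the recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, starting from
    Q(1/2, x) = erfc(sqrt(x)) for odd df or Q(1, x) = exp(-x) for even df.
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0 - 1e-9:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Invented 2 x 2 table (e.g. two regions by PC ownership yes/no).
observed = [[30, 10],
            [20, 40]]
stat, df = chi_square_independence(observed)
p_value = chi2_sf(stat, df)
significant = p_value < 0.05  # conventional 5% level, an assumption here
```

The 5% threshold is a common convention; the paper does not state which significance level was applied.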

Table 1 Chi-square tests analysis for region versus students having self PC

Table 2 Chi-square tests analysis for region versus place of living

Table 3 Chi-square tests analysis for region versus free time to study

Table 4 Chi-square tests analysis for region versus students having self PC

Table 5 Chi-square tests analysis for gender versus place of living

3 Conclusion

Student performance is a fuzzy term and is affected by many parameters. Our data reveal that it is influenced by the social and economic conditions of students, an outcome for which no scientific evidence was previously available. We took this as a challenge and found that a student's performance does not depend merely on his/her studious nature. This paper shows the effective use of academic analytics in terms of descriptive statistics. We applied canonical correlation analysis and the chi-square test to assess significance for the stated objectives and assumptions. In doing so, we discovered new, otherwise invisible variables that hamper student performance.

References

1. Dunham Margaret H.: Data Mining: Introductory and Advanced Topics, Pearson publications (2002)
2. Han, J. and Kamber, M., Data Mining: Concepts and Techniques, 2nd edition. The Morgan Kaufmann Series in Data Management Systems, Jim Gray. (2006)
3. Behrouz et al.: Predicting Student Performance: An Application of Data Mining Methods With The Educational Web-Based System Lon-CAPA IEEE, Boulder, CO. (2003)
4. Shoukat Ali et al.: Factors Contributing to the Students Academic Performance: A Case Study of Islamia University Sub-Campus, American Journal of Educational Research, 1 (8), pp. 283–289 (2013)
5. Gordon Linoff, Michael J, et al.: Data Mining Techniques, 3Ed, Wiley Publications.
6. Eko Indrato: edited notes on Data Mining, retrieved from http://recommender-systems.readthedocs.org/en/latest/datamining.html
7. Ma, Y., Liu, B., Wong, C. K., Yu, P. S., & Lee, S. M.: Targeting the right students using data mining. Paper presented at the Sixth ACM SIGKDD International Conference Proceedings, Boston, MA, pp. 457–464 (2000)
8. Minaei-Bidgoli, B., Kashy, D. A., Kortemeyer G. and, Punch, W. F.: Predicting student performance: an application of data mining methods with the educational web-based system LON-CAPA, In Proceedings of ASEE/IEEE Frontiers in Education Conference, Boulder, CO: IEEE (2003)
9. Kotsiantis S.: Educational Data Mining: A Case Study for Predicting Dropout – Prone Students. Int. J. Knowledge Engineering and Soft Data Paradigms, 1(2), pp. 101–111 (2009)
10. Berkhin Pavel: Survey of Clustering Data Mining Techniques, Accrue Software, available at www.cc.gatech.edu/~isbell/reading/papers/berkhin02survey.pdf
11. Sasirekha K., Baby, P.: Agglomerative Hierarchical Clustering Algorithm- A Review, International Journal of Scientific and Research Publications, 3(3), pp. 83 (2013)
12. Nikhil Rajadhayx et al.: Data mining in Educational Domain, retrieved from http://arxiv.org/pdf/1207.1535.pdf
13. Murugesan Keerthiram, Zhang Jun: Hybrid Hierarchical Clustering: An Experimental Analysis, Technical Report: CMIDA-hipsccs#001–11, retrieved from www.cs.uky.edu/~jzhang/pub/techrep.html
14. Field, A.: Discovering Statistics using SPSS for Windows. London–Thousand Oaks – New Delhi: Sage publications. (2000)
15. IBM SPSS Statistics 22 Documentation on internet retrieved at www.ibm.com/support/docview.wss?uid=swg27038407
16. Cortez Paulo and Silva Alice: Using Data Mining to Predict Secondary School Student Performance, retrieved from http://www.researchgate.net/publication/ Using_data_mining _to_ predict_secondary_ school_ student_ performance
17. Pritchard, M. E., and Wilson, G. S.: Using emotional and social factors to predict student success. Journal of College Student Development 44(1): pp. 18–28. (2003)
18. Graetz, B.: Socio-economic status in education research and policy in John Ainley et al., Socio-economic Status and School Education DEET/ACER Canberra., J Pediatr Psychol. 20(2):205–216 (1995)
19. Considine, G. & Zappala, G.: Influence of social and economic disadvantage in the academic performance of school students in Australia. Journal of Sociology, 38, 129–148 (2002)
20. Bratti, M. and Staffolani, S.: Student Time Allocation and Educational Production Functions, University of Ancona Department of Economics Working Paper No. 170 (2002)
21. Introduction to factor analysis, web resource www.yorku.ca/ptryfos/f1400.pdf
22. Rietveld, T. & Van Hout R.: Statistical Techniques for the Study of Language and Language Behaviour. Berlin – New York: Mouton de Gruyter (1993)
23. Habing, B.: Exploratory Factor Analysis. Website: http://www.stat.sc.edu/~habing/courses/530EFA.pdf (accessed 10 May 2004) (2003)

Author information

Authors and Affiliations

School of Mathematical Sciences, S.R.T.M. University, Nanded, 431606, Maharashtra, India
Aniket Muley
School of Computational Sciences, S.R.T.M. University, Nanded, 431606, Maharashtra, India
Parag Bhalchandra & Pawan Wasnik
School of Educational Sciences, S.R.T.M. University, Nanded, 431606, Maharashtra, India
Mahesh Joshi

Corresponding author

Correspondence to Aniket Muley.

Editor information

Editors and Affiliations

Microsoft Innovation Centre, Sri Aurobindo Institute of Technology, Indore, Madhya Pradesh, India
Durgesh Kumar Mishra
Faculty of Computers and Information, Banha University, Banha, Egypt
Ahmad Taher Azar
Dept. of Info. Tech., Sabar Institute of Technology, Gujarat, India
Amit Joshi

Cite this paper

Muley, A., Bhalchandra, P., Joshi, M., Wasnik, P. (2018). Academic Analytics Implemented for Students Performance in Terms of Canonical Correlation Analysis and Chi-Square Analysis. In: Mishra, D., Azar, A., Joshi, A. (eds) Information and Communication Technology. Advances in Intelligent Systems and Computing, vol 625. Springer, Singapore. https://doi.org/10.1007/978-981-10-5508-9_26

DOI: https://doi.org/10.1007/978-981-10-5508-9_26
Published: 13 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5507-2
Online ISBN: 978-981-10-5508-9
eBook Packages: Engineering, Engineering (R0)
