Automatically Measuring the Quality of User Generated Content in Forums

Chai, Kevin; Wu, Chen; Potdar, Vidyasagar; Hayati, Pedram

doi:10.1007/978-3-642-25832-9_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7106)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

The amount of user generated content on the Web is growing, and identifying high quality content in a timely manner has become a problem. Many forums rely on their users to manually rate content quality, but this often yields too few ratings. Automated quality assessment models have largely evaluated linguistic features, but these techniques adapt poorly to the diverse writing styles and terminologies used by different forum communities. Therefore, we propose a novel model that evaluates content, usage, reputation, temporal and structural features of user generated content to address these limitations. We employed a rule learner, a fuzzy classifier and Support Vector Machines to validate our model on three operational forums. Our model outperformed the existing models in our experiments, and we verified that our performance improvements were statistically significant.
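
As a rough illustration of the idea, the five feature categories named in the abstract could be gathered into one numeric vector per post and scaled before being handed to a classifier. The sketch below is only an assumption about how that might look: the class name, the field names, the choice of min-max scaling and the toy values are all hypothetical, since the abstract does not list the concrete features or preprocessing used.

```python
from dataclasses import dataclass, astuple
from typing import List


@dataclass
class PostFeatures:
    """One forum post described by the five feature categories.

    Every field is a hypothetical stand-in; the abstract names only the
    categories, not the individual features.
    """
    content_word_count: float       # content: e.g. length of the post
    usage_view_count: float         # usage: e.g. how often the post was read
    reputation_author_posts: float  # reputation: e.g. author's prior posts
    temporal_hours_to_reply: float  # temporal: e.g. delay since thread start
    structural_reply_depth: float   # structural: e.g. depth in the thread

    def to_vector(self) -> List[float]:
        # Flatten the fields, in declaration order, into a list of floats.
        return [float(x) for x in astuple(self)]


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Scale each column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy values invented purely for illustration (not data from the paper).
posts = [
    PostFeatures(120, 300, 15, 2.0, 1),
    PostFeatures(20, 50, 1, 48.0, 3),
    PostFeatures(70, 175, 8, 25.0, 2),
]
matrix = min_max_scale([p.to_vector() for p in posts])
```

Each row of `matrix` could then serve as the input to whichever learner is being evaluated. Keeping the categories as named fields makes it easy to drop one group at a time and compare results.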

Author information

Authors and Affiliations

Digital Ecosystems and Business Institute, Curtin University; International Centre for Radio Astronomy Research, University of Western Australia, Australia
Kevin Chai, Chen Wu, Vidyasagar Potdar & Pedram Hayati

Editor information

Editors and Affiliations

School of Engineering and Mathematical Sciences, La Trobe University, 3086, Melbourne, VIC, Australia
Dianhui Wang
School of Computer Science and Software Engineering, The University of Western Australia, 6009, Perth, WA, Australia
Mark Reynolds

About this paper

Cite this paper

Chai, K., Wu, C., Potdar, V., Hayati, P. (2011). Automatically Measuring the Quality of User Generated Content in Forums. In: Wang, D., Reynolds, M. (eds) AI 2011: Advances in Artificial Intelligence. AI 2011. Lecture Notes in Computer Science, vol 7106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25832-9_6

DOI: https://doi.org/10.1007/978-3-642-25832-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25831-2
Online ISBN: 978-3-642-25832-9
eBook Packages: Computer Science, Computer Science (R0)
