Abstract
Companies that run their business processes on heterogeneous, collaborating applications need to extract and integrate data from multiple sources. Integrating web applications is arduous, however, because they are designed for human consumption and do not provide APIs for automatic access to their data. Web information extractors are used for this purpose, but most of them offer ad hoc, highly domain-dependent solutions. In this paper we aim to devise information extractors whose core algorithm is based on FOIL, a widely used first-order rule learning algorithm. Its rules are substantially more expressive than the attribute-value format and can capture complex concepts that cannot be represented in that format. Furthermore, we integrate other scoring functions to check whether they improve the guidance of the rule search and speed up the learning process, with the goal of making FOIL tractable in real-world domains such as Web sources.
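To make the role of a scoring function in the rule search concrete, the following is a minimal sketch of the standard FOIL information gain, with the scorer passed as a parameter so that another measure could be swapped in. It is not the authors' implementation: the function names, the `candidates` mapping, and the example literals are hypothetical, and the counts are made up for illustration.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """Standard FOIL information gain for adding one literal to a rule.

    p0, n0: positive/negative bindings covered before adding the literal.
    p1, n1: positive/negative bindings covered after adding it.
    Assumption: every positive binding that survives counts once (t = p1).
    """
    if p0 == 0 or p1 == 0:
        # A literal that keeps no positive bindings is given zero gain here.
        return 0.0
    info_before = math.log2(p0 / (p0 + n0))
    info_after = math.log2(p1 / (p1 + n1))
    return p1 * (info_after - info_before)


def best_literal(candidates, p0, n0, score=foil_gain):
    """Pick the candidate literal with the highest score.

    candidates: hypothetical mapping from literal name to (p1, n1) counts.
    score: any function with the same signature as foil_gain.
    """
    return max(candidates, key=lambda lit: score(p0, n0, *candidates[lit]))


if __name__ == "__main__":
    # Made-up counts for three hypothetical literals.
    candidates = {
        "has_tag(X, td)": (3, 1),
        "follows(X, Y)": (2, 0),
        "is_bold(X)": (4, 4),
    }
    print(best_literal(candidates, p0=4, n0=4))
```

Because the search loop only depends on the `score` argument, an alternative measure can be evaluated by passing a different function, without touching the rest of the learner.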
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiménez, P., Arjona, J.L., Álvarez, J.L. (2012). On Relational Learning for Information Extraction. In: Rodríguez, J., Pérez, J., Golinska, P., Giroux, S., Corchuelo, R. (eds) Trends in Practical Applications of Agents and Multiagent Systems. Advances in Intelligent and Soft Computing, vol 157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28795-4_14
DOI: https://doi.org/10.1007/978-3-642-28795-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28794-7
Online ISBN: 978-3-642-28795-4
eBook Packages: Engineering, Engineering (R0)