Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Aschenbrenner, A., Rauber, A. (2006). Mining Web Collections. In: Web Archiving. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-46332-0_7
DOI: https://doi.org/10.1007/978-3-540-46332-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23338-1
Online ISBN: 978-3-540-46332-0
eBook Packages: Computer Science, Computer Science (R0)