Applied Text Mining in the Context of Sustainability Research: The Example of the German Research Map (Forschungslandkarte) of the Hochschulrektorenkonferenz

Bickel, Manuel W.; Liedtke, Christa

doi:10.1007/978-3-662-61534-8_8

Part of the book series: Theorie und Praxis der Nachhaltigkeit (TPN)

Abstract

The potential of machine learning methods has not yet been fully exploited in sustainability research. Computer-assisted exploratory data analysis using statistical methods and a range of algorithms offers promising options, among them structuring data, detecting patterns, making assessments, opening up new perspectives, and generating hypotheses. This short study illustrates the application of text mining and downstream methods by analyzing information provided in the database of the Forschungslandkarte (research map) of the Hochschulrektorenkonferenz (German Rectors' Conference). Specifically, it follows a topic modeling approach to identify research fields automatically from the textual descriptions of the university profiles and to structure and assess them, among other means through a cluster analysis. In addition, a network analysis of the identified topics shows how potential for collaboration between universities could be uncovered. The example study is used to point out the limits of the approach, particularly with regard to data quality and the interpretability of the results. Finally, the chapter outlines opportunities for sustainability research in Germany that arise from this kind of text-based information flow analysis.
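The step from identified topics to a collaboration network can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the university names, topic labels, and topic shares are invented, and cosine similarity with a fixed threshold is an assumed way of linking profiles, chosen only to keep the example small.

```python
from itertools import combinations
from math import sqrt

# Hypothetical topic shares per university profile, in the form a topic
# model might output them (each row sums to 1). All names and values are
# invented for illustration.
TOPICS = ["energy systems", "climate policy", "materials"]
profiles = {
    "University A": [0.70, 0.20, 0.10],
    "University B": [0.60, 0.30, 0.10],
    "University C": [0.05, 0.15, 0.80],
}


def cosine(u, v):
    """Cosine similarity between two topic-share vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def collaboration_edges(profiles, threshold=0.9):
    """Return (name, name, similarity) for pairs at or above the threshold.

    The threshold is an assumption; in practice it would need tuning.
    """
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(profiles.items()), 2):
        sim = cosine(vec_a, vec_b)
        if sim >= threshold:
            edges.append((name_a, name_b, round(sim, 3)))
    return edges


def dominant_topic(vector):
    """Label of the topic with the largest share in a profile."""
    return TOPICS[max(range(len(vector)), key=vector.__getitem__)]


edges = collaboration_edges(profiles)
for name_a, name_b, sim in edges:
    print(f"{name_a} <-> {name_b} (similarity {sim})")
```

With these invented values only University A and University B are linked, because both profiles are dominated by the same topic. With real data, the resulting edge list would depend heavily on the quality of the profile texts and on the chosen threshold, which mirrors the limits regarding data quality and interpretability that the abstract mentions.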

Author information

Authors and Affiliations

Wuppertal Institut für Klima, Umwelt, Energie gGmbH, Wuppertal, Germany
Manuel W. Bickel & Christa Liedtke

Corresponding author

Correspondence to Manuel W. Bickel.

Editor information

Editors and Affiliations

Fakultät Life Sciences, Hochschule für Angewandte Wissenschaften Hamburg, Hamburg, Germany
Walter Leal Filho

About this chapter

Cite this chapter

Bickel, M.W., Liedtke, C. (2021). Angewandtes Text Mining im Kontext der Nachhaltigkeitsforschung am Beispiel der deutschen Forschungslandkarte der Hochschulrektorenkonferenz. In: Leal Filho, W. (eds) Digitalisierung und Nachhaltigkeit. Theorie und Praxis der Nachhaltigkeit. Springer Spektrum, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-61534-8_8

DOI: https://doi.org/10.1007/978-3-662-61534-8_8
Published: 10 November 2020
Publisher Name: Springer Spektrum, Berlin, Heidelberg
Print ISBN: 978-3-662-61533-1
Online ISBN: 978-3-662-61534-8
eBook Packages: Life Science and Basic Disciplines (German Language)
