Number of records: 1

Topic modeling and classification of scientific disciplines

  1.
    SYSNO: ASEP0566673
    Document Type: A - Abstract
    R&D Document Type: O - Other
    Title: Topic modeling and classification of scientific disciplines
    Author(s): Hladík, Radim (FLU-F) ORCID, RID, SAI
               Renisio, Y. (FR)
    Action: International Conference on Science and Technology Indicators (STI 2022). From Global Indicators to Local Applications /26./
    Event date: 07.09.2022 - 09.09.2022
    Event location: Granada
    Country: ES - Spain
    Event type: WRD
    Language: eng - English
    Keywords: topic modeling ; classification ; disciplines ; theses ; science
    Subject RIV: AF - Documentation, Librarianship, Information Studies
    OECD category: Information science (social aspects)
    R&D Projects: GJ20-01752Y GA ČR - Czech Science Foundation (CSF)
    Institutional support: FLU-F - RVO:67985955
    Annotation: This paper evaluates the possibility of classifying Ph.D. theses into disciplines using a bottom-up empirical approach based on topic modeling. It examines a dataset of 334810 Ph.D. theses submitted at French universities between 2006 and 2020. In this comprehensive dataset, the variable “discipline” does not rely on any controlled vocabulary or disciplinary ontology. Consequently, there are 23057 unique labels for the variable, of which 14538 appear only once. Such a situation makes any full-scale analysis of the data from the perspective of scientific disciplines impossible. Our topic model is built on the abstracts of 285311 theses in French that include a title, keywords, and an abstract. After applying the TopSBM algorithm, we obtained a topic model with 7 levels of hierarchy. The outcomes of our experiments with classification of theses into disciplines suggest that topics derived from purely textual data implicitly capture information about disciplines. This quality of topic modeling can be of great benefit when dealing with datasets where disciplinary information is unavailable or unreliable and where citation records are absent (as remains the case, especially in the Humanities).
    Workplace: Institute of Philosophy
    Contact: Chlumská Simona, chlumska@flu.cas.cz ; Tichá Zuzana, asep@flu.cas.cz Tel: 221 183 360
    Year of Publishing: 2023
    Electronic address: https://doi.org/10.5281/zenodo.6957149