Application of entity linking to identify research fronts and trends

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Studying research fronts enables researchers to understand how their academic fields emerged, how they are currently developing and their changes over time. While topic modelling tools help discover themes in documents, they employ a “bag-of-words” approach and require researchers to manually label categories, specify the number of topics a priori, and make assumptions about word distributions in documents. This paper proposes an alternative approach based on entity linking, which links word strings to entities from a knowledge base, to help solve issues associated with “bag-of-words” approaches by automatically identifying topics based on entity mentions. To study topic trends and popularity, we use four indicators—Mann–Kendall’s test, Sen’s slope analysis, z-score values and Kleinberg’s burst detection algorithm. The combination of these indicators helps us understand which topics are particularly active (“hot” topics), which are decreasing (“cold” topics or past “bursty” topics) and which are maturely developed. We apply the approach and indicators to the fields of Information Science and Accounting.
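Three of the four indicators named in the abstract can be illustrated with a short sketch. The code below is not the paper's implementation: it assumes each topic is represented by a list of yearly entity-mention counts, uses the textbook normal approximation for the Mann-Kendall test (which is only approximate for short series), and uses hypothetical function names and sample data. Kleinberg's burst detection is not covered.

```python
from collections import Counter
from itertools import combinations
from math import erf, sqrt
from statistics import mean, median, pstdev


def mann_kendall(series):
    """Mann-Kendall trend test: returns (S, Z, two-sided p-value)."""
    n = len(series)
    # S = number of increasing pairs minus decreasing pairs over all i < j.
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    # Variance of S under the no-trend null, corrected for tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var = (n * (n - 1) * (2 * n + 5) - ties) / 18
    if var == 0:  # constant or too-short series: no evidence of a trend
        return s, 0.0, 1.0
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / sqrt(var)
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, z, p


def sens_slope(series):
    """Sen's slope: median of pairwise slopes (needs at least two points)."""
    pairs = combinations(range(len(series)), 2)
    return median((series[j] - series[i]) / (j - i) for i, j in pairs)


def z_scores(series):
    """Standardise each value against the series mean and population std."""
    mu, sigma = mean(series), pstdev(series)
    return [0.0 if sigma == 0 else (x - mu) / sigma for x in series]


# Hypothetical yearly mention counts for one entity (illustrative data only).
counts = [2, 3, 5, 4, 7, 9, 12, 15]
s, z, p = mann_kendall(counts)
print(f"S={s}, Z={z:.2f}, p={p:.4f}, slope={sens_slope(counts):.2f}")
```

Under these assumptions, a significant positive Mann-Kendall result together with a positive Sen's slope would suggest a growing topic, and a significant negative result a declining one. How the paper combines the indicators into "hot", "cold" and mature labels is not specified in the abstract.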

Language: English
Pages: 357-379
Number of pages: 23
Journal: Scientometrics
Volume: 122
Issue number: 1
Early online date: 1 Nov 2019
DOI: 10.1007/s11192-019-03274-x
Publication status: Published - 1 Jan 2020

Bibliographical note

Copyright the Author(s) 2019. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Burstiness
  • Content Analysis and Indexing
  • Entity annotation
  • Information Storage and Retrieval
  • Natural Language Processing
  • Text analysis

Cite this

@article{cdffa1b3a8e1460aa166e609830861c6,
title = "Application of entity linking to identify research fronts and trends",
abstract = "Studying research fronts enables researchers to understand how their academic fields emerged, how they are currently developing and their changes over time. While topic modelling tools help discover themes in documents, they employ a “bag-of-words” approach and require researchers to manually label categories, specify the number of topics a priori, and make assumptions about word distributions in documents. This paper proposes an alternative approach based on entity linking, which links word strings to entities from a knowledge base, to help solve issues associated with “bag-of-words” approaches by automatically identifying topics based on entity mentions. To study topic trends and popularity, we use four indicators—Mann–Kendall’s test, Sen’s slope analysis, z-score values and Kleinberg’s burst detection algorithm. The combination of these indicators helps us understand which topics are particularly active (“hot” topics), which are decreasing (“cold” topics or past “bursty” topics) and which are maturely developed. We apply the approach and indicators to the fields of Information Science and Accounting.",
keywords = "Burstiness, Content Analysis and Indexing, Entity annotation, Information Storage and Retrieval, Natural Language Processing, Text analysis",
author = "Mauricio Marrone",
note = "Copyright the Author(s) 2019. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.",
year = "2020",
month = "1",
day = "1",
doi = "10.1007/s11192-019-03274-x",
language = "English",
volume = "122",
pages = "357--379",
journal = "Scientometrics",
issn = "0138-9130",
publisher = "Springer, Springer Nature",
number = "1",

}

Application of entity linking to identify research fronts and trends. / Marrone, Mauricio.

In: Scientometrics, Vol. 122, No. 1, 01.01.2020, p. 357-379.

TY  - JOUR
T1  - Application of entity linking to identify research fronts and trends
AU  - Marrone, Mauricio
N1  - Copyright the Author(s) 2019. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY  - 2020/1/1
Y1  - 2020/1/1
AB  - Studying research fronts enables researchers to understand how their academic fields emerged, how they are currently developing and their changes over time. While topic modelling tools help discover themes in documents, they employ a “bag-of-words” approach and require researchers to manually label categories, specify the number of topics a priori, and make assumptions about word distributions in documents. This paper proposes an alternative approach based on entity linking, which links word strings to entities from a knowledge base, to help solve issues associated with “bag-of-words” approaches by automatically identifying topics based on entity mentions. To study topic trends and popularity, we use four indicators—Mann–Kendall’s test, Sen’s slope analysis, z-score values and Kleinberg’s burst detection algorithm. The combination of these indicators helps us understand which topics are particularly active (“hot” topics), which are decreasing (“cold” topics or past “bursty” topics) and which are maturely developed. We apply the approach and indicators to the fields of Information Science and Accounting.
KW  - Burstiness
KW  - Content Analysis and Indexing
KW  - Entity annotation
KW  - Information Storage and Retrieval
KW  - Natural Language Processing
KW  - Text analysis
UR  - http://www.scopus.com/inward/record.url?scp=85074816789&partnerID=8YFLogxK
U2  - 10.1007/s11192-019-03274-x
DO  - 10.1007/s11192-019-03274-x
M3  - Article
VL  - 122
SP  - 357
EP  - 379
JO  - Scientometrics
T2  - Scientometrics
JF  - Scientometrics
SN  - 0138-9130
IS  - 1
ER  -