“Lucky points” of scientists’ referencing behavior

Jialin Liu, Hongkan Chen, Zhibo Liu, and Yi Bu

Department of Information Management, Peking University, Beijing 100871, China

Reviewers and editors of a paper might ask the author(s) to adjust the references before it is finally published. A reference list forms the foundation of a piece of scientific research and signals what it endorses. As citation-based bibliometric indicators have been widely adopted in research evaluation, the factors associated with citation counts have attracted scientists’ interest. This naturally raises an interesting question: Do the references of a paper have the magic to influence its citations?

Over the last decade, scientometricians have systematically studied the effects of characteristics of references (e.g., the number of references, the age of references, and the interdisciplinarity of references). Although different methodologies have produced different results, researchers have reached a consensus that the reference list of a publication does affect its citation counts. However, the major limitation of previous work is the lack of discipline-level analysis. To the best of our knowledge, few nuanced, discipline-level differences have been reported.

In our recent study (Liu et al., 2022), we dissected a reference list along five distinct dimensions, namely the number of references, the number of citations of references, the age of references, the number of nodes in a reference cascade, and the density of bibliographic coupling networks. It is worth noting that the latter two indicators are network-based rather than count-based, which sets them apart methodologically from previous scientometric analyses.
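As an illustration of the last indicator, the density of a bibliographic coupling network can be sketched in a few lines of Python. The sketch rests on assumptions that are not taken from the study: the nodes are the references of one paper, two references are linked when they cite at least one work in common, and density is the share of possible links that are actually present. The function name and the toy data are hypothetical.

```python
from itertools import combinations


def coupling_density(cited_by_reference):
    """Density of an (assumed) bibliographic coupling network.

    cited_by_reference maps each reference ID to the set of works
    that this reference itself cites.
    """
    nodes = list(cited_by_reference)
    n = len(nodes)
    if n < 2:
        return 0.0  # density is undefined for fewer than two nodes; use 0
    # Count pairs of references that share at least one cited work.
    edges = sum(
        1
        for a, b in combinations(nodes, 2)
        if cited_by_reference[a] & cited_by_reference[b]
    )
    # Undirected density: observed links over n * (n - 1) / 2 possible links.
    return 2 * edges / (n * (n - 1))


# Toy data (made up): four references and the works each one cites.
example = {
    "R1": {"A", "B"},
    "R2": {"B", "C"},
    "R3": {"D"},
    "R4": {"C", "D"},
}
density = coupling_density(example)
print(density)
```

In this toy case three of the six possible pairs are coupled (R1 and R2, R2 and R4, R3 and R4), so the density is 0.5.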

Based on over one million articles published in 2005 and covered by Microsoft Academic Graph, we found non-linear relations, either inverted-L or inverted-U shaped, between these characteristics of references and the citations of scientific publications. Each distribution has a critical point, i.e., a position indicating a phase transition. We named it a “lucky point” in the study and proposed a mathematical definition to calculate where this critical point occurs. Here, “lucky” hints that, when the value of the indicator is around a certain value, the publication has a better chance of obtaining a relatively high number of citations.
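The idea of a value around which papers tend to be cited more can be illustrated with a deliberately crude sketch. It is not the mathematical definition proposed in the study, which is not reproduced in this essay; it simply bins papers by an indicator, averages citations per bin, and reports the bin with the highest average. The function name, the bin width, and all numbers are made up for illustration.

```python
from collections import defaultdict


def peak_bin(values, citations, bin_width):
    """Return (bin start, mean citations) for the bin with the highest mean.

    A rough stand-in for locating the peak of an inverted-U relation;
    NOT the "lucky point" definition used in the study.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cites in zip(values, citations):
        b = int(value // bin_width)  # index of the bin this paper falls in
        sums[b] += cites
        counts[b] += 1
    means = {b: sums[b] / counts[b] for b in sums}
    best = max(means, key=means.get)
    return best * bin_width, means[best]


# Toy data (made up): number of references and citations for ten papers.
ref_counts = [5, 8, 12, 18, 25, 27, 33, 38, 45, 52]
cites = [2, 3, 6, 9, 15, 17, 14, 12, 8, 5]
start, mean = peak_bin(ref_counts, cites, 10)
print(start, mean)
```

With these toy numbers the highest average (16 citations) falls in the bin of 20 to 29 references, with lower averages on both sides, which mimics an inverted-U shape.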

These non-linear trends are consistent with common sense, i.e., for each publication there is a set of “suitable” references associated with a higher citation impact. For example, for a given research topic, the candidates appropriate as references are relatively fixed. Hence, too few references would lower reliability and credibility, while redundant references might run counter to the normative theory of citation. The findings emphasize the importance of looking for “suitable” references. Besides the theories, methodologies, and empirical results, references themselves play a significant role in improving framing, narratives, and storytelling.

The definition of “lucky points” allows us to calculate this indicator for each discipline. We found that “lucky points” vary greatly across disciplines. By mapping the five “lucky points” to discipline-level indicators, we further explored how these differences relate to the characteristics of the discipline itself. We observed a ubiquitous effect of disciplinary academic “environments” on where a lucky point appears. For instance, a larger “lucky point” for the number of references or the number of citations of references is more likely to be observed in disciplines where scientists tend to cite more references (e.g., Biology) or more highly cited references (e.g., Psychology). This finding points to a “peer effect” in academic circles, i.e., scientists’ performance shapes the referencing preferences of their discipline.

The original article on which this essay is based is: Liu, J., Chen, H., Liu, Z., Bu, Y., & Gu, W. (2022). Non-linearity between referencing behavior and citation impact: A large-scale, discipline-level analysis. Journal of Informetrics, 16(3), 101318. DOI: https://doi.org/10.1016/j.joi.2022.101318.

Authors

Jialin Liu is an undergraduate student in Data Science at the Department of Information Management, Peking University. He does research in scientometrics and science policy. In particular, his research focuses on the mobility of scientists and its dynamics.

Hongkan Chen is an undergraduate student in Data Science at the Department of Information Management, Peking University. He studies scientific paradigms using quantitative methods and massive datasets.

Zhibo Liu is an undergraduate student in Data Science at the Department of Information Management, Peking University. Zhibo’s research aims to understand the social dimensions of the global scientific ecosystem with computational techniques.

Yi Bu is an Assistant Professor in Data Science at the Department of Information Management, Peking University. Yi focuses in particular on scholarly data mining and computational social science.

Cite this article in APA as: Liu, J., Chen, H., Liu, Z., & Bu, Y. (2022, August 25). “Lucky points” of scientists’ referencing behavior. Information Matters, Vol. 2, Issue 8. https://informationmatters.org/2022/08/lucky-points-of-scientists-referencing-behavior/

Yi Bu

I do research on the applications of big data analytics, with a particular focus on scholarly data mining. Specifically, my research covers the process of knowledge diffusion (e.g., differences between the knowledge diffusion of interdisciplinary and unidisciplinary publications), the analysis of scholarly networks and their variants (e.g., co-citation, bibliographic coupling, and some hybrid networks), and bibliometric indicators for research assessment (e.g., citation-based impact indicators). I aim to understand the social dimensions of the global scientific ecosystem by leveraging massive datasets, computational techniques, and social theories.